I understand that processor affinity controls which CPUs SQL Server can use, and that MAXDOP limits how many of them it can use for a parallel query execution. With MAXDOP = 2 set at the instance level (affinity left at all CPUs), I run a particular query and see CXPACKET waits. During this timeframe the query spawns around 48 threads on the box. The box has 4 physical CPUs with 16 cores per CPU, and 4 NUMA nodes.
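For anyone who wants to reproduce the setup, the instance-level setting can be changed roughly like this. This is a minimal sketch using the standard `sp_configure` option, not necessarily the exact commands run on this box:

```sql
-- Sketch: set instance-wide MAXDOP to 2.
-- 'max degree of parallelism' is an advanced option, so expose those first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

EXEC sp_configure 'max degree of parallelism', 2;
RECONFIGURE;

-- Confirm the configured value and the value currently in use.
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = 'max degree of parallelism';
```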
I want to make sure I am reading this correctly. Does this mean that different portions of the query each decide to execute in parallel, so that 2 threads are spawned per batch or physical operator?
This makes me believe that setting MAXDOP = 2 inside the query itself should exhibit the same behavior if I set the instance-level MAXDOP back to its original value of 6. I am also wondering whether NOLOCK hints, or index builds and statistics updates on a temporary result set (in a temp table), have any additional effect on this. I plan to test it a few different ways, but would appreciate any additional guidance on this behavior. Mostly I am curious how I can find out why it decided that 48 threads was what it needed in this instance. When this occurs, I can see some of the schedulers go into a non-yielding state (SOS_xxxx), yet overall CPU utilization on the box is < 3%.
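One way to run the query-level test and watch the thread count while it executes is sketched below. The table name `dbo.SomeTable` is hypothetical and stands in for the real query; the DMVs are standard, but treat the exact filters as assumptions to adjust:

```sql
-- Sketch: cap parallelism for this statement only, regardless of the
-- instance-level setting. dbo.SomeTable is a hypothetical placeholder.
SELECT COUNT(*)
FROM dbo.SomeTable
OPTION (MAXDOP 2);

-- From a second connection while the query runs:
-- how many tasks each user session has, and over how many schedulers.
SELECT t.session_id,
       COUNT(*)                       AS task_count,
       COUNT(DISTINCT t.scheduler_id) AS schedulers_used
FROM sys.dm_os_tasks AS t
JOIN sys.dm_exec_sessions AS s
    ON s.session_id = t.session_id
WHERE s.is_user_process = 1
GROUP BY t.session_id
ORDER BY task_count DESC;

-- Which execution contexts of each session are waiting on CXPACKET.
SELECT session_id,
       exec_context_id,
       wait_type,
       wait_duration_ms,
       blocking_session_id
FROM sys.dm_os_waiting_tasks
WHERE wait_type = 'CXPACKET'
ORDER BY session_id, exec_context_id;
```

Comparing `task_count` for the session against the MAXDOP value, alongside the actual execution plan, should show whether the threads are spread across several parallel portions of the plan rather than a single one.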
Does anyone have some info I can use to dig deeper into this issue?
Thanks,
-Ryan