Perfmon looks happy, but I get the impression that when my server does physical IO, performance suffers even more than I'd expect.
I'm in a new shop now where all SQL Servers are running in VMs on different physical servers. The apps are modest, and so are the databases. Buffer cache hit ratio stays around 100%, and page life expectancy (PLE) is usually sky-high but sometimes dips toward 300 seconds.
Yet some queries run really slowly, say 30 seconds, the first time, presumably doing some physical IO, and then speed up to 3 seconds if rerun immediately. If I wait even a few minutes, they're slow again. It doesn't seem to be system load: there's hardly ever more than one user running an active transaction at any moment, and there's very little blocking (oh, a little, but we're working on it!).
I'm still suspicious anytime I'm in a VM, so the question is: can running in a VM impose real overhead that is just plain invisible from inside Windows and SQL Server, except as longer durations? Are there any other counters or indicators to look at?
So far I haven't got any feedback from the hypervisor/SAN/platform guys.
Thanks.
Josh