I have two Windows Server 2003 boxes running MSCS as a SQL Server failover cluster. SQL Server recently crashed on one node, and I cannot figure out why. I looked at the errorlog and at the generated MDMP, but some of this is outside my comfort zone. Here is the crash dump analysis:
**********
Microsoft (R) Windows Debugger Version 6.2.9200.16384 X86
Copyright (c) Microsoft Corporation. All rights reserved.
Loading Dump File [C:\tsdrop\SQL dump\SQLDump0002.mdmp]
Comment: 'Stack Trace'
Comment: 'ex_terminator - Last chance exception handling'
User Mini Dump File: Only registers, stack and portions of memory are available
Symbol search path is: SRV*C:\symbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows Server 2003 Version 3790 (Service Pack 2) MP (7 procs) Free x86 compatible
Product: Server, suite: Enterprise TerminalServer SingleUserTS
Machine Name:
Debug session time: Fri Jan 4 11:49:54.000 2013 (UTC - 5:00)
System Uptime: not available
Process Uptime: 15 days 0:10:55.000
................................................................
................
Loading unloaded module list
.........
This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(1e88.1ee8): Unknown exception - code 000042ac (first/second chance not available)
eax=0000007c ebx=00000000 ecx=00000000 edx=001d0688 esi=0000114c edi=00000000
eip=7c82845c esp=02e9d880 ebp=02e9d8f0 iopl=0 nv up ei ng nz ac pe cy
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000297
ntdll!KiFastSystemCallRet:
7c82845c c3 ret
**********
This didn't give me anything obvious, so I tried .ecxr:
**********
0:000> .ecxr
*** WARNING: Unable to verify timestamp for sqlservr.exe
eax=02e9de08 ebx=00000440 ecx=02a3f7fc edx=02a3f7fc esi=00000000 edi=027fab38
eip=77e4bef7 esp=02e9de04 ebp=02e9de58 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
kernel32!RaiseException+0x53:
77e4bef7 5e pop esi
**********
Then I tried !analyze:
**********
0:000> !analyze
*******************************************************************************
*
*
* Exception Analysis
*
*
*
*******************************************************************************
Use !analyze -v to get detailed debugging information.
*** WARNING: Unable to verify timestamp for msvcr80.dll
Probably caused by : sqlservr.exe ( sqlservr!CVariableInfo::CVarBlock::PvbJoin+5c )
Followup: MachineOwner
---------
**********
And finally, !analyze -v:
**********
0:000> !analyze -v
*******************************************************************************
*
*
* Exception Analysis
*
*
*
*******************************************************************************
FAULTING_IP:
sqlservr!CVariableInfo::CVarBlock::PvbJoin+5c
01004bc6 ?? ???
EXCEPTION_RECORD: 02e9efb0 -- (.exr 0x2e9efb0)
ExceptionAddress: 01004bc6 (sqlservr!CVariableInfo::CVarBlock::PvbJoin+0x0000005c)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 00000001
Parameter[1]: b18316c8
Attempt to write to address b18316c8
DEFAULT_BUCKET_ID: APPLICATION_FAULT
PROCESS_NAME: sqlservr.exe
ERROR_CODE: (NTSTATUS) 0x42ac - <Unable to get error code text>
EXCEPTION_CODE: (Win32) 0x42ac (17068) - <Unable to get error code text>
NTGLOBALFLAG: 0
APP: sqlservr.exe
CONTEXT: 02e9efcc -- (.cxr 0x2e9efcc)
eax=330e1e18 ebx=00000000 ecx=330e1c18 edx=b18316c0 esi=b1a6fc3e edi=330e0000
eip=01004bc6 esp=02e9f298 ebp=02e9f2ac iopl=0 nv up ei ng nz ac pe cy
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010297
sqlservr!CVariableInfo::CVarBlock::PvbJoin+0x5c:
01004bc6 ?? ???
Resetting default scope
WRITE_ADDRESS: b18316c8
FOLLOWUP_IP:
sqlservr!CVariableInfo::CVarBlock::PvbJoin+5c
01004bc6 ?? ???
FAILED_INSTRUCTION_ADDRESS:
sqlservr!CVariableInfo::CVarBlock::PvbJoin+5c
01004bc6 ?? ???
IP_ON_HEAP: 330e1e18
FRAME_ONE_INVALID: 1
LAST_CONTROL_TRANSFER: from 330e1e18 to 01004bc6
FAULTING_THREAD: ffffffff
PRIMARY_PROBLEM_CLASS: APPLICATION_FAULT
BUGCHECK_STR: APPLICATION_FAULT_APPLICATION_FAULT
STACK_TEXT:
02e9f298 01004bc6 sqlservr!CVariableInfo::CVarBlock::PvbJoin+0x5c
02e9f29c 330e1e18 unknown!unknown+0x0
02e9f2b4 0100450b sqlservr!CVarPageMgr::Release+0x16
02e9f2b8 330e1e20 unknown!unknown+0x0
02e9f2d0 01008988 sqlservr!CMemThread::Free+0x4f
02e9f2f0 0121daf4 sqlservr!CSqlHashBkt::PurgeDeadSql+0xf2
02e9f30c 0121da38 sqlservr!CSqlHashBkt::ReleaseSql+0x64
02e9f32c 01234205 sqlservr!CCompPlan::~CCompPlan+0xbd
02e9f39c 01234363 sqlservr!CCompPlan::`scalar deleting destructor'+0xd
02e9f3a8 010b048b sqlservr!CCacheObject::Release+0x36
02e9f3b8 01248827 sqlservr!CCacheObject::Destroy+0x30
02e9f3c8 01248995 sqlservr!SOS_CacheStore::RemoveDescriptor+0x1fd
02e9f420 012489de sqlservr!SOS_CacheStore::CacheEntryDescriptor::Destroy+0x40
02e9f440 024205e4 sqlservr!ClockHand::Move+0x38c
02e9f570 02420b6b sqlservr!ClockAlgorithm::MoveHand+0x43
02e9f594 02420a76 sqlservr!ClockAlgorithm::ProcessTick+0x149
02e9f5ec 0241f945 sqlservr!SOS_CacheStore::Notify+0x3f
02e9f604 0242536a sqlservr!ResourceMonitor::NotifyMemoryConsumers+0x2a4
02e9f6c8 02424f44 sqlservr!ResourceMonitor::ResourceMonitorTask+0x190
02e9f76c 021a9ef8 sqlservr!SetupResourceMonitorTaskContext+0x44e
02e9fd54 010067d3 sqlservr!SOS_Task::Param::Execute+0xe2
02e9fdc4 010068f9 sqlservr!SOS_Scheduler::RunTask+0xb9
02e9fdf8 01006609 sqlservr!SOS_Scheduler::ProcessTasks+0x141
02e9fe38 010daf6c sqlservr!SchedulerManager::WorkerEntryPoint+0x1ad
02e9fea0 010dae8c sqlservr!SystemThread::RunWorker+0x7f
02e9feb8 010dab54 sqlservr!SystemThreadDispatcher::ProcessWorker+0x246
02e9ff18 010dacf1 sqlservr!SchedulerManager::ThreadEntryPoint+0x143
02e9ff80 781329bb msvcr80!_callthreadstartex+0x1b
02e9ffb8 78132a47 msvcr80!_threadstartex+0x66
02e9ffc0 77e6482f kernel32!BaseThreadStart+0x34
STACK_COMMAND: .cxr 02E9EFCC ; kb ; dps 2e9f298 ; kb
SYMBOL_STACK_INDEX: 0
SYMBOL_NAME: sqlservr!CVariableInfo::CVarBlock::PvbJoin+5c
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: sqlservr
IMAGE_NAME: sqlservr.exe
DEBUG_FLR_IMAGE_TIMESTAMP: 492b676e
FAILURE_BUCKET_ID: APPLICATION_FAULT_42ac_sqlservr.exe!CVariableInfo::CVarBlock::PvbJoin
BUCKET_ID: APPLICATION_FAULT_APPLICATION_FAULT_BAD_IP_sqlservr!CVariableInfo::CVarBlock::PvbJoin+5c
WATSON_STAGEONE_URL: http://watson.microsoft.com/StageOne/sqlservr_exe/2005_90_4035_0/492b676e/sqlservr_exe/2005_90_4035_0/492b676e/42ac/011bbce0.htm?Retriage=1
Followup: MachineOwner
---------
**********
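To make a few of the raw fields above easier to read, here is a small Python sketch that decodes them. It only reinterprets values copied from the output above (the exception code, the image timestamp, the Watson URL, and the two access violation parameters). The function names are my own and purely illustrative, and I am assuming the usual conventions: the image timestamp is seconds since the Unix epoch, and the first access violation parameter is 0 for a read, 1 for a write, and 8 for an execute (DEP) fault.

```python
from datetime import datetime, timezone

# Values copied verbatim from the !analyze -v output above.
EXCEPTION_CODE_HEX = "42ac"
IMAGE_TIMESTAMP_HEX = "492b676e"
WATSON_URL = (
    "http://watson.microsoft.com/StageOne/sqlservr_exe/2005_90_4035_0/"
    "492b676e/sqlservr_exe/2005_90_4035_0/492b676e/42ac/011bbce0.htm?Retriage=1"
)
AV_PARAMETERS = (0x00000001, 0xB18316C8)  # Parameter[0], Parameter[1]


def hex_to_int(text):
    """Convert a bare hex string such as '42ac' to an integer."""
    return int(text, 16)


def image_timestamp_to_utc(text):
    """Interpret DEBUG_FLR_IMAGE_TIMESTAMP as seconds since the Unix epoch (assumption)."""
    return datetime.fromtimestamp(hex_to_int(text), tz=timezone.utc)


def watson_version(url, image="sqlservr_exe"):
    """Return the version tuple that follows the image name in the Watson URL."""
    parts = url.split("/")
    version_segment = parts[parts.index(image) + 1]
    return tuple(int(piece) for piece in version_segment.split("_"))


def describe_access_violation(params):
    """Describe a c0000005 record from its two parameters (access type, address)."""
    kinds = {0: "read from", 1: "write to", 8: "execute at"}
    access, address = params
    return f"{kinds.get(access, 'unknown access to')} 0x{address:08x}"


if __name__ == "__main__":
    print("Exception code (decimal):", hex_to_int(EXCEPTION_CODE_HEX))
    print("sqlservr.exe link time (UTC):", image_timestamp_to_utc(IMAGE_TIMESTAMP_HEX))
    print("sqlservr.exe file version:", ".".join(map(str, watson_version(WATSON_URL))))
    print("Access violation:", describe_access_violation(AV_PARAMETERS))
```

The decimal exception code comes out as 17068, matching the EXCEPTION_CODE line, and the version segment reads 2005.90.4035.0, which I believe corresponds to SQL Server 2005 build 9.00.4035 (SP3).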
I've analyzed Windows STOP errors before (the DMP files that end up in the Minidump folder), but that is all I have used Windows Debugger for. I've never had to analyze a SQL Server crash until now. Any help would be much appreciated.
Sr System Engineer | Vision One IT Consulting | www.v1corp.com