Monday, February 15, 2010

Crash in unloaded module

Recently we received a customer report that our application sometimes crashes shortly after Windows starts up.
After analyzing the crash dump file with WinDbg, we found that the crash occurred in a third-party DLL.

The following is the output from WinDbg, where Channel.dll is the third-party DLL.

=====================================================================
0:019> !analyze -v

FAULTING_IP:
CHANNEL+9d2f
017a9d2f 0000 add byte ptr [eax],al

EXCEPTION_RECORD: ffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 017a9d2f (+0x00009d2f)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 00000000
Parameter[1]: 017a9d2f
Attempt to read from address 017a9d2f

DEFAULT_BUCKET_ID: WRONG_SYMBOLS

PROCESS_NAME: MyApp.exe

MODULE_NAME: CHANNEL

FAULTING_MODULE: 7c900000 ntdll

DEBUG_FLR_IMAGE_TIMESTAMP: 48a9711f

ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".

READ_ADDRESS: 017a9d2f

BUGCHECK_STR: ACCESS_VIOLATION

LAST_CONTROL_TRANSFER: from 00000000 to 00000000

STACK_TEXT:
0018a168 00000000 00000000 00000000 00000000 0x0


FAULTING_THREAD: 000006c4

FAILED_INSTRUCTION_ADDRESS:
CHANNEL+9d2f
017a9d2f 0000 add byte ptr [eax],al

FOLLOWUP_IP:
CHANNEL+9d2f
017a9d2f 0000 add byte ptr [eax],al

SYMBOL_NAME: CHANNEL+9d2f

FOLLOWUP_NAME: MachineOwner
=====================================================================

So far, we know that the crash happens at address 0x017a9d2f, which lies in the address range of an unloaded DLL.
Looking at the loaded module list, we see that the module has moved from 0x017a0000~0x017c6000 to 0x02fb0000~0x02fd6000.
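The address check above can be reproduced with a few lines of arithmetic. The following is only a sketch: the constants are copied from the WinDbg output and the module list in this post (assuming the old range starts at 0x017a0000), and the helper names `contains` and `offset_in_module` are made up for illustration.

```python
# Addresses taken from the WinDbg output and the module list in this post.
FAULT_ADDRESS = 0x017A9D2F
OLD_RANGE = (0x017A0000, 0x017C6000)  # where Channel.dll used to be mapped
NEW_RANGE = (0x02FB0000, 0x02FD6000)  # where Channel.dll is mapped now


def contains(module_range, address):
    """Return True if address lies inside the half-open range [start, end)."""
    start, end = module_range
    return start <= address < end


def offset_in_module(module_range, address):
    """Return the offset of address from the module's base address."""
    return address - module_range[0]


def image_size(module_range):
    """Return the size of the mapped range in bytes."""
    return module_range[1] - module_range[0]


if __name__ == "__main__":
    print("in old mapping:", contains(OLD_RANGE, FAULT_ADDRESS))
    print("in new mapping:", contains(NEW_RANGE, FAULT_ADDRESS))
    print("offset in old mapping: 0x%x" % offset_in_module(OLD_RANGE, FAULT_ADDRESS))
    print("size old/new: 0x%x / 0x%x" % (image_size(OLD_RANGE), image_size(NEW_RANGE)))
```

The faulting address belongs only to the old mapping, its offset matches the CHANNEL+9d2f shown by WinDbg, and both ranges have the same size, which fits the picture of the same DLL being loaded a second time at a different base address.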

Then I debugged our application and found the following:
1. MyApp.exe tries to open the channel by calling the function Channel_open() in channel.dll, with a timeout of 2 s. If the call fails, it closes the channel and tries again.
2. The DLL is loaded into the process after the call to Channel_open() and unloaded after Channel_close(). (The DLL is linked as a load-time DLL. For more information about load-time DLLs, see MSDN: http://msdn.microsoft.com/en-us/library/ms684184(VS.85).aspx.)
3. Three threads start running inside the DLL after it is loaded.
4. One thread exits immediately after the DLL is unloaded (by the call to Channel_close()); the other two exit a short time later.
5. The process loads the DLL again by calling Channel_open(). The DLL might be loaded at the same address as before, or somewhere else.

This gave us the reason for the crash: at least one of the DLL's threads has not finished exiting by the time the DLL is unloaded. The thread then continues executing at an address that no longer holds the DLL's code, which raises the access violation. This happens especially when
the system is busy, because the time slice that the system assigns to the thread is not long enough for the thread's exit code to finish executing.
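The race can be modelled without any Windows API. The sketch below is an analogy, not the third-party DLL's code: the `Channel` class, its `loaded` flag and the sleep durations are all hypothetical, and "unloading" is modelled by simply clearing a flag. It contrasts closing the channel without waiting for the workers against joining them first. In a real Windows program the corresponding step would presumably be waiting on the worker thread handles before the DLL is freed.

```python
import threading
import time


class Channel:
    """Hypothetical stand-in for the DLL: owns worker threads and a 'loaded' flag."""

    def __init__(self, worker_count=3):
        self._stop = threading.Event()
        self._lock = threading.Lock()
        self.loaded = True             # models "the DLL is still mapped"
        self.touched_after_unload = 0  # workers that ran after the "unload"
        self._workers = [
            threading.Thread(target=self._run) for _ in range(worker_count)
        ]
        for worker in self._workers:
            worker.start()

    def _run(self):
        # Normal work loop: poll until asked to stop.
        while not self._stop.is_set():
            time.sleep(0.001)
        # Exit path: simulate slow cleanup on a busy system.
        time.sleep(0.05)
        if not self.loaded:
            # In the real process this would be code running in unmapped memory.
            with self._lock:
                self.touched_after_unload += 1

    def join_workers(self):
        """Block until every worker thread has finished."""
        for worker in self._workers:
            worker.join()

    def close(self, wait_for_workers=True):
        """Ask the workers to stop, optionally wait for them, then 'unload'."""
        self._stop.set()
        if wait_for_workers:
            self.join_workers()
        self.loaded = False  # models unloading the DLL


if __name__ == "__main__":
    unsafe = Channel()
    unsafe.close(wait_for_workers=False)  # unload while workers still run
    unsafe.join_workers()
    print("unsafe close, late workers:", unsafe.touched_after_unload)

    safe = Channel()
    safe.close()  # join first, unload afterwards
    print("safe close, late workers:", safe.touched_after_unload)
```

With the unsafe variant the workers are still in their exit path when the flag is cleared, so they observe the "unloaded" state. With the safe variant every worker has finished before the flag changes.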
