Published by OpenTask, Republic of Ireland
Copyright © 2009 by OpenTask
Memory Dump and Trace Analysis: A Unified Pattern Approach © Dmitry Vostokov
Artifacts of Inline User Mode Heap Analysis © Aditya K Sood
Colorimetric Tracing: A Visual Approach to Tracking Function Calls © Dmitry Vostokov
JIT Stack Trace Script and Library © Thomas Monahan and Dmitry Vostokov
All other articles in this issue © Dmitry Vostokov
All rights reserved.
Debugged! MZ/PE: MagaZine for/from Practicing Engineers
Volume 1, Issue 3, September 2009
Memory Dump and Trace Analysis: A Unified Pattern Approach
Dmitry Vostokov, 25th September 2009 (revised 10th November 2009)
http://www.dumpanalysis.org
Memory dump analysis alone (static troubleshooting and postmortem debugging) or software trace analysis alone (dynamic troubleshooting and postmortem debugging) is not always sufficient to resolve customer problems. Sometimes the combination of both methods provides the best result, as the following synthetic case study shows.
The statistical application was consuming 50% of CPU time on a 2 processor machine (Spiking Thread pattern[1]). The forced user memory dump revealed normal stack traces (Stack Trace Collection pattern[2], only main thread stack trace is shown here):
0:000> ~*kL
. 0 Id: 17c0.17c4 Suspend: 1 Teb: 7ffdf000 Unfrozen
ChildEBP RetAddr
WARNING: Stack unwind information not available. Following frames may be wrong.
0012faf8 004788ac ntdll!KiFastSystemCallRet
0012fb70 00905a2c Application+0x788ac
0012fb90 00905d94 Application+0x505a2c
0012fb98 00458bb2 Application+0x505d94
0012fbbc 00458f9d Application+0x58bb2
0012fd24 004587e7 Application+0x58f9d
0012fd40 0045b8d9 Application+0x587e7
0012fdb4 0042a4a6 Application+0x5b8d9
0012fdcc 7739b6e3 Application+0x2a4a6
0012fdf8 7739b874 user32!InternalCallWinProc+0x28
0012fe70 7739ba92 user32!UserCallWinProcCheckWow+0x151
0012fed8 773a16e5 user32!DispatchMessageWorker+0x327
0012fee8 00478868 user32!DispatchMessageA+0xf
0012ff54 009cf0f6 Application+0x78868
0012ffc0 77e6f23b Application+0x5cf0f6
0012fff0 00000000 kernel32!BaseProcessStart+0x23
[…]
Unfortunately thread times statistics didn’t show which thread was spiking:
0:000> !runaway f
User Mode Time
Thread Time
0:17c4 0 days 0:00:06.968
1:17d8 0 days 0:00:00.031
2:17e8 0 days 0:00:00.015
6:1090 0 days 0:00:00.000
5:dc0 0 days 0:00:00.000
4:588 0 days 0:00:00.000
3:17fc 0 days 0:00:00.000
Kernel Mode Time
Thread Time
0:17c4 0 days 0:00:02.562
2:17e8 0 days 0:00:00.031
6:1090 0 days 0:00:00.000
5:dc0 0 days 0:00:00.000
4:588 0 days 0:00:00.000
3:17fc 0 days 0:00:00.000
1:17d8 0 days 0:00:00.000
Elapsed Time
Thread Time
0:17c4 0 days 0:14:16.454
1:17d8 0 days 0:14:16.266
2:17e8 0 days 0:14:15.704
3:17fc 0 days 0:14:11.204
4:588 0 days 0:14:09.876
5:dc0 0 days 0:13:59.297
6:1090 0 days 0:11:11.202
We requested an ETW (Event Tracing for Windows) trace for all major components to see the dynamics of the system. The trace had a sudden gap of 5 minutes without any tracing activity (Discontinuity pattern[3]). Before the gap there was significant software hook activity for 40 seconds. It was visible from the very start of the tracing, so it could have begun earlier. After the silent gap there was further hook activity until the end of the tracing (about 10 seconds). We didn't know whether the gap coincided with the time when the application did its calculations or whether the calculations were done before or after it. This is an example of the repeated activity (Periodic Error[4] and Characteristic Message Block[5] patterns):
# Module PID TID Time Message
[…]
6870 HookA 2592 5476 08:24:34.437 Attempt to open CLSID\{ … } in User hive
6871 HookA 2592 5476 08:24:34.437 User hive open failed: 2
6872 HookA 2592 5476 08:24:34.437 Attempt to open of Software\Classes\CLSID\{ … } in System hive
6873 HookA 2592 5476 08:24:34.437 Hooked CoCreateInstanceEx: Calling original CoCreateInstanceEx with original CLSID
[…]
Unfortunately, not all possible components had been selected for tracing, and some common ones were obviously missing. We therefore requested a new full trace together with a simultaneous full user dump forced within the first minutes of CPU spiking.
The new memory dump revealed a new thread that was missing in the first dump (Missing Thread pattern[6]):
7 Id: 17c0.1238 Suspend: 1 Teb: 7ffd5000 Unfrozen
*** ERROR: Symbol file could not be found. Defaulted to export symbols for HookB.dll -
ChildEBP RetAddr
WARNING: Stack unwind information not available. Following frames may be wrong.
0461f738 658d1e58 ntdll!KiFastSystemCallRet
0461fe20 658d22c9 HookB+0x1e58
0461fe4c 0040cadd HookB+0x22c9
0461fe84 005c9467 Application+0xcadd
0461ff04 005ff7f5 Application+0x1c9467
0461ff44 0066a5f1 Application+0x1ff7f5
0461ff74 00428f7b Application+0x26a5f1
0461ffa4 00404f02 Application+0x28f7b
0461ffb8 77e6482f Application+0x4f02
0461ffec 00000000 kernel32!BaseThreadStart+0x34
Thread time statistics revealed that it was the thread that had spent the most time in user and kernel modes:
0:000> !runaway f
User Mode Time
Thread Time
7:1238 0 days 0:00:07.687
0:17c4 0 days 0:00:06.062
1:17d8 0 days 0:00:00.031
6:1090 0 days 0:00:00.000
5:dc0 0 days 0:00:00.000
4:588 0 days 0:00:00.000
3:17fc 0 days 0:00:00.000
2:17e8 0 days 0:00:00.000
Kernel Mode Time
Thread Time
7:1238 0 days 0:01:03.437
0:17c4 0 days 0:00:01.609
2:17e8 0 days 0:00:00.015
6:1090 0 days 0:00:00.000
5:dc0 0 days 0:00:00.000
4:588 0 days 0:00:00.000
3:17fc 0 days 0:00:00.000
1:17d8 0 days 0:00:00.000
Elapsed Time
Thread Time
0:17c4 0 days 0:05:40.454
1:17d8 0 days 0:05:40.266
2:17e8 0 days 0:05:39.704
3:17fc 0 days 0:05:35.204
4:588 0 days 0:05:33.876
5:dc0 0 days 0:05:23.297
6:1090 0 days 0:02:35.202
7:1238 0 days 0:01:13.936
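The !runaway figures above can be turned into a rough CPU share per thread by dividing user plus kernel time by elapsed time. The following is a minimal sketch of that arithmetic only. The names RunawayTime, to_seconds and cpu_share are hypothetical helpers introduced for illustration and are not part of any debugger API.

```cpp
#include <cassert>

// A time value as printed by !runaway: days, hours, minutes, fractional seconds.
// (Hypothetical helper type, introduced for this illustration only.)
struct RunawayTime {
    int days;
    int hours;
    int minutes;
    double seconds;
};

// Converts a !runaway time value to seconds.
double to_seconds(const RunawayTime& t) {
    return ((t.days * 24.0 + t.hours) * 60.0 + t.minutes) * 60.0 + t.seconds;
}

// Fraction of one processor consumed by a thread over its lifetime:
// (user mode time + kernel mode time) / elapsed time.
double cpu_share(const RunawayTime& user, const RunawayTime& kernel,
                 const RunawayTime& elapsed) {
    const double total = to_seconds(elapsed);
    assert(total > 0.0);  // a thread that has existed for no time has no share
    return (to_seconds(user) + to_seconds(kernel)) / total;
}
```

For thread 7 this gives (7.687 + 63.437) / 73.936, or about 96% of one processor, whereas thread 0 gives (6.062 + 1.609) / 340.454, or about 2%. On a 2 processor machine a thread saturating one processor accounts for the observed 50% total CPU consumption.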
Using symbol files from the vendor we got a more refined stack trace:
7 Id: 17c0.1238 Suspend: 1 Teb: 7ffd5000 Unfrozen
ChildEBP RetAddr Args to Child
0461f6f4 7c827369 7c839cfe 0461f778 02000000 ntdll!KiFastSystemCallRet
0461f6f8 7c839cfe 0461f778 02000000 0461f714 ntdll!ZwOpenKey+0xc
0461f738 658d1e58 02000000 0461f778 00000000 ntdll!RtlOpenCurrentUser+0x44
0461fe20 658d22c9 658d5018 658d5314 00000000 HookB!GetTimeZone+0xf8
0461fe4c 0040cadd 0461fe60 005fb5be 040fc3e4 HookB!GetLocalTime+0x89
WARNING: Stack unwind information not available. Following frames may be wrong.
0461fe84 005c9467 0461fe98 005c9661 0461ff04 Application+0xcadd
0461ff04 005ff7f5 0461ff4c 005ff810 0461ff44 Application+0x1c9467
0461ff44 0066a5f1 0461ff58 0066a5fb 0461ff74 Application+0x1ff7f5
0461ff74 00428f7b 0461ff88 00428f85 0461ffa4 Application+0x26a5f1
0461ffa4 00404f02 0461ffdc 00404a80 0461ffb8 Application+0x28f7b
0461ffb8 77e6482f 03a19418 00000000 00000000 Application+0x4f02
0461ffec 00000000 00404ed8 03a19418 00000000 kernel32!BaseThreadStart+0x34
The ETW trace confirmed HookB module involvement. We saw a massive HookB trace statement current of almost 30,000 msg/s and almost 99% trace statement density (Statement Density and Current pattern[7]):
# Module PID TID Time Message
[…]
555398 HookB 4276 744 13:09:15.319 Time zone registry key Software\Application\TimeZone\2
555399 HookB 4276 744 13:09:15.320 Use time zone: Western Europe Standard Time
555400 HookB 4276 744 13:09:15.320 GetLocalTime (Application.exe) returns 14 hours 9 minutes 15 seconds
555401 HookB 4276 744 13:09:15.320 GetLocalTime (Application.exe) calling GetTimeZone
[…]
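To make the two measures concrete, here is a minimal sketch of how they might be computed from a parsed trace. It assumes that statement current means messages per second over the traced interval and that statement density means the share of all messages coming from one module. The TraceMessage type and both function names are hypothetical and are not taken from any tracing tool.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One parsed trace line (hypothetical type for this illustration).
struct TraceMessage {
    std::string module;  // e.g. "HookB"
    double seconds;      // timestamp in seconds from an arbitrary origin
};

// Statement current: messages per second between the first and the last
// message. Assumes the trace is ordered by time.
double statement_current(const std::vector<TraceMessage>& trace) {
    if (trace.size() < 2) return 0.0;
    const double span = trace.back().seconds - trace.front().seconds;
    return span > 0.0 ? static_cast<double>(trace.size()) / span : 0.0;
}

// Statement density: fraction of all messages emitted by the given module.
double statement_density(const std::vector<TraceMessage>& trace,
                         const std::string& module) {
    if (trace.empty()) return 0.0;
    std::size_t hits = 0;
    for (const TraceMessage& m : trace)
        if (m.module == module) ++hits;
    return static_cast<double>(hits) / static_cast<double>(trace.size());
}
```

A density close to 1 for a single module together with a current of tens of thousands of messages per second is the kind of signature described above for HookB.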
Because this case study was synthesized from many real-world examples, it is not possible to provide its memory dump files and software trace logs. However, to let readers experiment, we have created a toy model problem. It is an application called MissingSpike, created and compiled in Visual C++ 2008 Express Edition. The source code is very simple:
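#include <windows.h>
#include <process.h>
#include <tchar.h>

// Spins for one minute, emitting a trace message on every iteration
// to simulate high trace current and density.
void spiking_thread(void *param)
{
    DWORD dwStart = GetTickCount();
    while (GetTickCount() - dwStart < 1000*60)
    {
        OutputDebugStringW(L"Hello Computation!");
    }
}

// The main thread starts the spiking thread and then traces every 100 ms.
int _tmain(int argc, _TCHAR* argv[])
{
    _beginthread(spiking_thread, 0, NULL);
    while (true)
    {
        Sleep(100);
        OutputDebugStringW(L"Hello Debugger!");
    }
    return 0;
}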
The main thread creates the second one and the latter exits after 1 minute of CPU consumption. Both threads do software tracing. The second thread does it much more often to simulate high trace current and density.
If we run the application it starts consuming 50% CPU on a 2 processor machine. If we force a user dump after a few minutes we get this stack trace:
0:000:x86> ~*kL
. 0 Id: 14d8.16a4 Suspend: 0 Teb: 7efdb000 Unfrozen
ChildEBP RetAddr
003ffc0c 76c61380 ntdll_77170000!NtDelayExecution+0x15
003ffc74 76c60c88 kernel32!SleepEx+0x62
003ffc84 00a51064 kernel32!Sleep+0xf
003ffc94 00a511d6 MissingSpike!wmain+0x24
003ffcd8 76cdeccb MissingSpike!__tmainCRTStartup+0x10f
003ffce4 771ed24d kernel32!BaseThreadInitThunk+0xe
003ffd24 771ed45f ntdll_77170000!__RtlUserThreadStart+0x23
003ffd3c 00000000 ntdll_77170000!_RtlUserThreadStart+0x1b
If we didn’t have access to source code it would have been a complete surprise for us to see the absence of spiking threads:
0:000:x86> !runaway
User Mode Time
Thread Time
0:16a4 0 days 0:00:00.000
0:000:x86> !runaway f
User Mode Time
Thread Time
0:16a4 0 days 0:00:00.000
Kernel Mode Time
Thread Time
0:16a4 0 days 0:00:00.031
Elapsed Time
Thread Time
0:16a4 0 days 0:10:49.639
We also ran DebugView before dumping the process and recorded the following debugging output:
00000188 125.66216278 [5336] Hello Debugger!
00000189 125.76219177 [5336] Hello Debugger!
00000190 125.86218262 [5336] Hello Debugger!
00000191 125.96219635 [5336] Hello Debugger!
00000192 126.06215668 [5336] Hello Debugger!
00000193 126.16222382 [5336] Hello Debugger!
00000194 126.26220703 [5336] Hello Debugger!
00000195 126.36220551 [5336] Hello Debugger!
00000196 126.46221161 [5336] Hello Debugger!
00000197 126.56221008 [5336] Hello Debugger!
00000198 126.66223907 [5336] Hello Debugger!
00000199 126.76223755 [5336] Hello Debugger!
00000200 126.86223602 [5336] Hello Debugger!
00000201 126.96223450 [5336] Hello Debugger!
00000202 127.06224823 [5336] Hello Debugger!
00000203 127.16225433 [5336] Hello Debugger!
00000204 127.26225281 [5336] Hello Debugger!
00000205 127.36226654 [5336] Hello Debugger!
00000206 127.46225739 [5336] Hello Debugger!
00000207 127.56227875 [5336] Hello Debugger!
00000208 127.66227722 [5336] Hello Debugger!
00000209 127.76228333 [5336] Hello Debugger!
00000210 127.86228943 [5336] Hello Debugger!
00000211 127.96231079 [5336] Hello Debugger!
00000212 128.06227112 [5336] Hello Debugger!
00000213 128.16230774 [5336] Hello Debugger!
00000214 128.26232910 [5336] Hello Debugger!
00000215 128.36231995 [5336] Hello Debugger!
00000216 128.46232605 [5336] Hello Debugger!
00000217 128.56236267 [5336] Hello Debugger!
00000218 128.66233826 [5336] Hello Debugger!
00000219 128.76234436 [5336] Hello Debugger!
00000220 128.86239624 [5336] Hello Debugger!
00000221 128.96234131 [5336] Hello Debugger!
00000222 129.06233215 [5336] Hello Debugger!
00000223 129.16236877 [5336] Hello Debugger!
00000224 129.26237488 [5336] Hello Debugger!
00000225 129.36238098 [5336] Hello Debugger!
00000226 129.46238708 [5336] Hello Debugger!
00000227 129.56239319 [5336] Hello Debugger!
00000228 129.66239929 [5336] Hello Debugger!
00000229 129.76242065 [5336] Hello Debugger!
Next time we tried to save a user dump as soon as Task Manager showed high CPU consumption:

The dump revealed the new thread:
0:000:x86> ~*kL
. 0 Id: d5c.1708 Suspend: 0 Teb: 7efdb000 Unfrozen
ChildEBP RetAddr
0035f4ac 76c61270 ntdll_77170000!ZwWaitForSingleObject+0x15
0035f51c 76c611d8 kernel32!WaitForSingleObjectEx+0xbe
0035f530 76c81961 kernel32!WaitForSingleObject+0x12
0035f794 76cc9060 kernel32!OutputDebugStringA+0xef
0035f7b4 00a5106b kernel32!OutputDebugStringW+0x41
0035f7c4 00a511d6 MissingSpike!wmain+0x2b
0035f808 76cdeccb MissingSpike!__tmainCRTStartup+0x10f
0035f814 771ed24d kernel32!BaseThreadInitThunk+0xe
0035f854 771ed45f ntdll_77170000!__RtlUserThreadStart+0x23
0035f86c 00000000 ntdll_77170000!_RtlUserThreadStart+0x1b
1 Id: d5c.d58 Suspend: 0 Teb: 7efd8000 Unfrozen
ChildEBP RetAddr
00c7f6c0 76c61270 ntdll_77170000!ZwWaitForSingleObject+0x15
00c7f730 76c611d8 kernel32!WaitForSingleObjectEx+0xbe
00c7f744 76c83204 kernel32!WaitForSingleObject+0x12
00c7f9a8 76cc9060 kernel32!OutputDebugStringA+0x1ba
00c7f9c8 00a51027 kernel32!OutputDebugStringW+0x41
00c7f9dc 6ed932a3 MissingSpike!spiking_thread+0x27
00c7fa14 6ed9332b msvcr90!_callthreadstart+0x1b
00c7fa1c 76cdeccb msvcr90!_threadstart+0x5d
00c7fa28 771ed24d kernel32!BaseThreadInitThunk+0xe
00c7fa68 771ed45f ntdll_77170000!__RtlUserThreadStart+0x23
00c7fa80 00000000 ntdll_77170000!_RtlUserThreadStart+0x1b
Now we see that thread 1 was the CPU consumer:
0:000:x86> !runaway f
User Mode Time
Thread Time
1:d58 0 days 0:00:04.102
0:1708 0 days 0:00:00.000
Kernel Mode Time
Thread Time
1:d58 0 days 0:00:06.598
0:1708 0 days 0:00:00.031
Elapsed Time
Thread Time
0:1708 0 days 0:00:49.505
1:d58 0 days 0:00:49.494
Less than a minute had passed. After one minute the process becomes quiet from the CPU consumption perspective:

The recorded DebugView trace has 246,100 lines in just one minute. Here is a fragment:
00245186 59.96057129 [3420] Hello Computation!
00245187 59.96061707 [3420] Hello Computation!
00245188 59.96066284 [3420] Hello Computation!
00245189 59.96070862 [3420] Hello Computation!
00245190 59.96075058 [3420] Hello Computation!
00245191 59.96082306 [3420] Hello Computation!
00245192 59.96085739 [3420] Hello Debugger!
00245193 59.96089554 [3420] Hello Computation!
00245194 59.96093750 [3420] Hello Computation!
00245195 59.96098328 [3420] Hello Computation!
00245196 59.96102905 [3420] Hello Computation!
00245197 59.96106720 [3420] Hello Computation!
00245198 59.96111298 [3420] Hello Computation!
00245199 59.96115494 [3420] Hello Computation!
00245200 59.96119690 [3420] Hello Computation!
00245201 59.96124268 [3420] Hello Computation!
The MissingSpike application and its symbols can be downloaded from:
http://www.dumpanalysis.org/Debugged/Sep09/download/MissingSpike.zip
Artifacts of Inline User Mode Heap Analysis
Aditya K Sood, 19th of August, 2009
This paper sheds light on the prerequisites for performing efficient user mode heap analysis. It derives the internal concepts needed to analyze user mode heaps appropriately, irrespective of any component dependencies. Inline user mode heap analysis requires detailed, component-based knowledge of system functionality. Dumps provide an ample amount of information about the system state at the time of a crash. The main task is to scrutinize and dissect the memory structures in order to show how the software behaved when it crashed. Efficient analysis of heap dumps provides information about the real cause of a crash. If the crash is due to a persistent vulnerability in the software or system, the analysis also helps in determining the exploitability of the underlying bug.
Acknowledgements: I would like to thank Mr. Dmitry Vostokov for sharing his patterns and for his helpful comments during the completion of this paper.
Nature of Process and Heap
Whenever a process is created or loaded into main memory, a heap is allocated to it. A process can have one or more memory heaps. There are a number of call stacks in the process, and heap memory is allocated for every single call stack. The major point of analysis in dumping user mode heaps is to find the allocation pattern used for memory, i.e. to analyze the real memory statistics. In this technique, the emphasis is not on generating stack traces for various function calls but on finding the allocated memory structures. Before going further, let’s analyze the architecture that divides memory between kernel and user mode.
It provides an overview of memory objects and the implementation peripherals. Let’s have a look at the generic points as mentioned below:
1. This architecture is system specific and depends on whether a 32 bit or 64 bit operating system is used. The address space varies according to the operating system version. 32 bit systems have 4GB of virtual address space, whereas 64 bit systems can have up to 16TB of virtual address space. Mostly, the address space layout is considered an architectural constraint.
2. By default, the virtual address space is divided equally between user mode and kernel mode. Administrators can alter this split based on application requirements. For example, by default 32 bit systems have 2GB of address space each for user mode and kernel mode, but this can be changed to 3GB for user mode and 1GB for kernel mode. User mode runs a number of applications, and the virtual address space in use varies with the size of an application.
3. In order to reduce memory constraints, this step of increasing address space for user mode is preferred. Generally, the system code has standard base addresses defined for a number of system functions. These internal functions are located throughout at the same base address. Due to this reason the alterations are undertaken. If you remember, then ASLR is introduced for making addresses randomized even for user mode. Of course, it reduces the extent of exploitation of internal structures. One factor leads to another. Virtual memory is a kind of flexible memory allocated to applications for robust running.
4. The logical addresses are mapped to real addresses by the hardware of the system in cooperation with the operating system. As soon as the application is loaded into memory, the logical address space is divided into fixed-size chunks called pages. Actually, the virtual memory is considered as secondary memory and the primary memory is hardware specific.
5. A continuous mapping is performed between primary and secondary memory for a running application. The whole process is dynamic. When an application references certain addresses in virtual memory, the specific pages are loaded into main memory, whereas the pages that are not referenced remain in secondary memory. The active pages of virtual memory used by an application are called its working set. The execution of any application running as a process depends on its working set. Performance of heavy applications depends on the hardware too. A heavy memory operation puts considerable pressure on the I/O mechanism of the system. The real memory defines the number of bits that are associated with a memory address.

The above points provide a structural component dependency and working behavior with respect to operating system.
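The address space figures in points 1, 2 and 4 can be checked with a few lines of arithmetic. This is a minimal sketch only: the helper names are hypothetical, and the 4KB page size used in the usage note is an assumption (it is the usual small page size on x86, but the text above does not state a page size).

```cpp
#include <cstdint>

constexpr std::uint64_t KiB = 1024;
constexpr std::uint64_t GiB = 1024 * 1024 * 1024;

// Size in bytes of an address space addressed with the given number of
// bits (valid for bits < 64).
constexpr std::uint64_t address_space_bytes(unsigned bits) {
    return std::uint64_t{1} << bits;
}

// Number of fixed-size pages needed to cover a region, rounding up.
constexpr std::uint64_t page_count(std::uint64_t bytes, std::uint64_t page_size) {
    return (bytes + page_size - 1) / page_size;
}

// User/kernel division of a virtual address space (hypothetical helper type).
struct Split {
    std::uint64_t user;
    std::uint64_t kernel;
};

// Gives user_bytes to user mode and the remainder to kernel mode.
constexpr Split split_address_space(std::uint64_t total, std::uint64_t user_bytes) {
    return Split{user_bytes, total - user_bytes};
}
```

With 32 address bits, address_space_bytes(32) is 4GB. The default split is split_address_space(4 * GiB, 2 * GiB), and the altered one is split_address_space(4 * GiB, 3 * GiB), which leaves 1GB for kernel mode. Under the 4KB page assumption, a 2GB user mode region is covered by page_count(2 * GiB, 4 * KiB) = 524,288 pages.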
Global Flags – System Wide Dump Image Settings
System wide debugging, tracing and dump analysis depend on the configuration of global flags. These are the standard flags that define the behavior of operating system related to crashes or error generation. There are a number of processes that are activated simultaneously. In a complex environment, the system software is prone to hard crashes. In order to circumvent this situation or to handle the rogue conditions successfully, the system should be configured with Instant Debugging Checks. The setting of global flags results in effective debugging and dump analysis.
The global flags are set on different images. Every single image has a different set of global flags, because different images correspond to different processes. So there must be a different procedure to handle the different images based on their sphere of functionality. This makes debugging more selective because it is easy to debug a single image by setting its global flags. The image will be handled according to the flags that are set for it. WinDbg provides the extension (bang) command [!gflag]. This command lists the current system wide global flag settings. Basically, the NtGlobalFlag variable is queried for the overall information. The steps to be followed for core analysis are:
1. The debugging checks are to be implemented with additional global flags, because certain flags are not enabled by default. This creates an intrinsic problem when debugging analysis has to be done. It is often noticed that analyzing kernel level dumps is hard. The real cause is not the operating system but debugging parameters that are not configured appropriately. As a result, the information in the dumps cannot be dissected efficiently, which makes the work hard for the reverse engineer. So it is advisable to understand the purpose of each flag and the parameters required for it.
Let’s have a look at the snapshot for different Global flag settings:

2. Always specify the image for debugging process in a unique manner. The working curvature of the process should be undertaken prior to initiating debugging. This means the reverse engineer should analyze the working functionality of threads. It provides information related to the working modes i.e. whether the threads are spending time in user mode or kernel mode. Basically, try to find the interdependencies by performing cross functional analysis. This favors the process of setting global flags.
3. The registry settings play a critical role in robust debugging. The setting of global flags has a direct impact on the system registry. That’s why altering the registry in this manner is considered very critical. A simple mistake or wrong configuration can lead to irrecoverable losses, because it affects the kernel state directly. That’s why one sees BSOD, “HAL missing or corrupt” and similar messages when something bad happens at the kernel level. So the registry should be modified carefully.
4. When the global flags are set for HEAP operations, then it is defined for User Mode. The same structural implementation in kernel mode is done with POOL operation. So, when the flags are configured for pool operations, it is implemented for kernel level operations.
GFlags is also used to create the User Mode Stack Trace Database. This setting makes Windows capture stack traces of allocations so that different heaps can be analyzed.
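Since the global flags form a bit mask, a flags value can be decoded by testing individual bits. The sketch below covers only a small heap-related subset, using the mask values and three-letter abbreviations from the GFlags flag table. The GlobalFlag type and the decode_heap_flags function are hypothetical names introduced for this illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// One global flag: bit mask, GFlags abbreviation, short description.
struct GlobalFlag {
    std::uint32_t mask;
    const char* abbreviation;
    const char* description;
};

// A heap-related subset of the global flags (not the complete table).
const GlobalFlag kHeapFlags[] = {
    {0x00000010, "htc", "Enable heap tail checking"},
    {0x00000020, "hfc", "Enable heap free checking"},
    {0x00000040, "hpc", "Enable heap parameter checking"},
    {0x00000080, "hvc", "Enable heap validation on call"},
    {0x00001000, "ust", "Create user mode stack trace database"},
    {0x02000000, "hpa", "Enable page heap"},
};

// Returns the abbreviations of the heap-related flags set in value,
// in ascending mask order.
std::vector<std::string> decode_heap_flags(std::uint32_t value) {
    std::vector<std::string> names;
    for (const GlobalFlag& flag : kHeapFlags)
        if (value & flag.mask)
            names.push_back(flag.abbreviation);
    return names;
}
```

For example, a value of 0x70 decodes to htc, hfc and hpc, and 0x02001000 decodes to ust and hpa, i.e. a user mode stack trace database combined with page heap.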
Pseudo Registers
Pseudo as the name suggests is not exactly what it seems. A pseudo register is not taken as the hardware register but it works like that i.e. it holds the functionality of a hardware register. This register helps you to traverse the debugger for specific values.
· First of all, fire your debugger for active debugging of any process
· Set a breakpoint in the code
· Set the watch window and put an entry of a pseudo register
· Analyze the breakpoint
Example:
@ERR is the defined pseudo register. It is placed in the watch window. Its initial value is 0, and it reflects the code that the GetLastError() function would return. So, when an analyst steps through the debugged code and a fault occurs, the value changes accordingly.
Let’s look into one example:
// Prototype of the Win32 OpenFile function (declared via <windows.h>)
HFILE OpenFile( LPCSTR lpFileName,
                LPOFSTRUCT lpReOpenBuff,
                UINT uStyle );
The prototype of OpenFile is shown above. If a conditional breakpoint is set and the code is executed, the debugger checks the value of the pseudo register. If the current value of @ERR matches the specified value, the breakpoint triggers. If we call the function with the name of a file that does not exist, we get error number 2 (ERROR_FILE_NOT_FOUND) in response. It means that opening the file failed because the specified file could not be found. This turns out to be useful for directly checking the functions being called. Pseudo registers are reliable for conditional debugging and, in general, are effective in scrutinizing the return status of individual modules.
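#include <windows.h>
#include <psapi.h>   // link with Psapi.lib
#include <stdio.h>

// Prints the base module name and the ID of the given process.
void PrintProcessNameAndID( DWORD processID )
{
    char szProcessName[MAX_PATH] = "unknown";
    HANDLE hProcess =
        OpenProcess( PROCESS_QUERY_INFORMATION |
                     PROCESS_VM_READ, FALSE,
                     processID );
    if ( NULL == hProcess )
        return;   // GetLastError() (and @ERR) holds the failure code
    HMODULE hMod;
    DWORD cbNeeded;
    if ( EnumProcessModules( hProcess, &hMod,
                             sizeof(hMod), &cbNeeded ) )
    {
        GetModuleBaseNameA( hProcess, hMod,
                            szProcessName, sizeof(szProcessName) );
        printf( "%s (Process ID: %lu)\n", szProcessName,
                processID );
    }
    CloseHandle( hProcess );   // close the handle on every path
}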
Now, we select a breakpoint and set its condition to @ERR==2 (or to any other GetLastError() value). When the breakpoint is reached, the condition is checked against the current value of the @ERR pseudo register. If the specified error value matches, the debugger breaks the execution flow there and displays the register values. If the @ERR value does not match, the debugger will not break into the application even when some other error has occurred.
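The idea of inspecting a per-thread last-error value right after a failing call can be tried outside a debugger too. The following is a portable analogue and not the Win32 mechanism itself: it uses the C runtime's errno instead of GetLastError(), and the function name last_error_after_open is hypothetical. On POSIX systems fopen() sets errno to ENOENT when the file does not exist, which plays the same role as ERROR_FILE_NOT_FOUND in the discussion above.

```cpp
#include <cerrno>
#include <cstdio>

// Tries to open a file for reading. Returns 0 on success, or the errno
// value captured immediately after the failed call.
int last_error_after_open(const char* path) {
    errno = 0;
    std::FILE* file = std::fopen(path, "r");
    if (file == nullptr) {
        // Capture the value at once: any later library call may overwrite it.
        const int error = errno;
        return error;
    }
    std::fclose(file);
    return 0;
}
```

Comparing the returned value with ENOENT in code corresponds to what the @ERR==2 breakpoint condition does interactively: execution is singled out only when one specific error code is present, and other failures pass unnoticed.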
The list of other pseudo registers is mentioned below:
@TIB = Thread information block for the current thread; necessary because the debugger doesn't handle the "FS:0" format
@CLK = Undocumented clock register; usable only in the Watch window
@EAX, @EBX, @ECX, @EDX, @ESI, @EDI, @EIP, @ESP, @EBP, @EFL = Intel CPU registers
@CS, @DS, @ES, @SS, @FS, @GS = Intel CPU segment registers
@ST0, @ST1, @ST2, @ST3, @ST4, @ST5, @ST6, @ST7 = Intel CPU floating-point registers
All these registers play a crucial role in the user mode dump analysis process.
Practical User Mode Heap Constructs
The above stated facts crystallize the important points taken into consideration while analyzing user mode heaps. The procedure can be implemented as:
1. Process Specific Heap Analysis – Dynamic
2. Log Specific Heap Analysis – Static
The analyst captures the pointer related information by active debugging during the process for Heap analysis. There is no restriction on the number of heaps to be analyzed, rather all the heaps structured by the process will be scrutinized in a detailed manner.
The information mentioned below is the most critical in analyzing user mode heaps:
· Size of Allocation
· Size of Overhead
· Pointer to the Structures
· Allocated Stacks
· Active Memory Allocation – Definitive Heaps
Before doing a generic analysis, some steps should be followed to optimize the analysis:
1. Always define the Page Heap Verification (Full) for the image which is to be analyzed
2. Always use the command [!analyze -v] in user mode to obtain the maximum information about the crash.
3. Look at the exception record with the [.exr] command.
4. Analyze memory corruption instances by checking the integrity of the image with [!chkimg].
5. Look up address information in the loaded modules with the [ln] command.
6. Set conditional breakpoints during dynamic analysis of the process with the [bp] command.
7. Always try to test the pseudo registers with breakpoints.
Tools in Practice
To implement this strategy, the preferred tools, apart from OllyDbg and IDA Pro, are listed below. They are designed specifically for heap analysis:
1. UMDH by Microsoft, for analyzing heap dumps on Windows platforms[8].
2. HDUMP, an open source tool for analyzing Java heap dumps. It requires the JPROF Java profiler to provide the data[9].
3. userdump.exe is also a good tool for collecting dumps of active processes.
4. gflags.exe is used to set the global flags for an image whose heaps are to be analyzed.
5. One can design custom tools by using the Heap APIs.
The above-mentioned tools are used effectively for tracing.
Conclusion
Analyzing user mode heaps in an effective manner requires a structural and hierarchical approach for looking at a specific set of information out of memory dumps. If the information is scrutinized by applying well defined methods, then the benchmarks result in effective outcomes. So, in order to critically examine the user mode heap dumps, the artifacts should be cleared and applied considering the dependencies of different components.
About Author
Aditya K Sood works as a Senior Security Researcher at Vulnerability Research Labs, COSEINC. He is also the founder of SecNiche Security, an independent security research arena for cutting edge research. He has more than 6 years of experience in the security world. He holds a BE and an MS in Cyber Law and Information Security. He is an active speaker at conferences like EuSecwest, XCON, Troopers, XKungfoo, OWASP, Clubhack, and CERT-IN. He has written for the journals Hakin9, BCS, Usenix and Elsevier. His work has been quoted in eWeek, SCMagazine and ZDNet. He has provided a number of advisories to leading companies.
Colorimetric Tracing: A Visual Approach to Tracking Function Calls
Dmitry Vostokov, 20th September 2009 (revised 27th November 2009)
http://www.dumpanalysis.org
Sometimes we need to know whether a function was called or not. The traditional non-invasive approach, which avoids setting and triggering debugger breakpoints, is to use diagnostic software tracing and record a message when program execution enters a function. The author applied the method of colorimetric computer memory dating[10] to record entry into the function prolog. The idea is to allocate a static buffer for every function we want to trace and fill it with a characteristic RGB pattern upon entrance. Here is the sample code created for testing purposes:
#include <windows.h>
#include <process.h>
#include <stdio.h>
#include <tchar.h>

typedef unsigned int ISOMEMOTOPE;

#define COLORED_STACK_SIZE 0x35000
#define RED   0x00FF0000
#define GREEN 0x0000FF00
#define BLUE  0x000000FF

// Fills a per-function static buffer with the given color on entry.
#define COLORED_PROLOG(rgbaValue) { \
    static ISOMEMOTOPE \
        coloredStack[COLORED_STACK_SIZE]; \
    for (int i = 0; i < COLORED_STACK_SIZE; ++i) \
        coloredStack[i] = (rgbaValue); }

#define SLEEP_TIME 10*1000

void fooR()
{
    COLORED_PROLOG(RED);
}

void fooG()
{
    COLORED_PROLOG(GREEN);
}

void fooB()
{
    COLORED_PROLOG(BLUE);
}

// The fill color is passed as a parameter.
void barP(void *pData)
{
    COLORED_PROLOG((ISOMEMOTOPE)(UINT_PTR)pData);
}

void thread_second(void *)
{
    barP((void *)0x00FFFF00);   // yellow
    Sleep(SLEEP_TIME);
}

int _tmain(int argc, _TCHAR* argv[])
{
    puts("Time to dump: 1\n");
    Sleep(SLEEP_TIME);
    fooR();
    puts("Time to dump: 2\n");
    Sleep(SLEEP_TIME);
    fooB();
    puts("Time to dump: 3\n");
    Sleep(SLEEP_TIME);
    barP((void *)-1);           // white
    puts("Time to dump: 4\n");
    Sleep(SLEEP_TIME);
    barP(NULL);                 // black (clears the buffer)
    puts("Time to dump: 5\n");
    Sleep(SLEEP_TIME);
    fooG();
    _beginthread(thread_second, 0, NULL);
    Sleep(SLEEP_TIME/10);
    puts("Time to dump: 6\n");
    Sleep(SLEEP_TIME);
    return 0;
}
We also saved a process memory dump each time the running program printed a “Time to dump” message (we used Windows Vista Task Manager):


Then the Dump2Picture[11] tool was used to convert the resulting memory dumps into bitmap files. The first process dump shows the large preallocated zero-filled coloredStack arrays grouped together into a big black region in the center:

The second process dump was saved just after the fooR function call that filled one such static array with a red pattern:
The third process memory dump was saved after the fooB function call that filled another static array with a blue pattern:

The 4th memory dump was saved after the barP function call that took its fill pattern as a function parameter, in our case -1, i.e. RGB (255, 255, 255) or white color:
The 5th process dump was saved after the barP function was called with another fill parameter, this time 0, and it cleared the white region:
The 6th memory dump was saved after the fooG function call that filled another memory buffer with a green color pattern and after the second thread called the barP function with a yellow color fill pattern:
It is also possible to extend this approach to record different colors, for example, RGB (0,0,1), RGB (0,0,2), RGB (0,0,3), ... to show visually how often a function was called.
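The call counting extension can be sketched in portable C++ as follows. This is only one possible reading of the idea, under stated assumptions: the blue channel holds the number of calls so far and saturates at 255, the buffer is smaller than the one in the sample code above, and the names COUNTED_STACK_SIZE, counted_prolog and fooCounted are hypothetical. The sketch is not thread safe.

```cpp
#include <cstddef>
#include <cstdint>

typedef std::uint32_t ISOMEMOTOPE;

// Deliberately smaller than COLORED_STACK_SIZE in the article's sample.
const std::size_t COUNTED_STACK_SIZE = 0x1000;

// Fills the buffer with RGB (0, 0, calls), where calls is the number of
// times the owning function has been entered (saturating at 255).
// Returns the last element of the buffer after the fill.
template <std::size_t N>
ISOMEMOTOPE counted_prolog(ISOMEMOTOPE (&buffer)[N], unsigned& calls) {
    if (calls < 0xFF)
        ++calls;
    const ISOMEMOTOPE color = calls;  // blue occupies the low byte
    for (std::size_t i = 0; i < N; ++i)
        buffer[i] = color;
    return buffer[N - 1];
}

// Example traced function: every call makes its region one shade bluer.
ISOMEMOTOPE fooCounted() {
    static ISOMEMOTOPE coloredStack[COUNTED_STACK_SIZE];
    static unsigned calls = 0;
    return counted_prolog(coloredStack, calls);
}
```

After three calls the region owned by fooCounted holds RGB (0, 0, 3). Low counts are visually close to black, so in practice a larger step per call might be needed for the difference to be visible in a picture of the dump.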
Idea MiniDumps
A note from the editor: this is a new section for recording ideas that were planned to materialize in full “memory dump” mode (full articles) in the current magazine issue but, due to various space and time constraints, remain minidumps (idea descriptions) only and are left for subsequent issues.
JIT Stack Trace Script and Library
Dmitry Vostokov and Thomas Monahan, 5th August 2009
http://www.citrix.com
Here we record the idea of a special just-in-time library that saves current stack traces whenever we need them during process execution. The same goes for a WinDbg script, already written by one of the authors, that sets a breakpoint which appends a stack trace to a log file when triggered. We promise to publish the full source code in upcoming magazine issues.
SoftWeet Library
Dmitry Vostokov, 28th September 2009
http://www.softweet.com
|
T |