You use the SOS debugger extension’s !U command.
Yeah, that’s not very helpful. Here’s how to get from that terse description to something useful.
As an example, let’s use the following trivial program:
using System; public class Simple { public static int Main() { Console.WriteLine( "Hello World!" ); Console.ReadLine(); return 0; } }
Dump this snippet into Notepad, save as simple.cs file and compile it. We want to see what it’ll be like when the user runs it, so we’ll do a release build. From the VS2005 Command Prompt, run csc /debug:pdbonly /o+ simple.cs
. (I find it’s quicker to do command-line compiles for simple test programs than firing up Visual Studio. It’s not as if we’re designing a Form.) If you’re doing this on a 64–bit computer, like me, add /platform:x86
so we get consistent output across platforms. (Marking as x86 means that the executable headers are set to indicate a 32–bit program. The IL is the same whatever you set here, but if you leave it as anycpu
, the default, the .NET Framework will create a 64–bit process which obviously gives different output.)
Why call Console.ReadLine? Well, the JIT compiler is helpful (or at least it thinks it is). If you start a program under the debugger, it will generate less optimized code because it thinks you want to debug it, but that changes the code from what the user will see. So I want to add a stop point in the program so I can attach the debugger after the process has started.
The next thing you’ll need is the Debugging Tools for Windows kit. Grab the 32–bit kit – the 64–bit kit can debug 32–bit processes, but SOS (the debugger extension DLL which implements the commands we’re going to use) doesn’t appear to work. We’re going to use WinDBG, which is slightly friendlier than the other debuggers although not a lot!
Open WinDBG. Run simple.exe and, when it’s waiting for input, go to File/Attach to a Process in WinDBG. You’ll see a list of other processes (and if running Windows Vista with UAC enabled, or on XP as a standard user, a load of access denied errors). Select simple.exe from the list and hit OK. WinDBG automatically stops once you’ve attached to a process so you can start manipulating the program straight away. This is the output I got:
Microsoft (R) Windows Debugger Version 6.8.0004.0 X86 Copyright (c) Microsoft Corporation. All rights reserved. *** wait with pending attach Symbol search path is: C:\Windows;C:\Windows\System32;C:\Windows\SysWOW64;SRV*C:\WebSymbols*http://msdl.microsoft.com/download/symbols Executable search path is: ModLoad: 008c0000 008c8000 C:\Users\Mike\Documents\Programming\simple.exe ModLoad: 776a0000 77800000 C:\Windows\SysWOW64\ntdll.dll ModLoad: 79000000 79046000 C:\Windows\system32\mscoree.dll ModLoad: 75b70000 75c80000 C:\Windows\syswow64\KERNEL32.dll ModLoad: 76ce0000 76da6000 C:\Windows\syswow64\ADVAPI32.dll ModLoad: 76e10000 76f00000 C:\Windows\syswow64\RPCRT4.dll ModLoad: 75850000 758b0000 C:\Windows\syswow64\Secur32.dll ModLoad: 75a80000 75ad8000 C:\Windows\syswow64\SHLWAPI.dll ModLoad: 77110000 771a0000 C:\Windows\syswow64\GDI32.dll ModLoad: 77040000 77110000 C:\Windows\syswow64\USER32.dll ModLoad: 76c30000 76cda000 C:\Windows\syswow64\msvcrt.dll ModLoad: 75d00000 75d60000 C:\Windows\system32\IMM32.DLL ModLoad: 76940000 76a08000 C:\Windows\syswow64\MSCTF.dll ModLoad: 75a70000 75a79000 C:\Windows\syswow64\LPK.DLL ModLoad: 759c0000 75a3d000 C:\Windows\syswow64\USP10.dll ModLoad: 746c0000 7485e000 C:\Windows\WinSxS\x86_microsoft.windows.common-controls_6595b64144ccf1df_6.0.6001.18000_none_5cdbaa5a083979cc\comctl32.dll ModLoad: 79e70000 7a3ff000 C:\Windows\Microsoft.NET\Framework\v2.0.50727\mscorwks.dll ModLoad: 75390000 7542b000 C:\Windows\WinSxS\x86_microsoft.vc80.crt_1fc8b3b9a1e18e3b_8.0.50727.1434_none_d08b6002442c891f\MSVCR80.dll ModLoad: 75e30000 7693f000 C:\Windows\syswow64\shell32.dll ModLoad: 771b0000 772f4000 C:\Windows\syswow64\ole32.dll ModLoad: 790c0000 79bf6000 C:\Windows\assembly\NativeImages_v2.0.50727_32\mscorlib\5b3e3b0551bcaa722c27dbb089c431e4\mscorlib.ni.dll ModLoad: 79060000 790b6000 C:\Windows\Microsoft.NET\Framework\v2.0.50727\mscorjit.dll (1dc.12b4): Break instruction exception - code 80000003 (first chance) eax=7efaf000 ebx=00000000 ecx=00000000 edx=7770d2d4 esi=00000000 edi=00000000 eip=776b0004 esp=04c6fa00 ebp=04c6fa2c iopl=0 nv up ei pl zr na pe nc cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246 ntdll!DbgBreakPoint: 776b0004 cc int 3
You enter your debugging commands in the edit box at the bottom, next to the prompt 0:003>. This means we’re debugging the 0th process we’re attached to, and our commands by default apply to thread number 3. Four threads? Well, thread 0 is our main thread, thread 1 was created by the CLR’s debugging support in case we attached a debugger, thread 2 is the finalizer thread, and thread 3 was just created by WinDBG so it could stop the process safely. (You can see this by running ~* k, although you’ll need to be set up to get debugging symbols from the symbol server to get good stack traces.) The Debugging Tools debuggers automatically stop the process when you attach, so that you can start manipulating the process straight away.
The first thing we need to do is ask WinDBG to load the SOS extension. This is installed with the CLR, so we ask it to load from the same folder that mscorwks.dll (the DLL which implements the virtual machine, effectively the guts of the CLR) lives in:
0:003> .loadby sos.dll mscorwks
Now we can see what state of the managed threads are in by running !threads. Any command beginning ! comes from a debugger extension DLL, and they can be disambiguated if necessary, but they’re searched in reverse order loaded (last loaded = first searched) so anything we want will be found in SOS anyway.
Right. To see what we can do, run !help.
0:003> !help ------------------------------------------------------------------------------- SOS is a debugger extension DLL designed to aid in the debugging of managed programs. Functions are listed by category, then roughly in order of importance. Shortcut names for popular functions are listed in parenthesis. Type "!help" for detailed info on that function. Object Inspection Examining code and stacks ----------------------------- ----------------------------- DumpObj (do) Threads DumpArray (da) CLRStack DumpStackObjects (dso) IP2MD DumpHeap U DumpVC DumpStack GCRoot EEStack ObjSize GCInfo FinalizeQueue EHInfo PrintException (pe) COMState TraverseHeap BPMD Examining CLR data structures Diagnostic Utilities ----------------------------- ----------------------------- DumpDomain VerifyHeap EEHeap DumpLog Name2EE FindAppDomain SyncBlk SaveModule DumpMT GCHandles DumpClass GCHandleLeaks DumpMD VMMap Token2EE VMStat EEVersion ProcInfo DumpModule StopOnException (soe) ThreadPool MinidumpMode DumpAssembly DumpMethodSig Other DumpRuntimeTypes ----------------------------- DumpSig FAQ RCWCleanupList DumpIL
To find out a little more about !U, let’s run !help U:
0:003> !help U
-------------------------------------------------------------------------------
!U [-gcinfo] [-ehinfo] <MethodDesc address> | <Code address>
Presents an annotated disassembly of a managed method when given a MethodDesc
pointer for the method, or a code address within the method body. Unlike the
debugger "U" function, the entire method from start to finish is printed,
with annotations that convert metadata tokens to names.
...
03ef015d b901000000 mov ecx,0x1
03ef0162 ff156477a25b call dword ptr [mscorlib_dll+0x3c7764 (5ba27764)] (System.Console.InitializeStdOutError(Boolean), mdToken: 06000713)
03ef0168 a17c20a701 mov eax,[01a7207c] (Object: SyncTextWriter)
03ef016d 89442414 mov [esp+0x14],eax
If you pass the -gcinfo flag, you'll get inline display of the GCInfo for
the method. You can also obtain this information with the !GCInfo command.
If you pass the -ehinfo flag, you'll get inline display of exception info
for the method. (Beginning and end of try/finally/catch handlers, etc.).
You can also obtain this information with the !EHInfo command.
Right, so we need the address of a MethodDesc structure, or the address of the code itself. How can we get one of these? There’s a command called DumpMD…
0:003> !help dumpmd ------------------------------------------------------------------------------- !DumpMDThis command lists information about a MethodDesc. You can use !IP2MD to turn a code address in a managed function into a MethodDesc: 0:000> !dumpmd 902f40 Method Name: Mainy.Main() Class: 03ee1424 MethodTable: 009032d8 mdToken: 0600000d Module: 001caa78 IsJitted: yes m_CodeOrIL: 03ef00b8 If IsJitted is "yes," you can run !U on the m_CodeOrIL pointer to see a disassembly of the JITTED code. You can also call !DumpClass, !DumpMT, !DumpModule on the Class, MethodTable and Module fields above.
We’re getting a bit closer. Let’s try DumpModule…
0:003> !help DumpModule ------------------------------------------------------------------------------- !DumpModule [-mt]You can get a Module address from !DumpDomain, !DumpAssembly and other functions. Here is sample output: 0:000> !dumpmodule 1caa50 Name: C:\pub\unittest.exe Attributes: PEFile Assembly: 001ca248 LoaderHeap: 001cab3c TypeDefToMethodTableMap: 03ec0010 TypeRefToMethodTableMap: 03ec0024 MethodDefToDescMap: 03ec0064 FieldDefToDescMap: 03ec00a4 MemberRefToDescMap: 03ec00e8 FileReferencesMap: 03ec0128 AssemblyReferencesMap: 03ec012c MetaData start address: 00402230 (1888 bytes) The Maps listed map metadata tokens to CLR data structures. Without going into too much detail, you can examine memory at those addresses to find the appropriate structures. For example, the TypeDefToMethodTableMap above can be examined: 0:000> dd 3ec0010 03ec0010 00000000 00000000 0090320c 0090375c 03ec0020 009038ec ... This means TypeDef token 2 maps to a MethodTable with the value 0090320c. You can run !DumpMT to verify that. The MethodDefToDescMap takes a MethodDef token and maps it to a MethodDesc, which can be passed to !DumpMD. There is a new option "-mt", which will display the types defined in a module, and the types referenced by the module. For example: 0:000> !dumpmodule -mt 1aa580 Name: C:\pub\unittest.exe ... ... MetaData start address: 0040220c (1696 bytes) Types defined in this module MT TypeDef Name ------------------------------------------------------------------------------ 030d115c 0x02000002 Funny 030d1228 0x02000003 Mainy Types referenced in this module MT TypeRef Name ------------------------------------------------------------------------------ 030b6420 0x01000001 System.ValueType 030b5cb0 0x01000002 System.Object 030fceb4 0x01000003 System.Exception 0334e374 0x0100000c System.Console 03167a50 0x0100000e System.Runtime.InteropServices.GCHandle 0336a048 0x0100000f System.GC
We still need a Module to pass to this command, but we’re getting there. Maybe DumpDomain can help us? I’ll skip the help text this time.
0:003> !DumpDomain -------------------------------------- System Domain: 7a3bc8b8 LowFrequencyHeap: 7a3bc8dc HighFrequencyHeap: 7a3bc934 StubHeap: 7a3bc98c Stage: OPEN Name: None -------------------------------------- Shared Domain: 7a3bc560 LowFrequencyHeap: 7a3bc584 HighFrequencyHeap: 7a3bc5dc StubHeap: 7a3bc634 Stage: OPEN Name: None Assembly: 0052f848 -------------------------------------- Domain 1: 00516800 LowFrequencyHeap: 00516824 HighFrequencyHeap: 0051687c StubHeap: 005168d4 Stage: OPEN SecurityDescriptor: 00517d30 Name: simple.exe Assembly: 0052f848 [C:\Windows\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll] ClassLoader: 0050e470 SecurityDescriptor: 0051f5d0 Module Name 790c2000 C:\Windows\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll Assembly: 00535ac0 [C:\Users\Mike\Documents\Programming\simple.exe] ClassLoader: 0050e630 SecurityDescriptor: 00535a38 Module Name 000f2c3c C:\Users\Mike\Documents\Programming\simple.exe
At last, we have the address of something we can use. Let’s get the module information for simple.exe. For that we need the address next to it in the module list for the simple.exe assembly.
0:003> !dumpmodule -mt f2c3c Name: C:\Users\Mike\Documents\Programming\simple.exe Attributes: PEFile Assembly: 00535ac0 LoaderHeap: 00000000 TypeDefToMethodTableMap: 000f0038 TypeRefToMethodTableMap: 000f0040 MethodDefToDescMap: 000f005c FieldDefToDescMap: 000f0068 MemberRefToDescMap: 000f006c FileReferencesMap: 000f0088 AssemblyReferencesMap: 000f008c MetaData start address: 008c206c (740 bytes) Types defined in this module MT TypeDef Name ------------------------------------------------------------------------------ 000f3030 0x02000002 Simple Types referenced in this module MT TypeRef Name ------------------------------------------------------------------------------ 790fd0f0 0x01000001 System.Object 79101118 0x01000006 System.Console
Now we can get the MethodTable for the Simple class:
0:003> !dumpmt -md f3030 EEClass: 000f1174 Module: 000f2c3c Name: Simple mdToken: 02000002 (C:\Users\Mike\Documents\Programming\simple.exe) BaseSize: 0xc ComponentSize: 0x0 Number of IFaces in IFaceMap: 0 Slots in VTable: 6 -------------------------------------- MethodDesc Table Entry MethodDesc JIT Name 79371278 7914b928 PreJIT System.Object.ToString() 7936b3b0 7914b930 PreJIT System.Object.Equals(System.Object) 7936b3d0 7914b948 PreJIT System.Object.GetHashCode() 793624d0 7914b950 PreJIT System.Object.Finalize() 00310070 000f3020 JIT Simple.Main() 000fc01c 000f3028 NONE Simple..ctor()
Finally we have something we can pass to !U. Note the JIT column. Here, PreJIT means that the code has come from a native image generated by ngen, JIT means that the JIT compiler in this process has generated the code, and NONE means that the method hasn’t been compiled yet. We didn’t override any of System.Object’s virtual methods, so they occupy the first four places in our MethodTable.
I placed the call to Console.ReadLine at the end of the program so everything has already been compiled – remember, the JIT compiles code on-demand, one method at a time, when that method is called (barring very simple methods which might be inlined into their callers). Let’s just get on and show the code for Main.
0:003> !U f3020 Normal JIT generated code Simple.Main() Begin 00310070, size 36 00310070 833d8c10920300 cmp dword ptr ds:[392108Ch],0 00310077 750a jne 00310083 00310079 b901000000 mov ecx,1 0031007e e8e1580579 call mscorlib_ni+0x2a5964 (79365964) (System.Console.InitializeStdOutError(Boolean), mdToken: 06000770) 00310083 8b0d8c109203 mov ecx,dword ptr ds:[392108Ch] (Object: System.IO.TextWriter+SyncTextWriter) 00310089 8b153c309203 mov edx,dword ptr ds:[392303Ch] ("Hello World!") 0031008f 8b01 mov eax,dword ptr [ecx] 00310091 ff90d8000000 call dword ptr [eax+0D8h] 00310097 e84c750d79 call mscorlib_ni+0x3275e8 (793e75e8) (System.Console.get_In(), mdToken: 0600076e) 0031009c 8bc8 mov ecx,eax 0031009e 8b01 mov eax,dword ptr [ecx] 003100a0 ff5064 call dword ptr [eax+64h] 003100a3 33c0 xor eax,eax 003100a5 c3 ret
Interesting. The JIT clearly sees Console.WriteLine(string) and Console.ReadLine as trivial and has inlined the calls. In turn it’s inlined Console.get_Out. The reference to SyncTextWriter isn’t the JIT being clever – it’s SOS telling us that the address 0x0392108C is the base of SyncTextWriter’s vtable, as this call to TextWriter.WriteLine(string) is virtual.
I realise, and hope, that this isn’t everyday usage. More often than not you’ll have crashed somewhere and want to find out where (for which see !IP2MD). But it can be useful to see just how your C# code turns into machine instructions.