An interesting detail about Control Flow Guard
On Windows systems before Windows 8.1 Update 3, C code calling a function pointer was compiled to a simple “call register” instruction; for example, in a 32-bit process:
Starting with Windows 8.1 Update 3, in all system libraries, it is more complicated:
mov ecx, esi
This is how the Control Flow Guard (CFG) feature of the VC++ compiler works. Briefly, the ___guard_check_icall_fptr function checks whether the call target is a legal location: if it is not the beginning of a nonstatic function, ___guard_check_icall_fptr aborts execution. Therefore, an attacker who controls a function pointer cannot immediately reach arbitrary code (say, an “xchg eax, esp; ret” stack pivot gadget, which is always in the middle of a function). This significantly complicates the exploitation of vulnerabilities based on heap corruption. All libraries and executables shipped with recent Windows systems (notably Internet Explorer) benefit from this protection.
- call a legal function that allows you to hijack control flow, e.g. NtContinue
- transit via code that is not compiled with CFG, e.g. Flash JIT.
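The target check described above can be pictured with a minimal C sketch. This is only a model: the real CFG implementation consults a bitmap maintained by the loader, whereas the array of entry points, the addresses in it, and the name guard_check_model below are all hypothetical stand-ins chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical list of legal indirect-call targets (function entry points).
   Assumption: the real CFG uses a loader-maintained bitmap, not an array. */
static const uint32_t valid_targets[] = { 0x1000u, 0x1040u, 0x10c0u };

/* Returns true only if `target` is the first byte of a listed function.
   The real check aborts the process instead of returning false. */
static bool guard_check_model(uint32_t target)
{
    for (size_t i = 0; i < sizeof valid_targets / sizeof valid_targets[0]; i++) {
        if (valid_targets[i] == target)
            return true;
    }
    return false;
}
```

In this model, guard_check_model(0x1040u) succeeds because 0x1040 is a function start, while guard_check_model(0x1044u) fails: a gadget in the middle of a function is never a listed entry point.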
Now, after the above introduction, we are getting to the point of this post.
In some Windows 10 Technical Preview DLLs, a function call is compiled into even more complicated assembly:
mov ebx, esp
mov ecx, edi
cmp ebx, esp
jz short loc_6380DEF5
mov ecx, 4
/* code after function call */
This code saves the stack pointer in the ebx register, performs the ___guard_check_icall_fptr check and the actual function call, and then verifies that the stack pointer is unchanged (if it has changed, int 29h terminates the process). Why is this extra effort with esp checking needed?
Most likely, the answer is: to fix another CFG bypass method. I have not seen any explanation explicitly related to VC++ CFG (in particular, this detail is not covered in the papers mentioned above), but a closely related technique can be found in the excellent “Out of Control” paper. In this paper, the authors were able to bypass another solution that imposed restrictions on the control flow, by reaching the following code (simplified for readability):
call eax [*] ; eax controlled by an attacker
The value of eax at the moment of the call had been checked to be in a certain set of functions (so no call into the middle of a function is allowed). The problem was that the set of allowed functions included both stdcall functions (which remove the arguments from the stack in their epilogue) and cdecl functions (which do not remove arguments from the stack). In the above disassembly, it is apparent that the target is meant to be a stdcall function. If we point eax to a cdecl function, then after it returns, the stack is desynchronized: on its top, instead of the return address, there is the attacker-controlled argument. Therefore, the “ret” instruction will transfer execution to a location of the attacker’s choice.
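The desynchronization described above can be replayed on a toy stack model in C. This is a sketch under stated assumptions, not the code from the paper: the caller is assumed to use no frame pointer and to pass exactly one attacker-controlled stack argument, and every name and address here (Cpu, caller_model, LEGIT_RETURN, the 0xCA11CA11 marker) is hypothetical. The model also records whether a "save esp, call, compare esp" check of the kind discussed earlier would have noticed the mismatch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STACK_WORDS  16
#define LEGIT_RETURN 0x00401000u   /* hypothetical return address of the caller */

typedef enum { CONV_CDECL, CONV_STDCALL } Convention;

typedef struct {
    uint32_t mem[STACK_WORDS];
    int esp;                       /* index of the top word; grows toward 0 */
} Cpu;

typedef struct {
    uint32_t ret_target;           /* where the caller's final "ret" goes */
    bool esp_check_ok;             /* would "cmp ebx, esp" have passed? */
} Outcome;

static void push(Cpu *c, uint32_t v) { assert(c->esp > 0); c->mem[--c->esp] = v; }
static uint32_t pop(Cpu *c) { assert(c->esp < STACK_WORDS); return c->mem[c->esp++]; }

/* "call eax" followed by the callee's return: the return address is pushed
   and popped again; only a stdcall callee also removes its arguments. */
static void indirect_call(Cpu *c, Convention conv, int nargs)
{
    push(c, 0xCA11CA11u);
    pop(c);
    if (conv == CONV_STDCALL)
        c->esp += nargs;
}

/* A caller compiled for a stdcall target: push arg; call eax; ret. */
static Outcome caller_model(Convention actual, uint32_t attacker_arg)
{
    Cpu c = { .esp = STACK_WORDS };
    Outcome out;

    push(&c, LEGIT_RETURN);              /* caller's own saved return address */
    int saved_esp = c.esp;               /* models "mov ebx, esp" */
    push(&c, attacker_arg);              /* the single stack argument */
    indirect_call(&c, actual, 1);
    out.esp_check_ok = (c.esp == saved_esp);   /* models "cmp ebx, esp" */
    out.ret_target = pop(&c);            /* the caller's "ret" */
    return out;
}
```

With a stdcall target, caller_model returns LEGIT_RETURN and the esp comparison passes. With a cdecl target, the argument is still on top of the stack when the caller executes ret, so ret_target equals the attacker's value and the esp comparison fails, which is exactly the condition the extra check is designed to catch.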
On a recent Windows 10 Technical Preview build, the 32-bit versions of ieframe.dll, jscript9.dll and mshtml.dll include this extra check for stack pointer sanity. However, other DLLs do not have this check. Is there a suitable function in system libraries that we can transit through to achieve arbitrary EIP control?
I spent quite some time looking for a real-life example (in particular, I wrote a scanner that tried each location in all DLLs loaded by the 32-bit IE renderer) but I returned empty-handed. Admittedly, the requirements are strict: this function must not use a frame pointer, and it must call a controllable function.
I had more luck with a somewhat reversed approach: find a function that expects to call a cdecl function pointer, and feed it a stdcall function. The jackpot (on a recent Windows 10 Technical Preview build, in the syswow64 libraries) is: kernel32!Windows::Globalization::Calendars::YearMonthCalendar::AddEras. Its pseudocode is:
indirect_call reg1; // checked with CFG; we control reg1
push controlled_value1 // argument to the below function call
indirect_call reg2; // checked with CFG; we control reg2
Here, reg1 was meant to point to a cdecl function. The trick is to point reg1 to a stdcall function that removes a few words from the stack, so that after its return, esp points to jackpot’s saved return address. The “push controlled_value1” instruction then overwrites jackpot’s saved return address. reg2 should point to a cdecl function. Then, when returning, jackpot transfers control to a location chosen by the attacker, for instance a stack pivot gadget.
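The reversed trick can be replayed on the same kind of toy stack model. Everything about the frame layout below is an assumption made for illustration, since the post shows only pseudocode: the model gives jackpot a frame pointer, two locals and one stack argument for reg1, so a stdcall function that removes five words leaves esp just above the saved return address. The names Cpu, jackpot_model and LEGIT_RETURN and all constants are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS  32
#define LEGIT_RETURN 0x00401000u   /* hypothetical address in jackpot's caller */

typedef struct {
    uint32_t mem[STACK_WORDS];
    int esp;                       /* index of the top word; grows toward 0 */
    int ebp;                       /* frame pointer: index of the saved-ebp slot */
} Cpu;

static void push(Cpu *c, uint32_t v) { assert(c->esp > 0); c->mem[--c->esp] = v; }
static uint32_t pop(Cpu *c) { assert(c->esp < STACK_WORDS); return c->mem[c->esp++]; }

/* Indirect call plus the callee's return. `callee_pops` is 0 for a cdecl
   callee and the number of removed words for a stdcall callee. */
static void indirect_call(Cpu *c, int callee_pops)
{
    push(c, 0xCA11CA11u);          /* return address pushed by the call */
    pop(c);                        /* popped again by the callee's ret */
    c->esp += callee_pops;
    assert(c->esp <= STACK_WORDS);
}

/* Returns the address that jackpot's final "ret" transfers control to. */
static uint32_t jackpot_model(int reg1_pops, uint32_t controlled_value1)
{
    Cpu c = { .esp = STACK_WORDS - 4 };   /* a few words of the caller's frame */

    push(&c, LEGIT_RETURN);        /* jackpot's saved return address */
    push(&c, 0xEB9EB9EBu);         /* push ebp */
    c.ebp = c.esp;                 /* mov ebp, esp */
    push(&c, 0);                   /* local 1 (assumed) */
    push(&c, 0);                   /* local 2 (assumed) */
    push(&c, 0x11111111u);         /* stack argument for reg1 (assumed) */

    indirect_call(&c, reg1_pops);  /* indirect_call reg1 */
    push(&c, controlled_value1);   /* push controlled_value1 */
    indirect_call(&c, 0);          /* indirect_call reg2 (cdecl) */

    c.esp = c.ebp;                 /* mov esp, ebp */
    pop(&c);                       /* pop ebp */
    return pop(&c);                /* ret */
}
```

When reg1 behaves as the expected cdecl function (jackpot_model(0, v)), the push lands in jackpot's own frame and the function returns to LEGIT_RETURN. When reg1 is a stdcall function removing five words (jackpot_model(5, v)), the push lands on the saved return address slot and the final ret goes to v. The exact number of words that must be removed depends on the real frame layout, which this sketch does not attempt to reproduce.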
Therefore, this “stack desynchronization” technique is another generic (and real-life) method to bypass CFG, applicable to 32-bit processes only. On the 64-bit architecture, it is always the caller who removes the call arguments from the stack (note that the first four arguments are passed in registers, but additional arguments are passed on the stack), so there is no chance for the stack pointer to land in an unexpected location.