TSX improves timing attacks against KASLR
… and similarly a long blog is a nuisance, so I managed to squeeze the essence of it into a single sentence, the title. If it is not entirely clear, read on.
A typical privilege escalation exploit based on a kernel vulnerabilit yworks by corrupting the kernel memory in a way advantageous for an attacker. Two scenarios are possible:
- arbitrary code execution in kernel mode
- data-only; just alter kernel memory so that privileges are elevated (e.g. change the access token of the current process)
Usually, the first method is most straightforward, and most flexible. With SMEP enabled, attacker cannot just divert kernel execution into code stored in usermode pages; more work is needed, a generic method being ROP in kernel body.
Usually, both above methods require some knowledge about kernel memory layout. Particularly, in order to build a ROP chain in kernel body, we need to know the base of the kernel or a driver. In windows 8.1 significant changes were introduced to not give to the attacker the memory layout for free. They affect only processes running with integrity level lower than medium, but it is the most interesting case, as OS-based sandboxes encapsulate untrusted code in such a process. The effectiveness of Windows 8.1 KALSR depends on the type of vulnerability primitive available to the attacker:
- If one can read and write arbitrary address with arbitrary contents multiple times, there is no problem, as one can learn the full memory layout.
- If one can overwrite arbitrary address with arbitrary content at least twice, then a generic method is to overwrite the IDT entry for interrupt Y with usermode address X (note IDT base can be obtained via unprivileged sidt instruction), and then change the type of page holding X so that it becomes a superuser page (possible because page table entries are at known location). Finally, trigger code execution with int Y instruction. I assume it is message of this post, although they do not discuss how to locate a kernel code pointer (meaning, beat KASLR) that subsequently should be overwritten. In some cases we do not need to know the address of the kernel code pointer, e.g. if it lives in an area that we can overflow or alter under use-after-free condition.
- If one can just control a kernel function pointer, then… We can divert execution neither to usermode (because of SMEP) nor to kernelmode (because of KASLR we do not know addresses of any ROP gadget). Any hints?
Recently I played with a vulnerability from the third category, and (at least for me) KASLR provided significant resistance. It can be argued that there is potential for kernel bugs that leak some pointers and thus allow to bypass KASLR, but something more generic would be better.
Timing attacks against KASLR
An excellent paper describes methods to bypass KASLR via timing attacks. One of the discussed methods is: in usermode, access kernel address X, and measure the time elapsed until the usermode exception handler is invoked. The point is that even though usermode access to X throws a page fault regardless whether X is in mapped kernel memory or not, the timings are different. When we recover the list of mapped pages, we can infer the kernel base (or some driver base) and consequently the addresses of useful code snippets in it – all we need to build a ROP chain.
Timing attacks need to take care of the inherent noise. Particularly, on Windows invoking the usermode exception handler requires a lot of CPU instructions, and the difference in timing can be difficult to observe. It would be much better if the probing did not result in the whole kernel page fault handler executing. Any hints?
TSX to the rescue
Haswell Intel CPUs introduced the “transactional synchronization extensions”. It means than only recent CPUs support them; moreover, Intel recommends disabling them via microcode update, as they are apparently not reliable. Yet we may assume that someday they will be fixed and become widespread.
TSX makes kernel address probing much faster and less noisy. If an instruction executed within XBEGIN/XEND block (in usermode) tries to access kernel memory, then no page fault is raised – instead transaction abort happens, so execution never leaves usermode. On my i7-4800MQ CPU, the relevant timings, in CPU cycles, are (minimal/average/variance, 2000 probes, top half of results discarded):
- access in TSX block to mapped kernel memory: 172 175 2
- access in TSX block to unmapped kernel memory: 200 200 0
- access in __try block to mapped kernel memory: 2172 2187 35
- access in __try block to unmapped kernel memory: 2192 2213 57
The difference is visible with naked eye; an attack using TSX is much simpler and faster.
Two points as a take-away:
- KASLR can be bypassed in an OS-independed way; this post describes how the existing techniques can be improved utilizing TSX instructions.
- The attack is possible because of shared address space between kernel and usermode. Hypervisor has a separate address space, and therefore it is not prone to similar attacks.