Understanding !PTE, Part 1: Let’s get physical
Hello. It’s Ryan Mangipano again (Ryanman). Today’s blog will be the first in a multi-part post designed to help you understand the output of the !PTE debugger command along with the basics of hardware Virtual Addressing. To better understand Virtual Addressing, we will use the debugger to manually translate a 4-KByte Page Table PAE virtual address into the actual physical address in order to understand what !PTE is displaying. I’ll provide relevant information about Virtual Addresses and Virtual Memory along the way.
We’ll start by translating a non-prototype valid hardware VA from an x86 PAE system
The actual process of manually decoding a virtual address is going to vary according to the architecture (x86, x64), the size of the page, whether or not the virtual address refers to a large page, whether the page is marked as valid, whether it’s a hardware or software PTE, and whether PAE is enabled. For simplicity, we will not be going over the table entry (PTE/PDE) flags until part two of the blog. For my first example, I am going to demonstrate how to use the information in the processor manuals together with the debugger to decode a valid non-prototype virtual address into the physical memory that it references. You can then try this on your own using windbg and the !pte command to validate your findings.
Finding an address to translate
To start, we'll need to locate a virtual address that maps to a valid PTE. I am going to use the virtual address f9a10054, which is the memory operand of the following instruction that I found in a memory dump.
f9a12d0c ff155400a1f9 call dword ptr [sfilter+0x4054 (f9a10054)]
Get out your processor manuals
The AMD and Intel manuals both contain helpful reference material on this subject. PDFs of these manuals are available online. Since my CPU is an Intel, I’m going to refer to the Intel manuals.
1: kd> !cpuinfo
CP F/M/S Manufacturer MHz PRCB Signature MSR 8B Signature Features
0 15,6,4 GenuineIntel 3192 0000000400000000 a0073fff
My Intel manuals arrived on CD via snail mail a few days after placing my free order:
https://www.intel.com/products/processor/manuals/order.htm
Is PAE in use?
On this Intel x86 system, we first need to determine if we are using PAE. In the Intel “System Programming section 3.6.1 Paging Options” I found that the PAE (Physical Address Extension) flag can be found in bit 5 of the CR4 register. The PG (Paging) flag in CR0 which enables paging must of course also be set. Register bit numbering starts at zero, so bit 5 is the sixth bit from the right. Let’s examine cr4 and convert the value it contains into binary. In the output, the sixth bit from the right is 1, so PAE is enabled on this system:
1: kd> r cr4
cr4=000006f9
1: kd> .formats 000006f9
Binary: 00000000 00000000 00000110 11111001
You can use the debugger to check for other flags in the same manner
You can use the above method to check for other flags that you find documented in the processor manuals. For example, you can see that bit 4 is also set in the cr4 register output above. This is the Page Size Extensions (PSE) bit which enables large page sizes.
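The same bit test can be done outside the debugger. Here is a minimal Python sketch using the cr4 value from the output above; the helper name bit_is_set is my own and does not come from the manuals or the debugger.

```python
CR4 = 0x000006F9  # value reported by "r cr4" above


def bit_is_set(value: int, bit: int) -> bool:
    """Return True if the given bit (numbered from zero) is set."""
    return (value >> bit) & 1 == 1


pse_enabled = bit_is_set(CR4, 4)  # Page Size Extensions
pae_enabled = bit_is_set(CR4, 5)  # Physical Address Extension
print(f"PSE={pse_enabled} PAE={pae_enabled}")
```

Shifting the value right by the bit number and masking with 1 avoids having to count positions in the binary string by eye.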
Terminology
Paging is simply a method of dividing up the linear address space into chunks. Pages are simply the name that we give to the chunks that result. The size of these sections is referred to as the Page Size. On x86 systems, the standard page size is 4-KBytes. A Large Page means that the page is larger than the standard size (2MB on PAE x86 or 4MB on non-PAE x86).
Keep the last three
On a standard 4-KByte page size virtual address, the address f9a10054 can be thought of as being split up as follows:
Information needed to locate the base of the page in physical memory f9a10
Offset into that physical page once it is found 054
This means that the last three hexadecimal digits of our physical address will also be 0x054. Once we find the base of our physical page (which will always end in three zeros), we can simply add it to the offset 0x054 and we will have our physical address.
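As a rough illustration of that split, the following Python sketch separates the address into the part used to locate the page and the offset within the page. The constant names are mine and are only there for readability.

```python
PAGE_SIZE = 4096             # standard 4-KByte page
OFFSET_MASK = PAGE_SIZE - 1  # 0xFFF, the low 12 bits

va = 0xF9A10054

page_locator = va >> 12         # upper 20 bits, used to find the page base
page_offset = va & OFFSET_MASK  # low 12 bits, carried over unchanged

print(f"{page_locator:05x} {page_offset:03x}")
```

Twelve bits is exactly three hexadecimal digits, which is why the split falls on a digit boundary for 4-KByte pages.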
Are there any commands that work with physical addresses?
We will also need a way to work with these physical addresses. Most of us are familiar with using the dd command with virtual addresses. There is also a command that accepts a physical address instead of a virtual address: !dd (notice the ! before the command). There are also variants of the !dd command, such as !db , !du , !dq.
What data do we need to obtain to perform this conversion?
We need to determine how to use the f9a10 portion of the virtual address to find the physical page base. The Memory Management System sets up and maintains a hierarchy of tables that keep track of the mappings between virtual addresses and physical addresses. You will need the following information to traverse the tables yourself and convert this address.
1. Starting point. This will be in the form of a pointer to the base of the first table that you need to check
2. The number of table levels in use on the system and the size of each entry in the tables on our system
3. The offsets into the different tables and the bases of each table.
Once you have all the above information, you can use the debugger to traverse these tables.
Let’s get our plan together for how we are going to obtain these three pieces of information
1. Starting point. This can be obtained from the PDBR. The CR3 register is known as the Page Directory Base Register (PDBR) which points to the physical address of the base of the first table.
2. The number of table levels in use on the system and the size of each entry in the tables on our system. This information can also be obtained from the Intel processor manuals. As previously mentioned, the number of tables, the size of the entries in the tables, the flags, where the bits are split up, and the names of the tables vary according to platform and if you are using PAE or not. All of these, however, use the same basic concepts:
Register pointing to a -> table with entries pointing to a table -> with more entries pointing to -> another table -> pointing to the physical address of the page in memory.
Table1: By reading the Intel Processor Manual “System Programming: Section 3.8, Physical Address Extension”, I was able to determine that the first table used for x86 PAE virtual addressing is called the Page-Directory-Pointer Table. It’s a table with only 4 Entries that are 64 bits (which is 8 bytes) wide. Each entry is referred to as a Page Directory Pointer Entry and abbreviated as PPE, PDP or PDPE depending on the source. These entries provide the index into the page directory, which is known as the Page Directory Index (PDI). One of these four pointers will lead you to the physical address of the base of the next table that you need to visit in the x86 PAE hierarchy. Just like the pointer that we used from the CR3 register, some of the bits in these table entries are not used as part of the index (referred to as a pointer in some documentation). We will grab the relevant bits and add the appropriate number of zeros to the index to obtain our physical address pointer. I will cover what these other bits are used for in part two of this blog. We must substitute zeros for these bits.
Table2: The table at the second level of tables in the x86 PAE hierarchy (which is referenced by the pointers in Table Level One) is called the Page Directory Table. Don’t confuse this with the Page-Directory-Pointer Table. Each Page Directory table can hold 512 entries which are 64 bits in size. The entries in this table are called Page Directory Entries (PDE) and they provide Page Table Index (PTI). Just like the last table, these entries contain indexes (which we convert to a pointer by simply adding zeros) to the base of the next table in the hierarchy.
Table3: The last table is referred to as the Page Table. Each entry in the page table is called a Page Table Entry (PTE) and provides the Page Offset. Just like the last table, not all the bits are used for pointers. Each page table contains up to 512 entries which are also 64 bits in size. Each 64-bit entry in the table contains a pointer to the base of the page in physical memory.
In summary there are 3 levels of tables when using x86 PAE. These are the Page Directory Pointer Table, Page Directory Tables, and the Page Tables.
3. The offsets into the different tables and the bases of each table.
Each table above is a listing of indexes that will be used to locate the base of the next table. However, once we arrive at each table, we will need to know the index or offset into the table in order to know which table to get to next. These offsets into the tables can be obtained from the virtual address itself. Let’s review our virtual address again. However, this time we will break the address down in binary:
Virtual Address: f9a10054
1: kd> .formats 0xf9a10054
Binary: 11111001 10100001 00000000 01010100
Page Directory Pointer Index (PDPI): 11 (index into the 1st table, the Page Directory Pointer Table)
Page Directory Index (PDI): 111001 101 (index into the 2nd table, the Page Directory Table)
Page Table Index (PTI): 00001 0000 (index into the 3rd table, the Page Table)
Byte Index: 0000 01010100 (0x054, the offset into the physical memory page)
So as you can see, the virtual address is nothing more than a bunch of indexes/offsets.
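The breakdown above can be sketched in Python with shifts and masks. This assumes the x86 PAE 4-KByte page layout described here (2 + 9 + 9 + 12 = 32 bits); the function name split_pae_va is hypothetical.

```python
def split_pae_va(va: int) -> tuple:
    """Split a 32-bit x86 PAE virtual address (4-KByte page) into its fields."""
    pdpi = (va >> 30) & 0x3      # bits 31-30: index into the Page Directory Pointer Table
    pdi = (va >> 21) & 0x1FF     # bits 29-21: index into the Page Directory Table
    pti = (va >> 12) & 0x1FF     # bits 20-12: index into the Page Table
    byte_index = va & 0xFFF      # bits 11-0: offset into the physical page
    return pdpi, pdi, pti, byte_index


pdpi, pdi, pti, byte_index = split_pae_va(0xF9A10054)
print(f"PDPI={pdpi:02b} PDI={pdi:09b} PTI={pti:09b} offset={byte_index:#05x}")
```

The field widths also explain the table sizes: a 2-bit index can select 4 entries, and a 9-bit index can select 512.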
Putting all this data together to find the physical address
Now that we have all the required data, let’s proceed to locate our physical address
1. Obtain our base pointer to the first table. As we discussed earlier, we can obtain this value from the cr3 register. Bits 0-4 of this register are not used for the pointer to the table base. This means that in order to get the base pointer, we will need to replace these 5 least-significant bits with zero. This will result in the table base being located on a physical address that is always aligned to a 32-byte boundary.
Keep in mind that CR3 will have a different value here for each process. You must make sure that you are in the appropriate process context before proceeding. This is because user mode tables are specific to a particular process. Notice that I said processes, not threads. CR3 will not be changed when swapping threads, since each thread in a given process shares the same address space. Tables relating the system address space (kernel mode) are shared between all processes.
We’ll need to dump out the CR3 register (PDBR) in a format where we can view the last 5 bits. As you can see in the .formats output below, the 5 least significant bits are already set to zero. This means that the hexadecimal value located in the cr3 register is a pointer to the base of the first table. So we now have our starting point. Physical address 023406e0 is the base of the first table.
1: kd> .formats @cr3
Hex: 023406e0
Binary: 00000010 00110100 00000110 11100000
The proper way to extract the pointer from the value is to AND it against the following mask:
1: kd> ? 0y11111111111111111111111111100000
Evaluate expression: -32 = ffffffe0
1: kd> .formats (@cr3 & ffffffe0)
Binary: 00000010 00110100 00000110 11100000
You can use the !process command to get and/or verify this value.
1: kd> !process
PROCESS ff981a58 SessionId: 0 Cid: 0d54 Peb: 7ffde000 ParentCid: 0550
DirBase: 023406e0 ObjectTable: e1541510 HandleCount: 30.
You can also obtain this information from the EPROCESS structure.
1: kd> dt nt!_EPROCESS ff981a58 Pcb.DirectoryTableBase
+0x000 Pcb :
+0x018 DirectoryTableBase : [2] 0x23406e0
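The masking in step 1 can be reproduced with a few lines of Python. This is only a sketch using the cr3 value from this dump; the variable names are mine.

```python
CR3 = 0x023406E0  # DirBase / cr3 value from the output above

# Same mask as the 0y... binary literal used in the debugger: 27 ones, 5 zeros.
PDPT_BASE_MASK = int("1" * 27 + "0" * 5, 2)

pdpt_base = CR3 & PDPT_BASE_MASK  # clear bits 0-4

print(f"mask={PDPT_BASE_MASK:08x} base={pdpt_base:08x}")
print("32-byte aligned:", pdpt_base % 32 == 0)
```

Clearing the five low bits is what guarantees that the table base is a multiple of 32 bytes, which is exactly the size of four 8-byte entries.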
2. Obtain our base to the second table by finding our index into the first table. The first table (Page Directory Pointer Table) only has 4 entries and each entry is 64 bits wide. We know from the first two bits of the Virtual Address above that our offset into this table is 0y11 (The y tells the debugger the value is binary, instead of eleven). Eleven would be represented in the debugger as 0n11. We can simply multiply the offset (0y11) by the size of each entry and add the result to the base of the table to get our entry. The entries in the table are 8 bytes wide. We shall use sizeof() as shown below to obtain this value. We can pass this math to the !dq command to dump the data at these physical addresses.
1: kd> !dq (@cr3 & 0xffffffe0)+(0y11*@@(sizeof(nt!_MMPTE))) L1
# 23406f8 00000000`05503801
The processor manuals indicate that the low 12 bits of the entry above are not part of the pointer and must be discarded. This will cause alignment on 4KB boundaries. Since each hex digit above represents 4 bits, this means that we need to change 801 to 000. That gives us the physical address of the base of our second table, 05503000. We will accomplish this in the next step by ANDing our Page Directory Pointer Entry against a mask.
3. Obtain our base to the third table by finding our index into the second table. The Second Table (Page Directory Table) works the same way, except that this table contains up to 512 entries (on PAE systems). Keep in mind that there can be more than one Page Directory Table for each process on a PAE system; however, we are only concerned with the one that contains the data relating to our virtual address. We know from the virtual address that the offset into this table is 0y111001101. Calculate the address in the same manner as before. We will also need to set the low 12 bits to zero, just like before.
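The arithmetic in step 2 can be written out as follows. This is a sketch, not debugger output: ENTRY_SIZE stands in for sizeof(nt!_MMPTE), which is 8 bytes on this PAE system, and the entry value is the one !dq returned above.

```python
ENTRY_SIZE = 8           # bytes per table entry (64 bits) on x86 PAE
FRAME_MASK = 0xFFFFFF000 # same mask the debugger commands use to drop the low 12 bits

pdpt_base = 0x023406E0   # cr3 with bits 0-4 cleared
pdpi = 0b11              # binary 11 is three, not eleven

pdpe_address = pdpt_base + pdpi * ENTRY_SIZE
pdpe = 0x0000000005503801  # 64-bit value read from that physical address

page_directory_base = pdpe & FRAME_MASK

print(f"{pdpe_address:x} {page_directory_base:08x}")
```

The first number matches the physical address that !dq displayed, and the second is the base of the Page Directory Table used in the next step.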
1: kd> !dq (0x05503801 & 0xFFFFFF000)+(0y111001101*@@(sizeof(nt!_MMPTE))) L1
# 5503e68 00000000`0102d963
The last three hexadecimal digits must be changed to zero, since they are not part of the pointer. This gives us the base address of our last table, the page table, 0x0102d000
4. Find the base of the physical page our memory resides in by finding our index into the third table. Using the base of the page table from the previous step, let’s add the index into this table that we obtained from the virtual address, 0y000010000. The last three digits need to be set to zero.
1: kd> !dq (0x102d963 & 0xFFFFFF000)+(0y000010000*@@(sizeof(nt!_MMPTE))) L1
# 102d080 00000000`02010121
5. So now we have the physical address of the base of the page, 2010000. Add the base of the page to our offset from the virtual address, 0x054.
1: kd> ? (2010121 & 0xFFFFFF000) +0x054
Evaluate expression: 33620052 = 02010054
A shortcut is to simply change the 0x121 to 0x054 in the previous step.
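To see why the shortcut gives the same answer as the addition, here is a small Python check using the PTE value from step 4. Because the page base always has its low 12 bits clear, adding the offset and substituting the offset into those bits are equivalent.

```python
pte = 0x0000000002010121  # Page Table Entry read in step 4
byte_index = 0x054        # low 12 bits of the virtual address

# Method 1: mask off the low 12 bits, then add the offset (as in the debugger).
by_addition = (pte & 0xFFFFFF000) + byte_index

# Method 2: the shortcut, replacing the low three hex digits with the offset.
by_substitution = (pte & ~0xFFF) | byte_index

print(f"{by_addition:08x} {by_substitution:08x} {by_addition}")
```

Note that the shortcut only works this simply when the upper bits of the entry are zero, as they are here; in general the mask also removes any flag bits stored above the physical address.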
6. Now, let’s dump out the data at our physical address and the virtual address
1: kd> !dd (2010121 & 0xFFFFFF000) +0x054 L2
# 2010054 804ef09c 804ef12c
1: kd> dd 0xf9a10054 L2
f9a10054 804ef09c 804ef12c
You can see that the data displayed by the two commands is the same.
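Steps 1 through 5 can be combined into one short routine. The Python sketch below is an illustration only: it replaces physical memory with a dictionary holding just the three entries read with !dq above, the names translate and read_qword are hypothetical, and it assumes every entry is valid and maps a 4-KByte page (no valid-bit or large-page checks).

```python
ENTRY_SIZE = 8             # each PAE table entry is 64 bits
FRAME_MASK = 0xFFFFFF000   # drops the low 12 bits of an entry
CR3_MASK = 0xFFFFFFE0      # drops bits 0-4 of cr3

# Stand-in for physical memory: only the entries seen in this walkthrough.
PHYSICAL_QWORDS = {
    0x023406F8: 0x0000000005503801,  # Page Directory Pointer Entry
    0x05503E68: 0x000000000102D963,  # Page Directory Entry
    0x0102D080: 0x0000000002010121,  # Page Table Entry
}


def read_qword(physical_address: int) -> int:
    """Return the 64-bit value at a physical address (like !dq ... L1)."""
    return PHYSICAL_QWORDS[physical_address]


def translate(va: int, cr3: int) -> int:
    """Walk the three PAE table levels and return the physical address."""
    table_base = cr3 & CR3_MASK
    # (shift, mask) pairs select the PDPI, PDI and PTI fields in turn.
    for shift, mask in ((30, 0x3), (21, 0x1FF), (12, 0x1FF)):
        index = (va >> shift) & mask
        entry = read_qword(table_base + index * ENTRY_SIZE)
        table_base = entry & FRAME_MASK
    return table_base + (va & 0xFFF)


physical = translate(0xF9A10054, 0x023406E0)
print(f"{physical:08x}")
```

Each pass through the loop does the same thing: pick an index out of the virtual address, read the entry at that index, and keep only the address bits of the entry as the base for the next level.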
7. Now let’s dump out the virtual address using the !PTE extension. Notice the values that it provides you with. Look above and compare the values displayed above to what is displayed below. You should now understand what the highlighted and bolded fields mean.
1: kd> !pte 0xf9a10054
VA f9a10054
PDE at 00000000C0603E68 PTE at 00000000C07CD080
contains 000000000102D963 contains 0000000002010121
pfn 102d -G-DA--KWEV pfn 2010 -G--A--KREV
The addresses shown after “PDE at” and “PTE at” are the virtual addresses of the PDE (Page Directory Entry) and the PTE (Page Table Entry). Also, please note that PFN represents Page Frame Number. PFN is the term used to describe what I referred to as “pointers to the base of the next table” in the hierarchy. This is because it really isn’t a pointer; it’s an index into the table. Appending three hexadecimal zeros to the PFN (multiplying it by the 4-KByte page size) gives the physical base address that we calculated by hand.
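The values in the !pte output can be reproduced with simple arithmetic. In the Python sketch below, the PFN calculation follows directly from the steps above. The two base constants are an assumption on my part: they are the commonly cited bases of the page table and page directory mappings on 32-bit PAE Windows, they are not stated in this post, and they are used here only because they match the addresses displayed above.

```python
ENTRY_SIZE = 8
FRAME_MASK = 0xFFFFFF000

# ASSUMPTION: virtual bases of the PTE and PDE regions on 32-bit PAE Windows.
PTE_BASE = 0xC0000000
PDE_BASE = 0xC0600000

va = 0xF9A10054
pde = 0x000000000102D963  # "contains" value shown for the PDE
pte = 0x0000000002010121  # "contains" value shown for the PTE

# Virtual addresses at which the entries themselves are mapped.
pde_va = PDE_BASE + (va >> 21) * ENTRY_SIZE
pte_va = PTE_BASE + (va >> 12) * ENTRY_SIZE

# Page frame numbers: the address bits of each entry divided by the page size.
pde_pfn = (pde & FRAME_MASK) >> 12
pte_pfn = (pte & FRAME_MASK) >> 12

print(f"PDE at {pde_va:016X} PTE at {pte_va:016X}")
print(f"pfn {pde_pfn:x} pfn {pte_pfn:x}")
```

The physical addresses used in the manual walk (5503e68 and 102d080) and the virtual addresses shown by !pte refer to the same two entries; the debugger simply reports where they are mapped in the virtual address space.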
Hopefully, the output of !PTE makes a lot of sense to you now. In part two of this blog, I’ll discuss what the PDE/PTE flags (-G-DA--KWEV) represent and provide an example of manual conversion of x86 PAE Large Page Virtual Addresses to Physical.
Comments
Anonymous
March 11, 2010
very good info. I hope it is time for next installment. --rc
Anonymous
May 07, 2012
We actually use very similar logic in our product to find these entries. As we port our driver to Win8 there seems to be a change that will not let us access PDE and PTE memory though. Are there changes we need to be aware of? [Hi David. The PDE and PTE structures have obviously not changed, as they must match the hardware architecture. Regarding access to these structures, I'm not aware of any announced changes that would affect this. You may need to debug the failure to see what is preventing access.]
Anonymous
March 02, 2015
Hi, Is it possible (under memory duress) that the PDB that is stored on CR3 for a process, might change between the swaps, or is it guaranteed that the CR3 value for a process will not change during its life cycle? Thanks, [A process's DirBase is not expected to change.]