04 Pagetables
Spring 2022
Lab 4: Page Table Management, due next Monday before lab.
Discussion. This week we will look carefully at the mechanisms involved in mapping virtual addresses
to locations in physical memory. This mapping is stored as a partial function, encoded in the entries of the page
table. The structure of the page table is a highly constrained tree of blocks. Each leaf in the tree contains a number
of page table entries that translate the addresses associated with a single block in virtual memory.
It is important that this structure be very carefully maintained because the processor depends on being able to
quickly interpret its entries. If we deviate from the format it is very likely that the processor will generate page faults
which, in xv6, are handled by simply killing off the offending process. So: let’s avoid that.
We’ll extend xv6 in three simple ways: we’ll make a simple optimization to how system calls are implemented, we’ll
work on a means for printing out page table structures, and we’ll build in some very simple support for detecting
which pages of a process have been accessed. Necessarily, these are the very first steps in supporting more advanced
optimizations.
The Assignment. We’ll experiment with the mapping of virtual memory by implementing three different kernel
extensions. The basis for this lab is found in the lab4 repository (use your user id rather than 22xyz):
While this repository does not include any of the improvements that you’ve implemented in the last lab, we’ll assume
the techniques you have developed there will help you with implementations this week. If you have questions about
prior work, make sure we talk those through before you attempt these tasks.
Assuming you’re up to speed on the readings (through Chapter 3), here is a workflow that will get you through this
week’s tasks:
The first line of the output displays the pointer passed to vmprint. Each line after that describes a single
valid page table entry. The root block of the tree, remember, is level 2, whose valid entries refer to level 1
blocks whose entries refer to level 0 blocks of “leaf” table entries that describe actual translations. Each
entry—indented in a way to suggest the tree structure—is described by the decimal level and entry numbers,
a hexadecimal representation of the page table entry, a hexadecimal representation of the physical address
derived from the entry, and the metadata that describes the access control associated with that page. Thus,
in the example above, the root node has two valid entries—entry 0 and entry 255. Entry 0 has, itself, a single
entry 0 at level 1 and three leaf nodes representing translations associated with virtual pages 0, 1, and 2.
When you are finished, the page table for the first process is printed just after the machine boots. The physical
addresses may be different, but the structure of the page table and page metadata will be the same.
3. Our final modification to the system will allow us to identify which pages have been accessed on the system. Some
memory management tasks can benefit from information about recently accessed pages. In this part of the lab,
you will add a new feature to xv6 that inspects the access bits in the RISC-V page table. The RISC-V hardware
page table walker marks these bits in the leaf PTE whenever it resolves a TLB miss.
Your job is to implement pgaccess(), a system call that reports which pages have been recently accessed. The
system call takes three arguments. First, it takes the starting virtual address of the first user page to check.
Second, it takes the number of pages to check. Finally, it takes a pointer to a buffer to store the results into.
This buffer is a bitmask, a data structure that uses one bit per page, where the first page corresponds to the
least significant bit.
Here is an approach to implementing pgaccess():
(a) We have started the process of declaring the pgaccess() system call. We have reserved a call number
in kernel/syscall.h and developed linkage in kernel/syscall.c, and made the appropriate userland
declarations.
(b) Start by implementing sys_pgaccess() in kernel/sysproc.c. Remember, this routine will need to use
argint() and argaddr() to read its arguments. Develop your bitmask in a temporary buffer. Then call
copyout() to transfer the results back to the user.
(c) Your experience with mex will likely help.
(d) You must support the scanning of 32 pages, but you can set an upper bound to any reasonable number,
if that is helpful.
(e) The walk() routine in kernel/vm.c can help with finding the appropriate page table entries.
(f) We have added the definition for PTE_A, the page table entry bit in the metadata that is set whenever the
hardware page table walker encounters the page table entry.
(g) As you collect the page access bits, you should clear the bits so that you can detect later accesses. If you
don’t do this, the PTE_A bit, once set, stays set, and every later call will report the page as accessed.
(h) You may find that your vmprint() routine will be helpful in debugging sys_pgaccess().
4. When you have implemented your page table extensions, you will find the user/pgtabletest program will
be helpful in performing basic tests of your extensions. Make sure you add, commit, and push your changes
for review and grading.
Thought Questions. Please think about the following questions in preparation for meeting in small groups: