05 Fullvirt
Computing
Figure: hosted VMM architecture
• Guest: guest application, guest OS (ring 1), guest VM physical memory
• VMM (ring 0): guest OS traps here
• Host: userspace process, VMM kernel driver (host OS)
1. ioctl call to run VM
2. World switch to VMM context
4. Privileged actions trap to VMM
5. VMM switches back to host on interrupts, I/O requests etc.
6. VMM kernel driver or userspace process handles exits
(Some traps are handled by VMM without a world switch)
Host and VMM contexts
• Each context has separate page tables, CPU registers, IDTs and so on
• VMM context: VMM occupies top 4MB of address space
• Memory page containing code/data of world switch code mapped in both contexts
• Host/VMM context saved/restored in this special “cross” page by VMM
Figure: host context (host user processes, host OS, VMM kernel driver) and VMM context (guest user processes, guest OS, VMM), connected by a world switch. The memory page of world switch code/data/context is mapped by both page tables.
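The cross-page idea above can be sketched as a toy simulation. This is only an illustrative model, not real VMM code: the names `Context`, `CrossPage`, `Cpu` and `world_switch` are hypothetical, and the "page tables" and "IDTs" are plain strings standing in for hardware state.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """State that differs between the host context and the VMM context."""
    name: str
    page_table: str
    idt: str
    registers: dict = field(default_factory=dict)


@dataclass
class CrossPage:
    """Stands in for the page mapped in both contexts.

    It holds the saved state of whichever context is not running.
    """
    saved: dict = field(default_factory=dict)


class Cpu:
    def __init__(self, initial: Context):
        self.current = initial

    def world_switch(self, cross: CrossPage, target: str) -> None:
        # Save the outgoing context into the cross page...
        cross.saved[self.current.name] = self.current
        # ...then restore the incoming context from the same page.
        self.current = cross.saved.pop(target)


host = Context("host", page_table="host_pt", idt="host_idt", registers={"pc": 0x1000})
vmm = Context("vmm", page_table="vmm_pt", idt="vmm_idt", registers={"pc": 0x2000})

cross = CrossPage(saved={"vmm": vmm})
cpu = Cpu(host)

cpu.world_switch(cross, "vmm")    # host -> VMM, to run the guest
in_vmm = cpu.current.page_table
cpu.world_switch(cross, "host")   # VMM -> host, e.g. on an interrupt
in_host = cpu.current.page_table
```

The point of the sketch is that both switches go through the same shared object, which is why the real cross page must be mapped in both sets of page tables.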
Understand difference with QEMU/KVM
• Where is context saved?
• Common cross page mapped into both host and guest address spaces
• KVM: Common memory (VMCS) accessible by CPU in both contexts via special
instructions
• Privilege level of guest OS?
• Guest OS runs in ring 1 (lower privilege). Instructions that do not run correctly at
lower privilege level are suitably translated to trap to VMM
• KVM: Guest OS runs in ring 0 of VMX non-root mode. Some privileged instructions trigger a VM exit to KVM
• How to trap to VMM?
• VMM is located in top 4MB of guest address space; guest OS traps to VMM for
privileged ops. World switch to host if VMM cannot handle trap in guest context
• KVM: VMM is not in guest context, guest traps to VMM in host via VM exit
Binary translation
• Guest OS binary is translated instruction-by-instruction and stored in translation cache (TC)
• Part of VMM memory
• Most code stays same, unmodified
• OS code modified to work correctly in ring 1
• Sensitive but unprivileged instructions modified to trap
• Guest OS code executes from TC in ring 1
• Privileged OS code traps to VMM
• E.g., I/O, set IDT, set CR3, other privileged ops
• Emulated in VMM context or by switching to host
• VMM sets sensitive data structures like IDT etc. (maintains shadow copies)
Figure: guest user processes (ring 3); guest OS and translation cache (TC) (ring 1); VMM (ring 0).
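A minimal sketch of the translation step, assuming guest code is just a list of mnemonic strings. The instruction sets below are small illustrative examples (`popf` is a commonly cited sensitive-but-unprivileged x86 instruction), not a complete classification, and `translate` and `run_from_tc` are hypothetical names.

```python
# Illustrative instruction classes, not a full x86 list.
PRIVILEGED = {"lidt", "mov_cr3", "out"}   # fault on their own when run in ring 1
SENSITIVE_UNPRIVILEGED = {"popf"}         # do not fault in ring 1, but misbehave


def translate(guest_code):
    """Build the translation-cache version of a list of guest OS instructions."""
    tc = []
    for insn in guest_code:
        if insn in SENSITIVE_UNPRIVILEGED:
            tc.append("trap:" + insn)   # rewritten into an explicit trap
        else:
            tc.append(insn)             # most code is copied unchanged
    return tc


def run_from_tc(tc):
    """Model ring-1 execution: return (ran natively, handled by the VMM)."""
    native, trapped = [], []
    for insn in tc:
        if insn.startswith("trap:"):
            trapped.append(insn[len("trap:"):])   # trap inserted by the translator
        elif insn in PRIVILEGED:
            trapped.append(insn)                  # traps because ring 1 lacks privilege
        else:
            native.append(insn)
    return native, trapped


guest_os_code = ["mov", "add", "popf", "lidt", "mov_cr3", "ret"]
tc = translate(guest_os_code)
native, trapped = run_from_tc(tc)
```

Only `popf` needs rewriting here; `lidt` and `mov_cr3` reach the VMM without any change to the code because they are privileged.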
Dynamic binary translation
• VMM translator logic (ring 0) translates guest code one basic block at a time to produce a compiled code fragment (CCF)
• Basic block = sequence of instructions until a jump/return
• Once CCF is created, move to ring 1 to run translated guest code
• Once CCF ends, “call out” to VMM logic, compute next instruction to jump to, translate, run CCF, and so on
Figure: basic blocks of guest user and guest OS code, each translated into a CCF.
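The translate/run/call-out loop can be sketched on a toy instruction set. Everything here is an assumption made for illustration: the tuple-based instructions (`add`, `dec`, `jnz`, `jmp`, `ret`), the function names, and the choice to key the translation cache by the block's start address. A real translator emits machine code rather than Python lists.

```python
def translate_block(code, pc):
    """Translate one basic block starting at pc into a CCF.

    A block ends at the first jump or return. The CCF is returned as
    (instructions, fall-through address).
    """
    insns = []
    while True:
        insn = code[pc]
        insns.append(insn)
        pc += 1
        if insn[0] in ("jmp", "jnz", "ret"):
            return insns, pc


def run_ccf(ccf, state):
    """Run one CCF and return the next guest address, or None on return."""
    insns, fallthrough = ccf
    for insn in insns:
        op = insn[0]
        if op == "add":
            state["acc"] += insn[1]
        elif op == "dec":
            state["counter"] -= 1
        elif op == "jmp":
            return insn[1]
        elif op == "jnz":
            return insn[1] if state["counter"] != 0 else fallthrough
        elif op == "ret":
            return None
    return fallthrough


def run(code, state, entry=0):
    """Translator loop: look up or translate a block, run it, repeat."""
    tc = {}  # translation cache: block start address -> CCF
    stats = {"translations": 0, "ccf_runs": 0}
    pc = entry
    while pc is not None:
        if pc not in tc:                     # translate only on first visit
            tc[pc] = translate_block(code, pc)
            stats["translations"] += 1
        pc = run_ccf(tc[pc], state)          # CCF ends by "calling out" with next pc
        stats["ccf_runs"] += 1
    return stats


program = [("add", 5), ("dec",), ("jnz", 0), ("add", 100), ("ret",)]
state = {"acc": 0, "counter": 3}
stats = run(program, state)
```

In this run the loop body at address 0 executes three times but is translated once, which is the benefit of caching CCFs.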