-
Notifications
You must be signed in to change notification settings - Fork 879
Conversation
07c0d78
to
d938351
Compare
Thanks a lot for implementing this massive feature and writing the nice cover page! It seems you've got single stepping working, which is great. Sorry I didn't have time to try or review this PR today due to other tasks, but I'll try to give you some feedback tomorrow. Just two quick questions:
|
It works, as long as you don't enable the By disabling the
On the HAXM-side they are supported, since this feature just overrides guest drN registers. |
@@ -1132,17 +1132,35 @@ void vcpu_save_host_state(struct vcpu_t *vcpu) | |||
save_host_msr(vcpu); | |||
|
|||
hstate->hcr2 = get_cr2(); | |||
hstate->dr0 = get_dr0(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the time, the vCPU is not in debugging mode (!(vcpu->debug_control & HAX_DEBUG_ENABLE)
), so we don't need to save/load debug registers. At least, we can ignore them if no HW BP is enabled. And if we want to be more clever, we can save DR{i} (i = 0, 1, 2, 3) only if HW BP i is enabled. I think this could be an important optimization, because this function is run before every VM entry, so even a few extra instructions could end up having a non-negligible performance penalty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the time, the vCPU is not in debugging mode [...], so we don't need to save/load debug registers
I'm curious about this: What happens after a VM-exit event if the guest has modified dr{0,1,2,3,6}? If "DR exiting" is disabled, this would have side-effects on the host, right?
So saving/loading drN registers should occur if any of these events occur:
- "DR exiting" is disabled.
- "HAX_DEBUG_USE_HW_BP" is enabled.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's indeed a very good point. Should we always enable "DR exiting" though, because:
- If the debugger runs on the host, we can prevent the guest from tampering with DRs.
- If the debugger runs on the guest, we can keep track of the "dirty" DRs we should save/restore. This is more efficient than having "DR exiting" disabled and saving/restore all the DRs at every guest/host context switch.
If you agree with this approach, could you implement it in this PR (maybe as a separate commit)?
@@ -1346,10 +1362,39 @@ void vcpu_load_host_state(struct vcpu_t *vcpu) | |||
|
|||
load_host_msr(vcpu); | |||
set_cr2(hstate->hcr2); | |||
set_dr0(hstate->dr0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make these operations conditional, so as to minimize the performance penalty.
struct vcpu_state_t *state = vcpu->state; | ||
|
||
save_guest_msr(vcpu); | ||
state->_dr0 = get_dr0(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make these operations conditional, so as to minimize the performance penalty.
struct vcpu_state_t *state = vcpu->state; | ||
|
||
load_guest_msr(vcpu); | ||
set_dr0(state->_dr0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make these operations conditional, so as to minimize the performance penalty.
core/vcpu.c
Outdated
@@ -2195,6 +2240,22 @@ static int exit_exc_nmi(struct vcpu_t *vcpu, struct hax_tunnel *htun) | |||
dump_vmcs(vcpu); | |||
break; | |||
} | |||
case VECTOR_DB: { | |||
hax_warning("#DB!\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Warning and error messages are visible to the end user, so let's use hax_info
here instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove this line. It helped me on early stages, but has no use anymore.
core/vcpu.c
Outdated
vcpu->debug_control = 0; | ||
vmwrite(vcpu, GUEST_DR7, 0); | ||
vmwrite(vcpu, GUEST_RFLAGS, | ||
vmread(vcpu, GUEST_RFLAGS) & ~(1 << 8) /* TF */); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
uint64_t rip; | ||
uint64_t dr6; | ||
uint64_t dr7; | ||
} debug; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I wonder if we should add a reserved field. Are there any debugging features we could support in the future that might require passing additional data to QEMU? Problem is we have reached the size of the union
(24 bytes), but we could also use a second tunnel (vcpu->io_buf
) like the fast MMIO interface does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any debugging features we could support in the future that might require passing additional data to QEMU?
I don't think so. On the SDM I've seen other nice features that could benefit debuggers (e.g. tracing), but they can have their own ioctl since they are too wildly different from breakpoints/stepping and considerably more complex.
So I don't think we will ever need more than these 24 bytes.
include/hax_interface.h
Outdated
struct hax_debug_t { | ||
uint32_t control; | ||
uint64_t dr[8]; | ||
} PACKED; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PACKED
is ignored for Windows builds, so I suspect there's an implicit uint32_t
inserted after control
. How about we manually declare it as uint32_t reserved
? It'll also give us room for extension.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we manually declare it as
uint32_t reserved
?
Agreed.
windows/hax_entry.h
Outdated
@@ -163,4 +163,8 @@ extern PDRIVER_OBJECT HaxDriverObject; | |||
#define HAX_VM_IOCTL_NOTIFY_QEMU_VERSION \ | |||
CTL_CODE(HAX_DEVICE_TYPE, 0x910, METHOD_BUFFERED, FILE_ANY_ACCESS) | |||
|
|||
/* API version 5.0 */ | |||
#define HAX_IOCTL_VCPU_DEBUG \ | |||
CTL_CODE(HAX_DEVICE_TYPE, 0x912, METHOD_BUFFERED, FILE_ANY_ACCESS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I skipped 0x912
intentionally when I added the ADD_RAMBLOCK
ioctl, because 0x910
appears twice... So I'd use 0x916
for this one, but that's not a big deal.
windows/hax_entry.h
Outdated
@@ -163,4 +163,8 @@ extern PDRIVER_OBJECT HaxDriverObject; | |||
#define HAX_VM_IOCTL_NOTIFY_QEMU_VERSION \ | |||
CTL_CODE(HAX_DEVICE_TYPE, 0x910, METHOD_BUFFERED, FILE_ANY_ACCESS) | |||
|
|||
/* API version 5.0 */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The HAXM API is indeed versioned, and the current version is 4 (HAX_CUR_VERSION
). When we update the API, we could bump the API version, but we could also add a "capability flag" (see API.md
for details). I'd actually prefer the latter approach, since we're only adding one ioctl. As an example, please take a look at commit 43ff9f0.
Not at all! Sorry I'm yet to try this PR. It looks like I'll have to delay that for a couple more days... I've done a rough review of the code, and here are my comments so far, but I may revisit a few places later on more carefully :-) |
I've updated the patch. :-) EDIT: As always, I've added them in an extra commit for easier reviewing, but I will squash everything into a single self-contained commit before merging. |
Sorry for the delay in the review. I actually need your help with testing this PR... I had no problem building HAXM or QEMU with your changes, but I can't even get breakpoints working with KVM or TCG, so I haven't seen the "correct" behavior yet :-( Well, a few years back I used to debug the Android Emulator kernel on Ubuntu using QEMU GDB, but I haven't been lucky today. Basically I'm trying to debug the Linux kernel, and I can set breakpoints using a symbol name (e.g. Is it possible for me to reproduce your setup for testing this PR? |
Hi @raphaelning, I will be traveling until August 22th (due DEFCON + other matters), so it's difficult to setup a proper environment to work on HAXM at the moment. Last week I encountered a weird bug on my guest kernel where hardware/software breakpoints were ignored. I fixed it by forcing by forcing #DB/#BP exits in the - exc_bitmap = (1u << VECTOR_MC) | (1u << VECTOR_NM);
+ exc_bitmap = (1u << VECTOR_MC) | (1u << VECTOR_NM) | (1u << VECTOR_DB) | (1u << VECTOR_BP);
+#if 0
if (vcpu->debug_control & (HAX_DEBUG_USE_HW_BP | HAX_DEBUG_STEP)) {
exc_bitmap |= (1u << VECTOR_DB);
}
if (vcpu->debug_control & HAX_DEBUG_USE_SW_BP) {
exc_bitmap |= (1u << VECTOR_BP);
}
+#endif Try this patch if you can, and let me know if it solves your issues with Ubuntu. If so, I could try VNC'ing into my main machine and try to fix the problem. PS: Also, note that if you apply software breakpoints before the kernel is even mapped, the |
Thanks, and no problem at all--I also took some time off and just got back. I did consider the fact that I might not be able to set SW breakpoints before the guest Linux kernel decompresses itself, and that GDB might not be able to detect the correct target architecture before the kernel jumps into 32- (for i386) or 64-bit (for x86_64) mode. So I've already tried setting HW breakpoints, and SW breakpoints well after seeing dmesg output, but the breakpoints still don't get hit... Again, I was testing against KVM/TCG, because I wanted to set up a "control group" first ;-) I'll debug that tomorrow. There's also a very small chance that guest debugging support is broken for both KVM and TCG, so I'll try HAXM+GDB with your patch and hack too. I still wonder if I could use the same guest kernel, QEMU and GDB commands that you use; if not, I'll just stick with the latest Linux kernel and QEMU. |
OK I just got this working. The key is to use an older Linux kernel as the guest, e.g. v3.10: https://github.com/nathanchance/linux-stable/tree/linux-3.10.y For some reason, QEMU GDB just can't break into recent x86 Linux kernels, regardless of the accelerator being used. For the record, here's how to reproduce my setup:
Up to this point, HAXM works just as well as KVM and TCG. I've also verified that single-stepping ( However, I've noticed another bug: whenever I interrupt the guest manually using Ctrl-C, GDB always shows the last HW or SW breakpoint that was hit, instead of the current guest RIP:
|
For this issue: "However, I've noticed another bug: whenever I interrupt the guest manually using Ctrl-C, GDB always shows the last HW or SW breakpoint that was hit, instead of the current guest RIP:" It should be dirty flag setting issue in Qemu, and be a separate issue. I will do more investigation for a formal fix. |
Above Ctrl C issue should be register sync issue. It could be fixed by qemu change. |
@raphaelning @junxiaoc Thanks a lot for the feedback. And sorry for my late reply, but I have been busy for the past several weeks with other matters. I've checked once again my patch and breakpoints/stepping works as expected (no further patches needed). As for the Ctrl+C issue, it seems @junxiaoc's patch for QEMU fixes it, so it's unrelated to this PR. I'll rebase the patch so that there's no conflicts. |
No problem :-) We were unable to reproduce the breakpoints being ignored issue, so we also thought this PR is good to go. Please go ahead with the rebase and git history cleanup, and then we'll merge this PR. |
49c1a88
to
c67b087
Compare
Signed-off-by: Alexandro Sanchez Bach <alexandro@phi.nz>
All done! |
The new DEBUG ioctl is identified by a feature flag, so it's not tied to a specific HAXM API version.
Thanks! I caught a couple of misleading comments, and then realized I could edit them out myself. I hope you're fine with that. We'll run regular tests before doing a squash-merge. |
Hi @raphaelning -- Are the changes @AlexAltea made to QEMU (AlexAltea/orbital-qemu@77d61c7) to enable guest debug planned to be upstreamed to mainline? |
@AlexAltea is the original author of the QEMU-side patch, but he's okay with us upstreaming it. Our plan is to improve the HAXM-side patch and include it in a new HAXM release (7.4.0) before upstreaming the QEMU-side patch. @junxiaoc is working on a patch to conditionally save/restore DRs and monitor guest writes to DRs (left as TODOs by this PR). He and I have also rebased the QEMU-side patch onto QEMU 3.0.0 and fixed a few issues. If you'd like to try them, @junxiaoc can attach them here. |
This change is to improve robustness and performance for debug feature(#81). DR registers are loaded to guest only if they are updated or current thread is migrated to other CPU core. Host DRs will not be saved and restored unless host process (QEMU) is debugged with hardware breakpoint. When a VCPU is being debugged by the host using HW breakpoint, prevent its DR registers from being tampered with by the guest OS or by SET_REGS ioctl. Verified: - Guest gdb only; - QEMU gdb (debugging guest from host) only; - Starting guest gdb, then using QEMU gdb to debug; - Host debugging QEMU + Guest gdb; - Host debugging QEMU + QEMU gdb; - Host debugging QEMU + QEMU gdb + Guest gdb QEMU gdb works, and guest gdb reports app crash. Please refer to known issues Known issue: - Guest gdb might report app crash When QEMU gdb is enabled This is probabaly because existed guest DR state is discarded as soon as QEMU gdb enables hardware breakpoint. Change-Id: I872ba70b2bc10a11a37425e4e0aa1adad20d093d Signed-off-by: Junxiao Chang <junxiao.chang@intel.com>
This change is to improve robustness and performance for debug feature(#81). DR registers are loaded to guest only if they are updated or current thread is migrated to other CPU core. Host DRs will not be saved and restored unless host process (QEMU) is debugged with hardware breakpoint. When a VCPU is being debugged by the host using HW breakpoint, prevent its DR registers from being tampered with by the guest OS or by SET_REGS ioctl. Verified: - Guest gdb only; - QEMU gdb (debugging guest from host) only; - Starting guest gdb, then using QEMU gdb to debug; - Host debugging QEMU + Guest gdb; - Host debugging QEMU + QEMU gdb; - Host debugging QEMU + QEMU gdb + Guest gdb QEMU gdb works, and guest gdb reports app crash. Please refer to known issues Known issue: - Guest gdb might report app crash When QEMU gdb is enabled This is probabaly because existed guest DR state is discarded as soon as QEMU gdb enables hardware breakpoint. Signed-off-by: Junxiao Chang <junxiao.chang@intel.com>
This change is to improve robustness and performance for debug feature(#81). DR registers are loaded to guest only if they are updated or current thread is migrated to other CPU core. Host DRs will not be saved and restored unless host process (QEMU) is debugged with hardware breakpoint. When a VCPU is being debugged by the host using HW breakpoint, prevent its DR registers from being tampered with by the guest OS or by SET_REGS ioctl. Verified: - Guest gdb only; - QEMU gdb (debugging guest from host) only; - Starting guest gdb, then using QEMU gdb to debug; - Host debugging QEMU + Guest gdb; - Host debugging QEMU + QEMU gdb; - Host debugging QEMU + QEMU gdb + Guest gdb QEMU gdb works, and guest gdb reports app crash. Please refer to known issues Known issue: - Guest gdb might report app crash When QEMU gdb is enabled This is probabaly because existed guest DR state is discarded as soon as QEMU gdb enables hardware breakpoint. Signed-off-by: Junxiao Chang <junxiao.chang@intel.com>
patches_debug_feature_in_qemu.zip Attached are qemu patches which includes "Ctrl C" fix, Mac fix, and patch which rebases to qemu 3.0.0. |
This implements the debugging features proposed at #66. More information and questions can be found on that thread.
Motivation
Guest debugging is extremely relevant to debugging bootloaders/microkernels, or in my case, kernels that do not include debugging backends. Aside from the intrinsic value of this feature, it's also really beneficial to developers/researchers as neither WHPX (Windows) nor HVF (macOS) support guest debugging and while they remain closed-source this will be probably hard to change.
Design
Interface
HAX_VCPU_IOCTL_DEBUG
ioctl (i.e. QEMU-to-HAXM) with a structure containing:This is meant to manage the guest state accordingly: e.g. if guest debugging via hardware breakpoints is enabled we should protect the guest drN values in the
exit_dr_access
handler. Additionally, to know where to redirect #BP, e.g. if software breakpoints are enabled forward to QEMU, otherwise forward to VM.They already are a self-contained way of describing hardware breakpoints, so no need to wrap that information in another layer of abstraction.
HAX_EXIT_DEBUG
exit (i.e. HAXM-to-QEMU) with a structure containing:Features
int 3h
(0xCC). Their lifecycle is:break *
command.hax_sw_brakpoint
object and patches instruction withint 3h
.#BP
.continue
command.hbreak *
command.#DB
, providing the value for dr6.continue
command.si
orni
command.HAX_DEBUG_STEP
debug control flag with the previous ioctl.#DB
, caused by the trap flag.HAX_DEBUG_STEP
debug control flag with the previous ioctl.Documentation
Although the description above might serve as a rough overview to review this patch, I haven't properly documented the interface changes on API.md. We can either leave that for a future PR, or if you want that already, let me know if you agree with the current proposal and I'll start documenting everything right away.
Samples
Since this patch adds a new feature for userland applications, we probably should provide some example. QEMU's HAXM accelerator seems to be the reference implementation, so I've added the debugging features to my QEMU fork at AlexAltea/orbital-qemu@77d61c7. Please feel free to upstream these changes to QEMU if you consider it appropriate.