Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KCFI: Add KCFI arity indicator support #138368

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

rcvalle
Copy link
Member

@rcvalle rcvalle commented Mar 11, 2025

@rustbot
Copy link
Collaborator

rustbot commented Mar 11, 2025

r? @cjgillot

rustbot has assigned @cjgillot.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added PG-exploit-mitigations Project group: Exploit mitigations S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Mar 11, 2025
@rustbot
Copy link
Collaborator

rustbot commented Mar 11, 2025

Some changes occurred in tests/ui/sanitizer

cc @rust-lang/project-exploit-mitigations, @rcvalle

Some changes occurred in tests/codegen/sanitizer

cc @rust-lang/project-exploit-mitigations, @rcvalle

@rcvalle rcvalle force-pushed the rust-kcfi-arity branch 2 times, most recently from 9df55b1 to 3cf7741 Compare March 11, 2025 19:12
@ojeda
Copy link
Contributor

ojeda commented Mar 11, 2025

Thanks Ramon for the very quick PR!

I think the matching LLVM PR is llvm/llvm-project#121070, not the other one, right?

Also, would it make sense to add a quick check for the instructions generated, so that we are sure LLVM is actually doing something? e.g. one of the ones from https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/X86/kcfi-arity.ll

@rust-log-analyzer

This comment has been minimized.

@rcvalle rcvalle force-pushed the rust-kcfi-arity branch 2 times, most recently from e84b08f to 60df964 Compare March 11, 2025 20:17
@rcvalle
Copy link
Member Author

rcvalle commented Mar 11, 2025

I think the matching LLVM PR is llvm/llvm-project#121070, not the other one, right?

Thanks for pointing that out! Sorry, I confused them. Fixed.

Also, would it make sense to add a quick check for the instructions generated, so that we are sure LLVM is actually doing something? e.g. one of the ones from https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/X86/kcfi-arity.ll

For module flags, we usually add tests up to to verifying that flags are added only because the side effects of adding them are not implemented in the Rust compiler (so we don't need to keep track of changes and update tests when changes are made in LLVM). Would that be okay for you?

@maurer
Copy link
Contributor

maurer commented Mar 11, 2025

Do we want to add a check when this flag is set that the LLVM is new enough? I think LLVM will silently ignore unknown module tags, so as is, this could result in someone setting -Zsanitizer-kcfi-arity with a rustc that can't actually produce the new-style tags, and being confused by tags that are incompatible with a clang that has that features enabled.

@rust-log-analyzer

This comment has been minimized.

@ojeda
Copy link
Contributor

ojeda commented Mar 11, 2025

For module flags, we usually add tests up to to verifying that flags are added only because the side effects of adding them are not implemented in the Rust compiler (so we don't need to keep track of changes and update tests when changes are made in LLVM). Would that be okay for you?

Up to the Rust compiler team, of course -- I am not sure what their policy is.

In any case, to be clear, I only meant it as a smoke test, rather than testing everything that LLVM does. In other words, just to be sure that LLVM actually did something with the module flag, rather than replicating every test in LLVM.

From a quick test, it seems LLVM (well, at least llc in Compiler Explorer) does not say anything if there is a typo in the module flag in the IR.

So what I thought is to check e.g. that we get, say, a mov edx, ... with the flag instead of a mov eax, ... without the flag.

@davidtwco
Copy link
Member

r? @davidtwco

@rustbot rustbot assigned davidtwco and unassigned cjgillot Mar 12, 2025
@rcvalle
Copy link
Member Author

rcvalle commented Mar 12, 2025

Do we want to add a check when this flag is set that the LLVM is new enough? I think LLVM will silently ignore unknown module tags, so as is, this could result in someone setting -Zsanitizer-kcfi-arity with a rustc that can't actually produce the new-style tags, and being confused by tags that are incompatible with a clang that has that features enabled.

Yes, I think it's a good idea. Done. It seems it's the first time such a check is added for an LLVM-related option so I hope I did it right.

P.S. the implementation chosen for the KCFI arity indicator was to use different registers in the mov instruction as the arity indicator instead of augmenting the KCFI tag with this information. This is entirely implemented in LLVM and visible only in the target architecture assembly (and that is why it's tricky and probably not the right place to test it in the Rust compiler tests).

@rcvalle
Copy link
Member Author

rcvalle commented Mar 13, 2025

From a quick test, it seems LLVM (well, at least llc in Compiler Explorer) does not say anything if there is a typo in the module flag in the IR.

So what I thought is to check e.g. that we get, say, a mov edx, ... with the flag instead of a mov eax, ... without the flag.

For non module flags, we test option's side effects (which are actually implemented in the Rust compiler, such as CFI/KCFI checks) in the LLVM assembly, not the target architecture assembly, and since the implementation for the KCFI arity indicator uses different registers in a mov operation in the target architecture assembly as the arity indicator, we'd have to go even further out from what we usually do module flags (which is just test that the module flag was emitted and emitted correctly, see #129373 as an example), and test the target architecture assembly after the KCFI pass is applied and the target architecture assembly is emitted. I'll see if there is a good way to do a smoke test for this, but I still think this a contract and that it's LLVM's responsibility to test the actual side effects of the module flag-- we'd still caught with out test if we didn't pass or if there was a typo in the module flag.

@rust-log-analyzer

This comment has been minimized.

@ojeda
Copy link
Contributor

ojeda commented Mar 13, 2025

This is entirely implemented in LLVM and visible only in the target architecture assembly (and that is why it's tricky and probably not the right place to test it in the Rust compiler tests).

For non module flags, we test option's side effects (which are actually implemented in the Rust compiler, such as CFI/KCFI checks) in the LLVM assembly, not the target architecture assembly,

Hmm... I am not sure what you mean. Perhaps there is a different policy for LLVM module flags vs. other things like LLVM attributes, but the Rust compiler has assembly tests in tests/assembly/.

For instance, when I added -Zno-jump-tables and -Zfunction-return, I added both an LLVM IR level test as well as an asm level test: 2d47622 ("Add -Zfunction-return={keep,thunk-extern} option"), a65ec44 ("Add -Zno-jump-tables").

Not doing anything is not the end of the world, but it means it is fairly easy for LLVM to e.g. rename the module flags and we wouldn't notice until someone builds the kernel with that compiler and KCFI etc. enabled, as far as I understand, and then it is a mess on the kernel side to add Kconfigs to workaround it.

If adding the test is complex, then sure, we can avoid it. But I don't understand what the issue is adding it. I may be missing a policy thing here, but if it is just the technical side, what is the issue?

In fact, for similar reasons, I would suggest adding one to the feature the PR you linked adds.

@rust-log-analyzer

This comment has been minimized.

@rcvalle
Copy link
Member Author

rcvalle commented Mar 13, 2025

Hmm... I am not sure what you mean. Perhaps there is a different policy for LLVM module flags vs. other things like LLVM attributes, but the Rust compiler has assembly tests in tests/assembly/.

At least that is my understanding (i.e., to not duplicate LLVM tests and perform tests in LLVM assembly as much as possible). (Notice there are no sanitizer tests in there.)

For instance, when I added -Zno-jump-tables and -Zfunction-return, I added both an LLVM IR level test as well as an asm level test: 2d47622 ("Add -Zfunction-return={keep,thunk-extern} option"), a65ec44 ("Add -Zno-jump-tables").

Not doing anything is not the end of the world, but it means it is fairly easy for LLVM to e.g. rename the module flags and we wouldn't notice until someone builds the kernel with that compiler and KCFI etc. enabled, as far as I understand, and then it is a mess on the kernel side to add Kconfigs to workaround it.

Adding it is not the end the world either--I'll just add it :) (But if LLVM starts to arbitrarily changing module flags and like this, I'd argue we'd have much more serious problems.)

In fact, for similar reasons, I would suggest adding one to the feature the PR you linked adds.

Since I'll create the sanitizer/ directory structure there and add a test for this module flag, I'll revisit the other module flags and add similar tests for them accordingly.

@rust-log-analyzer

This comment has been minimized.

@ojeda
Copy link
Contributor

ojeda commented Mar 13, 2025

At least that is my understanding (i.e., to not duplicate LLVM tests and perform tests in LLVM assembly as much as possible).

Personally, I don't consider those duplication of LLVM tests, but rather smoke tests to ensure things look correct end-to-end regardless of how a backend (or which one, for things that work across several backends) is used.

(Notice there are no sanitizer tests in there.)

There are some for security-related features, so it may be that nobody considered adding some for sanitizers.

(But if LLVM starts to arbitrarily changing module flags and like this, I'd argue we'd have much more serious problems.)

Definitely, but the idea is to detect it as early as possible if it happens -- as far as I understand, LLVM IR is not supposed to be stable.

In other words, if it happens, from the point of view of users, it would probably be rustc's fault (even if LLVM should warn about typos or such changes nevertheless).

Since I'll create the sanitizer/ directory structure there and add a test for this module flag, I'll revisit the other module flags and add similar tests for them accordingly.

Thanks, I really appreciate that. I think it should not cost much CI time, and it may make you/us sleep better... :)

// CHECK-NEXT: nop
// CHECK-NEXT: nop
// CHECK-NEXT: nop
// CHECK-NEXT: mov eax, 653723426
Copy link
Member Author

@rcvalle rcvalle Mar 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ojeda @scottconstable I think I've it mostly done, but I suspect this test isn't right. If I understood it correctly, according to the table in llvm/llvm-project#121070 (comment), this operation should use EDX(?) because callers would pass two parameters before calling foo in RDI and RSI(?).

(Note that this isn't actually arity that is being indicated. This function's arity is three (3): two parameters, and a return value.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... I am not sure what you mean. Arity is the number of parameters, without the return value, so it should use 2, thus it should use EDX.

From a quick test with Clang, that is the case. I also get the same result if I take rustc's LLVM IR and add the arity module flag.

Are you seeing a different result?

By the way, why don't we test a simpler function, e.g. one that just sums two numbers? i.e. the intention here is to notice the difference between the disabled and enabled cases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the return value is not included. You're right. What I was noting there, if I understood it correctly, is that "arity" is being defined there in that PR and table at the ABI level with certain things being excluded such as parameters passed on the stack, which would cause it the "arity" defined there to diverge from the actual arity of a function, but that is not relevant here, I was just pointing it out again as I did when it was initially proposed.

Let me try a simpler pair of functions and get back to you.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still getting EAX used with a new test using a simpler pair of functions with different arity. My LLVM version is

rcvalle@rcvalle:~/rust$ build/x86_64-unknown-linux-gnu/stage1/bin/rustc --verbose --version
rustc 1.87.0-dev
binary: rustc
commit-hash: unknown
commit-date: unknown
host: x86_64-unknown-linux-gnu
release: 1.87.0-dev
LLVM version: 20.1.0

Which should include it, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which should include it, right?

I don't think so, upstream LLVM 20.1.0 does not contain it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I wanted this one landed in nightly mostly so that the kernel people could test it easily by downloading the nightly, but if rustc's LLVM does not have the patch anyway, that does not help much.

I also don't want to give too much trouble to rustc maintainers applying a patch to their LLVM either just for this.

I will probably build a test toolchain for Peter with this PR applied for the time being.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SG! Let me know if there is anything I can help with in the meantime. If needed, I can also remove the checks and test and get it merged, and add these later. Just let me know!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at the llvm-main tag, you can see that we regularly condition things things to not-yet-released LLVM (e.g. LLVM 21 in this case) so that they start working as soon as released, or if rustc is built against a pre-release LLVM

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will take a look. Thank you! Should we condition this on LLVM 21? Or will there be any minor version release that will have this feature?

Copy link
Contributor

@maurer maurer Mar 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would condition this on LLVM 21 - My understanding of current LLVM version numbers are:

  • "Version X.0.Y" - Development version that will eventually become version X. Y is mostly meaningless, features may come and go.
  • "Version X.1.Y" - LLVM Version X, stable/released. Patchlevel Y updated to indicate bugfix backports.

So I'd just condition on the major version, e.g. >= 21.0, so that dev versions start working immediately.

There will be LLVM builds which report 21.0 which do not have support for this, but people using unreleased LLVMs usually know they need to be careful of this kind of thing, and there's not much we can do to help them. This also matches how we treat the .0 versions when dealing with LLVM compatibility issues like removed or changed APIs - we assume that if you're building against a .0 version, you're building against the latest.

Comment on lines +334 to +335
// KCFI arity requires LLVM 20.1.0 or later.
if llvm_version < (20, 1, 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the case -- there is no tag yet in upstream LLVM that contains the feature, sadly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PG-exploit-mitigations Project group: Exploit mitigations S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants