KCFI: Add KCFI arity indicator support #138368

rcvalle · 2025-03-11T19:08:08Z

Adds KCFI arity indicator support to the Rust compiler (see #138311, llvm/llvm-project#121070, and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).

rustbot · 2025-03-11T19:08:13Z

r? @cjgillot

rustbot has assigned @cjgillot.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

rustbot · 2025-03-11T19:08:15Z

Some changes occurred in tests/ui/sanitizer

cc @rust-lang/project-exploit-mitigations, @rcvalle

Some changes occurred in tests/codegen/sanitizer

cc @rust-lang/project-exploit-mitigations, @rcvalle

ojeda · 2025-03-11T19:37:13Z

Thanks Ramon for the very quick PR!

I think the matching LLVM PR is llvm/llvm-project#121070, not the other one, right?

Also, would it make sense to add a quick check for the instructions generated, so that we are sure LLVM is actually doing something? e.g. one of the ones from https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/X86/kcfi-arity.ll

rcvalle · 2025-03-11T20:21:48Z

I think the matching LLVM PR is llvm/llvm-project#121070, not the other one, right?

Thanks for pointing that out! Sorry, I confused them. Fixed.

Also, would it make sense to add a quick check for the instructions generated, so that we are sure LLVM is actually doing something? e.g. one of the ones from https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/X86/kcfi-arity.ll

For module flags, we usually add tests up to to verifying that flags are added only because the side effects of adding them are not implemented in the Rust compiler (so we don't need to keep track of changes and update tests when changes are made in LLVM). Would that be okay for you?

maurer · 2025-03-11T20:31:18Z

Do we want to add a check when this flag is set that the LLVM is new enough? I think LLVM will silently ignore unknown module tags, so as is, this could result in someone setting -Zsanitizer-kcfi-arity with a rustc that can't actually produce the new-style tags, and being confused by tags that are incompatible with a clang that has that features enabled.

ojeda · 2025-03-11T21:12:52Z

For module flags, we usually add tests up to to verifying that flags are added only because the side effects of adding them are not implemented in the Rust compiler (so we don't need to keep track of changes and update tests when changes are made in LLVM). Would that be okay for you?

Up to the Rust compiler team, of course -- I am not sure what their policy is.

In any case, to be clear, I only meant it as a smoke test, rather than testing everything that LLVM does. In other words, just to be sure that LLVM actually did something with the module flag, rather than replicating every test in LLVM.

From a quick test, it seems LLVM (well, at least llc in Compiler Explorer) does not say anything if there is a typo in the module flag in the IR.

So what I thought is to check e.g. that we get, say, a mov edx, ... with the flag instead of a mov eax, ... without the flag.

davidtwco · 2025-03-12T18:08:24Z

r? @davidtwco

rcvalle · 2025-03-12T23:57:53Z

Do we want to add a check when this flag is set that the LLVM is new enough? I think LLVM will silently ignore unknown module tags, so as is, this could result in someone setting -Zsanitizer-kcfi-arity with a rustc that can't actually produce the new-style tags, and being confused by tags that are incompatible with a clang that has that features enabled.

Yes, I think it's a good idea. Done. It seems it's the first time such a check is added for an LLVM-related option so I hope I did it right.

P.S. the implementation chosen for the KCFI arity indicator was to use different registers in the mov instruction as the arity indicator instead of augmenting the KCFI tag with this information. This is entirely implemented in LLVM and visible only in the target architecture assembly (and that is why it's tricky and probably not the right place to test it in the Rust compiler tests).

rcvalle · 2025-03-13T00:15:46Z

From a quick test, it seems LLVM (well, at least llc in Compiler Explorer) does not say anything if there is a typo in the module flag in the IR.

So what I thought is to check e.g. that we get, say, a mov edx, ... with the flag instead of a mov eax, ... without the flag.

For non module flags, we test option's side effects (which are actually implemented in the Rust compiler, such as CFI/KCFI checks) in the LLVM assembly, not the target architecture assembly, and since the implementation for the KCFI arity indicator uses different registers in a mov operation in the target architecture assembly as the arity indicator, we'd have to go even further out from what we usually do module flags (which is just test that the module flag was emitted and emitted correctly, see #129373 as an example), and test the target architecture assembly after the KCFI pass is applied and the target architecture assembly is emitted. I'll see if there is a good way to do a smoke test for this, but I still think this a contract and that it's LLVM's responsibility to test the actual side effects of the module flag-- we'd still caught with out test if we didn't pass or if there was a typo in the module flag.

ojeda · 2025-03-13T01:38:34Z

This is entirely implemented in LLVM and visible only in the target architecture assembly (and that is why it's tricky and probably not the right place to test it in the Rust compiler tests).

For non module flags, we test option's side effects (which are actually implemented in the Rust compiler, such as CFI/KCFI checks) in the LLVM assembly, not the target architecture assembly,

Hmm... I am not sure what you mean. Perhaps there is a different policy for LLVM module flags vs. other things like LLVM attributes, but the Rust compiler has assembly tests in tests/assembly/.

For instance, when I added -Zno-jump-tables and -Zfunction-return, I added both an LLVM IR level test as well as an asm level test: 2d47622 ("Add -Zfunction-return={keep,thunk-extern} option"), a65ec44 ("Add -Zno-jump-tables").

Not doing anything is not the end of the world, but it means it is fairly easy for LLVM to e.g. rename the module flags and we wouldn't notice until someone builds the kernel with that compiler and KCFI etc. enabled, as far as I understand, and then it is a mess on the kernel side to add Kconfigs to workaround it.

If adding the test is complex, then sure, we can avoid it. But I don't understand what the issue is adding it. I may be missing a policy thing here, but if it is just the technical side, what is the issue?

In fact, for similar reasons, I would suggest adding one to the feature the PR you linked adds.

rcvalle · 2025-03-13T04:24:56Z

Hmm... I am not sure what you mean. Perhaps there is a different policy for LLVM module flags vs. other things like LLVM attributes, but the Rust compiler has assembly tests in tests/assembly/.

At least that is my understanding (i.e., to not duplicate LLVM tests and perform tests in LLVM assembly as much as possible). (Notice there are no sanitizer tests in there.)

For instance, when I added -Zno-jump-tables and -Zfunction-return, I added both an LLVM IR level test as well as an asm level test: 2d47622 ("Add -Zfunction-return={keep,thunk-extern} option"), a65ec44 ("Add -Zno-jump-tables").

Not doing anything is not the end of the world, but it means it is fairly easy for LLVM to e.g. rename the module flags and we wouldn't notice until someone builds the kernel with that compiler and KCFI etc. enabled, as far as I understand, and then it is a mess on the kernel side to add Kconfigs to workaround it.

Adding it is not the end the world either--I'll just add it :) (But if LLVM starts to arbitrarily changing module flags and like this, I'd argue we'd have much more serious problems.)

In fact, for similar reasons, I would suggest adding one to the feature the PR you linked adds.

Since I'll create the sanitizer/ directory structure there and add a test for this module flag, I'll revisit the other module flags and add similar tests for them accordingly.

ojeda · 2025-03-13T12:28:43Z

At least that is my understanding (i.e., to not duplicate LLVM tests and perform tests in LLVM assembly as much as possible).

Personally, I don't consider those duplication of LLVM tests, but rather smoke tests to ensure things look correct end-to-end regardless of how a backend (or which one, for things that work across several backends) is used.

(Notice there are no sanitizer tests in there.)

There are some for security-related features, so it may be that nobody considered adding some for sanitizers.

(But if LLVM starts to arbitrarily changing module flags and like this, I'd argue we'd have much more serious problems.)

Definitely, but the idea is to detect it as early as possible if it happens -- as far as I understand, LLVM IR is not supposed to be stable.

In other words, if it happens, from the point of view of users, it would probably be rustc's fault (even if LLVM should warn about typos or such changes nevertheless).

Since I'll create the sanitizer/ directory structure there and add a test for this module flag, I'll revisit the other module flags and add similar tests for them accordingly.

Thanks, I really appreciate that. I think it should not cost much CI time, and it may make you/us sleep better... :)

rcvalle · 2025-03-13T18:58:55Z

tests/assembly/sanitizer/kcfi/emit-arity-indicator.rs

+    // CHECK-NEXT:  nop
+    // CHECK-NEXT:  nop
+    // CHECK-NEXT:  nop
+    // CHECK-NEXT:  mov eax, 653723426


@ojeda @scottconstable I think I've it mostly done, but I suspect this test isn't right. If I understood it correctly, according to the table in llvm/llvm-project#121070 (comment), this operation should use EDX(?) because callers would pass two parameters before calling foo in RDI and RSI(?).

(Note that this isn't actually arity that is being indicated. This function's arity is three (3): two parameters, and a return value.)

Hmm... I am not sure what you mean. Arity is the number of parameters, without the return value, so it should use 2, thus it should use EDX.

From a quick test with Clang, that is the case. I also get the same result if I take rustc's LLVM IR and add the arity module flag.

Are you seeing a different result?

By the way, why don't we test a simpler function, e.g. one that just sums two numbers? i.e. the intention here is to notice the difference between the disabled and enabled cases.

Yes, the return value is not included. You're right. What I was noting there, if I understood it correctly, is that "arity" is being defined there in that PR and table at the ABI level with certain things being excluded such as parameters passed on the stack, which would cause it the "arity" defined there to diverge from the actual arity of a function, but that is not relevant here, I was just pointing it out again as I did when it was initially proposed.

Let me try a simpler pair of functions and get back to you.

I'm still getting EAX used with a new test using a simpler pair of functions with different arity. My LLVM version is

rcvalle@rcvalle:~/rust$ build/x86_64-unknown-linux-gnu/stage1/bin/rustc --verbose --version rustc 1.87.0-dev binary: rustc commit-hash: unknown commit-date: unknown host: x86_64-unknown-linux-gnu release: 1.87.0-dev LLVM version: 20.1.0

Which should include it, right?

Which should include it, right?

I don't think so, upstream LLVM 20.1.0 does not contain it.

So I wanted this one landed in nightly mostly so that the kernel people could test it easily by downloading the nightly, but if rustc's LLVM does not have the patch anyway, that does not help much.

I also don't want to give too much trouble to rustc maintainers applying a patch to their LLVM either just for this.

I will probably build a test toolchain for Peter with this PR applied for the time being.

SG! Let me know if there is anything I can help with in the meantime. If needed, I can also remove the checks and test and get it merged, and add these later. Just let me know!

If you look at the llvm-main tag, you can see that we regularly condition things things to not-yet-released LLVM (e.g. LLVM 21 in this case) so that they start working as soon as released, or if rustc is built against a pre-release LLVM

Will take a look. Thank you! Should we condition this on LLVM 21? Or will there be any minor version release that will have this feature?

I would condition this on LLVM 21 - My understanding of current LLVM version numbers are:

"Version X.0.Y" - Development version that will eventually become version X. Y is mostly meaningless, features may come and go.

"Version X.1.Y" - LLVM Version X, stable/released. Patchlevel Y updated to indicate bugfix backports.

So I'd just condition on the major version, e.g. >= 21.0, so that dev versions start working immediately.

There will be LLVM builds which report 21.0 which do not have support for this, but people using unreleased LLVMs usually know they need to be careful of this kind of thing, and there's not much we can do to help them. This also matches how we treat the .0 versions when dealing with LLVM compatibility issues like removed or changed APIs - we assume that if you're building against a .0 version, you're building against the latest.

Adds KCFI arity indicator support to the Rust compiler (see rust-lang#138311, llvm/llvm-project#121070, and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).

ojeda · 2025-03-14T19:32:45Z

compiler/rustc_codegen_llvm/src/context.rs

+        // KCFI arity requires LLVM 20.1.0 or later.
+        if llvm_version < (20, 1, 0) {


This is not the case -- there is no tag yet in upstream LLVM that contains the feature, sadly.

rustbot assigned cjgillot Mar 11, 2025

rustbot added PG-exploit-mitigations S-waiting-on-review T-compiler labels Mar 11, 2025

rcvalle force-pushed the rust-kcfi-arity branch 2 times, most recently from 9df55b1 to 3cf7741 Compare March 11, 2025 19:12

ojeda mentioned this pull request Mar 11, 2025

rustc wanted features & bugfixes Rust-for-Linux/linux#355

Open

56 tasks