Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RISC-V intrinsics with one or more target features are not inlined #137293

Open
brkydnc opened this issue Feb 19, 2025 · 9 comments
Open

RISC-V intrinsics with one or more target features are not inlined #137293

brkydnc opened this issue Feb 19, 2025 · 9 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such O-riscv Target: RISC-V architecture T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@brkydnc
Copy link

brkydnc commented Feb 19, 2025

The Problem

The compiler does not inline RISC-V intrinsics that can be enabled with one or more target features (i.e. #[target_feature(enable = "zkne", enable = "zknd")]) unless all of them are present.

Expected Behavior

At least one enabled feature should be enough for the compiler to generate inlined code.

Explanation

RISC-V intrinsics such as sha512sig0 require only one target feature:

#[target_feature(enable = "zknh")]
pub unsafe fn sha512sig0(rs1: u64) -> u64;

When zknh is enabled, the compiler inlines the intrinsic with the appropriate instruction on nightly (https://godbolt.org/z/Kozx3KE11):

example::main::hb1a6ee9bd645943a:
        addi    sp, sp, -16
        sha512sig0      a0, zero
        sd      a0, 8(sp)
        addi    a0, sp, 8
        addi    sp, sp, 16
        ret

Unfortunately, this is not the case for intrinsics that can be enabled by one or more target features. Let's take a look at aes64ks2:

#[target_feature(enable = "zkne", enable = "zknd")]
pub unsafe fn aes64ks2(rs1: u64, rs2: u64) -> u64;

The documentation states that the intrinsic is "safe to use if the zkne OR zknd target feature is present." which means that enabling at least one of them should be enough to inline the intrinsic.

However, the compiler exhibits this behavior only if two of them are enabled at the same time.

The compiler generates a stub if only one of zkne (https://godbolt.org/z/5TxeKo1sG), or zknd (https://godbolt.org/z/KKqz8MWYY) is enabled:

core::core_arch::riscv64::zk::aes64ks2::ha6614eafefa6ee36:
        aes64ks2        a0, zero, zero
        ret

example::main::hb1a6ee9bd645943a:
        addi    sp, sp, -16
        sd      ra, 8(sp)
        call    core::core_arch::riscv64::zk::aes64ks2::ha6614eafefa6ee36
        sd      a0, 0(sp)
        mv      a0, sp
        ld      ra, 8(sp)
        addi    sp, sp, 16
        ret

If they are both enabled, the compiler inlines the intrinsic as expected (https://godbolt.org/z/3fEjfr1v8):

example::main::hb1a6ee9bd645943a:
        addi    sp, sp, -16
        aes64ks2        a0, zero, zero
        sd      a0, 8(sp)
        addi    a0, sp, 8
        addi    sp, sp, 16
        ret

Meta

Here is the output of the compiler:

rustc 1.87.0-nightly (5bc623145 2025-02-16)
binary: rustc
commit-hash: 5bc62314547c7639484481f62f218156697cfef0
commit-date: 2025-02-16
host: x86_64-unknown-linux-gnu
release: 1.87.0-nightly
LLVM version: 19.1.7
@brkydnc brkydnc added the C-bug Category: This is a bug. label Feb 19, 2025
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Feb 19, 2025
@brkydnc brkydnc changed the title RISC-V instrinsics with one or more target features are not inlined RISC-V intrinsics with one or more target features are not inlined Feb 19, 2025
@bjorn3
Copy link
Member

bjorn3 commented Feb 20, 2025

I'm pretty sure #[target_feature(enable = "zkne", enable = "zknd")] is interpreted as enabling both the zkne and zknd features within the body of the intrinsic implementation and thus requiring both to be enabled when calling it. I don't think we have support for enabling one or the other feature.

@bjorn3 bjorn3 added the O-riscv Target: RISC-V architecture label Feb 20, 2025
@saethlin saethlin added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label Feb 20, 2025
@sayantn
Copy link
Contributor

sayantn commented Feb 20, 2025

We do not have support for ORing target features. So the function requires both zkne and zknd. But if I am not too wrong, this is incorrect behaviour. The riscv manual specifies that this function should be available whenever any one of the features is available.

See the scalar crypto doc for riscv https://github.com/riscv/riscv-crypto/releases/tag/v1.0.1-scalar

cc @Amanieu

@Noratrieb Noratrieb added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Mar 1, 2025
@brkydnc
Copy link
Author

brkydnc commented Mar 4, 2025

Is there any non-trivial reason why Rust does not support this behavior?

@bjorn3
Copy link
Member

bjorn3 commented Mar 4, 2025

When codegening the function

#[target_feature(biskeshed_enable_one_of("zkne", "zknd")]
pub unsafe fn aes64ks2(rs1: u64, rs2: u64) -> u64 {
   // ...
}

which of the two features should be enabled? You have to enable at least one of them to be able to call the inner intrinsic, but there is no way to know which of the two features the caller has enabled. The aes64ks2 function is not guaranteed to be inlined so just inheriting the feature of the function it were to be inlined into doesn't work.

@Amanieu
Copy link
Member

Amanieu commented Mar 4, 2025

ORing features is functionally equivalent to having a new feature called zkne_or_zknd which both zkne and zknd depend on. We could expose this as an explicit feature, or implicitly via some syntax (e.g. target_feature(enable = "zkne|zknd")).

Incidentally this is very similar to rust-lang/compiler-team#778.

@bjorn3
Copy link
Member

bjorn3 commented Mar 4, 2025

Does LLVM have any support for that?

@brkydnc
Copy link
Author

brkydnc commented Mar 4, 2025

Yes, it has the support here and here.

@bjorn3
Copy link
Member

bjorn3 commented Mar 4, 2025

I don't mean if LLVM supports it for declaring intrinsics, but if it supports it when defining functions. If stdarch writes

#![feature(link_llvm_intrinsics, abi_unadjusted)]

unsafe extern "unadjusted" {
    #[link_name = "llvm.riscv.aes64ks2"]
    fn _aes64ks2(rs1: i64, rs2: i64) -> i64;
}

#[target_feature(enable = "zkne", enable = "zknd")]
pub unsafe fn aes64ks2(rs1: u64, rs2: u64) -> u64 {
    _aes64ks2(rs1 as i64, rs2 as i64) as u64
}

that currently results in the following LLVM IR:

; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none) uwtable
define noundef i64 @aes64ks2(i64 noundef %rs1, i64 noundef %rs2) unnamed_addr #0 !dbg !8 {
start:
  %_3 = tail call noundef i64 @llvm.riscv.aes64ks2(i64 noundef %rs1, i64 noundef %rs2) #2, !dbg !19
  ret i64 %_3, !dbg !20
}

; Function Attrs: mustprogress nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare i64 @llvm.riscv.aes64ks2(i64, i64) unnamed_addr #1

attributes #0 = { mustprogress nofree norecurse nosync nounwind willreturn memory(none) uwtable "target-cpu"="generic-rv64" "target-features"="+m,+a,+f,+d,+c,+zkne,+zknd" }
attributes #1 = { mustprogress nocallback nofree nosync nounwind speculatable willreturn memory(none) }
attributes #2 = { nounwind }

What must be the value of the target-features attribute to enable zkne or zknd but not both like it currently does?

@workingjubilee
Copy link
Member

Is there any non-trivial reason why Rust does not support this behavior?

It seems like the answer is "there's not actually sufficient underlying codegen backend support for doing the thing you think it does".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such O-riscv Target: RISC-V architecture T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

8 participants