add nvptx_target_feature #138689
r? @wesleywiser rustbot has assigned @wesleywiser. Use |
It looks like the failing build environment is using LLVM-18 (sm_100 onward were added recently in llvm/llvm-project#124155). I could remove those entirely (such hardware isn't available yet) or guard them based on LLVM version (how?).
IIUC LLVM exposes these as target CPUs ( |
("sm_100", Unstable(sym::nvptx_target_feature), &["sm_90a"]),
("sm_100a", Unstable(sym::nvptx_target_feature), &["sm_100"]),
("sm_101", Unstable(sym::nvptx_target_feature), &["sm_100a"]),
("sm_101a", Unstable(sym::nvptx_target_feature), &["sm_101"]),
("sm_120", Unstable(sym::nvptx_target_feature), &["sm_101a"]),
("sm_120a", Unstable(sym::nvptx_target_feature), &["sm_120"]),
I think it is better to comment these out until rust supports the required LLVM version.
// TODO: requires LLVM 21+
// ("sm_100", Unstable(sym::nvptx_target_feature), &["sm_90a"]),
// ("sm_100a", Unstable(sym::nvptx_target_feature), &["sm_100"]),
// ("sm_101", Unstable(sym::nvptx_target_feature), &["sm_100a"]),
// ("sm_101a", Unstable(sym::nvptx_target_feature), &["sm_101"]),
// ("sm_120", Unstable(sym::nvptx_target_feature), &["sm_101a"]),
// ("sm_120a", Unstable(sym::nvptx_target_feature), &["sm_120"]),
These are in LLVM-20, which is what Rust's src/llvm-project submodule uses (and thus, I think, all the official builds). The failing CI build environment is using LLVM-18, but I don't know what the convention is for such backward compatibility. I'm fine just commenting these out for now.
@gonzalobg Is it correct for these features to represent a total order, or is there a more general DAG?
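To make the total-order question concrete, here is a minimal sketch that follows the "implied features" column of the six entries under review. The `IMPLIED` table and the `implied_closure` helper are illustrative names, not rustc APIs, and the stability field is dropped to keep the example self-contained.

```rust
use std::collections::BTreeSet;

/// (feature, directly implied features), copied from the entries under review.
/// The stability field is omitted; this table is an illustration only.
const IMPLIED: &[(&str, &[&str])] = &[
    ("sm_100", &["sm_90a"]),
    ("sm_100a", &["sm_100"]),
    ("sm_101", &["sm_100a"]),
    ("sm_101a", &["sm_101"]),
    ("sm_120", &["sm_101a"]),
    ("sm_120a", &["sm_120"]),
];

/// Collects every feature reachable from `feature` by following the table.
/// Features without an entry in this excerpt (such as sm_90a) end the walk.
fn implied_closure(feature: &str) -> BTreeSet<&'static str> {
    let mut seen = BTreeSet::new();
    let mut stack: Vec<&str> = vec![feature];
    while let Some(current) = stack.pop() {
        for (name, implied) in IMPLIED {
            if *name == current {
                for &next in implied.iter() {
                    // Only revisit features we have not seen yet.
                    if seen.insert(next) {
                        stack.push(next);
                    }
                }
            }
        }
    }
    seen
}

fn main() {
    println!("{:?}", implied_closure("sm_120a"));
}
```

As written, the entries form a single chain, so enabling `sm_120a` would transitively enable every other listed feature, including the arch-specific `sm_90a`.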
For SM Compute Capabilities:
- The non _a capabilities extend each other: sm_90 is a subset of sm_100.
- The _a capability of some SM:
  - Extends the capability of only that one SM: sm_90 is a subset of sm_90a (and therefore sm_80 is a subset of sm_90a).
  - Contains some unique functionality: sm_90a is not a subset of sm_100a (and by extension sm_100).
- An SM Compute Capability implies PTX ISA version >= some_version (e.g. sm_90 implies ptx>=78, or said differently, sm_90 can't be used if ptx < 78).
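The subset rules above can be sketched as a small predicate. This is a rough model, not rustc or LLVM code: `Sm`, `parse_sm`, and `includes` are hypothetical names, and the sketch assumes that non _a capabilities are ordered by their numeric suffix.

```rust
/// A parsed SM capability name such as "sm_90" or "sm_90a".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Sm {
    number: u32,
    /// True for the arch-specific `_a` variant.
    arch_specific: bool,
}

/// Parses names of the form "sm_<digits>" with an optional trailing "a".
fn parse_sm(name: &str) -> Option<Sm> {
    let rest = name.strip_prefix("sm_")?;
    let (digits, arch_specific) = match rest.strip_suffix('a') {
        Some(d) => (d, true),
        None => (rest, false),
    };
    Some(Sm { number: digits.parse().ok()?, arch_specific })
}

/// Returns true if `other` is a subset of `target`.
/// Assumption: non `_a` capabilities are totally ordered by number.
fn includes(target: Sm, other: Sm) -> bool {
    if other.arch_specific {
        // An `_a` capability has unique functionality; only itself includes it.
        target == other
    } else {
        // Both sm_N and sm_Na include every non `_a` capability up to N.
        target.number >= other.number
    }
}

fn main() {
    let sm = |s: &str| parse_sm(s).expect("valid SM name");
    println!("sm_90 in sm_100:   {}", includes(sm("sm_100"), sm("sm_90")));
    println!("sm_80 in sm_90a:   {}", includes(sm("sm_90a"), sm("sm_80")));
    println!("sm_90a in sm_100a: {}", includes(sm("sm_100a"), sm("sm_90a")));
}
```

Under this model the relation is a DAG rather than a chain: each _a capability hangs off its own base SM and is not implied by any later capability.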
Each PTX feature (e.g. using a particular instruction) requires a certain SM capability and certain PTX ISA version.
Approximately, the SM capability is a HW constraint, and PTX ISA is a Driver constraint; to use a feature both constraints must be satisfied. This table relates PTX ISA versions, sm capabilities, and driver versions. Each instruction documents which SM capability and PTX ISA version it requires.
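The "both constraints must be satisfied" rule can be written down as a short check. This is only a sketch: `Requirement`, `Config`, `can_use`, and `config_is_consistent` are hypothetical names, the requirement values in `main` are made up for illustration, and only the sm_90 / PTX 78 pairing comes from the discussion above. The _a variants are ignored here.

```rust
/// What one PTX feature (e.g. a particular instruction) needs.
struct Requirement {
    min_sm: u32,
    min_ptx: u32,
}

/// The selected SM capability and PTX ISA version for a build.
struct Config {
    sm: u32,
    ptx: u32,
}

/// Minimum PTX ISA version implied by an SM capability.
/// Only the sm_90 entry is taken from the discussion; the rest is left unfilled.
fn min_ptx_for_sm(sm: u32) -> Option<u32> {
    match sm {
        90 => Some(78),
        _ => None,
    }
}

/// Checks the SM / PTX pairing itself; `None` means this sketch has no data.
fn config_is_consistent(c: &Config) -> Option<bool> {
    min_ptx_for_sm(c.sm).map(|min| c.ptx >= min)
}

/// A feature is usable only if both the hardware (SM) constraint and the
/// driver (PTX ISA) constraint hold.
fn can_use(req: &Requirement, c: &Config) -> bool {
    c.sm >= req.min_sm && c.ptx >= req.min_ptx
}

fn main() {
    // Hypothetical requirement, not taken from the PTX documentation.
    let req = Requirement { min_sm: 80, min_ptx: 70 };
    let cfg = Config { sm: 90, ptx: 78 };
    println!("consistent: {:?}", config_is_consistent(&cfg));
    println!("usable: {}", can_use(&req, &cfg));
}
```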
I think it'd be good to have more testing (could be punted into an issue and done later):
I think it'd also be good to have more docs:
You can address this issue by modifying rust/compiler/rustc_codegen_llvm/src/llvm_util.rs Lines 280 to 285 in c4b38a5
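The referenced lines are not reproduced in this thread, so the following is not the actual rustc code. It is a standalone sketch of what gating on the LLVM version could look like, assuming the newer SM features first appear in LLVM 20 (as stated earlier in the thread). `to_llvm_feature`, `NEWER_SM`, and `MIN_LLVM_MAJOR` are hypothetical names.

```rust
/// SM features that the thread says are missing from older LLVM releases.
const NEWER_SM: &[&str] = &[
    "sm_100", "sm_100a", "sm_101", "sm_101a", "sm_120", "sm_120a",
];

/// Assumption taken from the thread: these features exist from LLVM 20 on.
const MIN_LLVM_MAJOR: u32 = 20;

/// Returns the feature name to pass on to LLVM, or `None` if the linked LLVM
/// is assumed to be too old to know the feature.
fn to_llvm_feature<'a>(name: &'a str, llvm_major: u32) -> Option<&'a str> {
    let is_newer = NEWER_SM.iter().any(|f| *f == name);
    if is_newer && llvm_major < MIN_LLVM_MAJOR {
        None
    } else {
        Some(name)
    }
}

fn main() {
    // With LLVM 18 the newer feature is filtered out; older ones pass through.
    println!("{:?}", to_llvm_feature("sm_100", 18));
    println!("{:?}", to_llvm_feature("sm_100", 20));
    println!("{:?}", to_llvm_feature("sm_90", 18));
}
```

Filtering at this point would keep the feature table itself unchanged while avoiding unknown-feature errors on builds that link an older LLVM.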
LLVM actually supports
Outputs of both --print target-cpus and --print target-features also show them:
$ rustc --target nvptx64-nvidia-cuda --print target-cpus
Available CPUs for this target:
sm_100
sm_100a
sm_101
sm_101a
sm_120
sm_120a
sm_20
sm_21
sm_30 - This is the default target CPU for the current build target (currently nvptx64-nvidia-cuda).
sm_32
sm_35
sm_37
sm_50
sm_52
sm_53
sm_60
sm_61
sm_62
sm_70
sm_72
sm_75
sm_80
sm_86
sm_87
sm_89
sm_90
sm_90a
$ rustc --target nvptx64-nvidia-cuda --print target-features
Features supported by rustc for this target:
crt-static - Enables C Run-time Libraries to be statically linked.
Code-generation features supported by LLVM for this target:
ptx32 - Use PTX version 32.
ptx40 - Use PTX version 40.
ptx41 - Use PTX version 41.
ptx42 - Use PTX version 42.
ptx43 - Use PTX version 43.
ptx50 - Use PTX version 50.
ptx60 - Use PTX version 60.
ptx61 - Use PTX version 61.
ptx62 - Use PTX version 62.
ptx63 - Use PTX version 63.
ptx64 - Use PTX version 64.
ptx65 - Use PTX version 65.
ptx70 - Use PTX version 70.
ptx71 - Use PTX version 71.
ptx72 - Use PTX version 72.
ptx73 - Use PTX version 73.
ptx74 - Use PTX version 74.
ptx75 - Use PTX version 75.
ptx76 - Use PTX version 76.
ptx77 - Use PTX version 77.
ptx78 - Use PTX version 78.
ptx80 - Use PTX version 80.
ptx81 - Use PTX version 81.
ptx82 - Use PTX version 82.
ptx83 - Use PTX version 83.
ptx84 - Use PTX version 84.
ptx85 - Use PTX version 85.
ptx86 - Use PTX version 86.
ptx87 - Use PTX version 87.
sm_100 - Target SM 100.
sm_100a - Target SM 100a.
sm_101 - Target SM 101.
sm_101a - Target SM 101a.
sm_120 - Target SM 120.
sm_120a - Target SM 120a.
sm_20 - Target SM 20.
sm_21 - Target SM 21.
sm_30 - Target SM 30.
sm_32 - Target SM 32.
sm_35 - Target SM 35.
sm_37 - Target SM 37.
sm_50 - Target SM 50.
sm_52 - Target SM 52.
sm_53 - Target SM 53.
sm_60 - Target SM 60.
sm_61 - Target SM 61.
sm_62 - Target SM 62.
sm_70 - Target SM 70.
sm_72 - Target SM 72.
sm_75 - Target SM 75.
sm_80 - Target SM 80.
sm_86 - Target SM 86.
sm_87 - Target SM 87.
sm_89 - Target SM 89.
sm_90 - Target SM 90.
sm_90a - Target SM 90a.
Use +feature to enable a feature, or -feature to disable it.
For example, rustc -C target-cpu=mycpu -C target-feature=+feature1,-feature2
Code-generation features cannot be used in cfg or #[target_feature],
and may be renamed or removed in a future version of LLVM or rustc.
Btw, is it intentional that the |
Thanks @taiki-e. I'll update accordingly. I'm seeing PTX 7.8 being written in an otherwise-default configuration with target |
Tracking issue: #44839 (catch-all arches)
The feature gate is #![feature(nvptx_target_feature)]
This exposes the target features sm_20 through sm_120a as defined by LLVM.
Cc: @gonzalobg
@rustbot label +O-NVPTX +A-target-feature
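As a rough illustration of how the exposed features might be consumed, the sketch below selects a code path with `cfg!(target_feature = ...)`. The function name `selected_path` is hypothetical. It is assumed that, on a nightly toolchain targeting nvptx64-nvidia-cuda, the crate root would also carry the feature gate and that the feature would be enabled with something like `-C target-cpu=sm_90`. On any other target both checks are simply false.

```rust
// Assumed to be needed at the crate root on nightly when targeting
// nvptx64-nvidia-cuda (left as a comment so the sketch builds anywhere):
// #![feature(nvptx_target_feature)]

/// Reports which branch is compiled in for the current target configuration.
#[allow(unexpected_cfgs)]
fn selected_path() -> &'static str {
    if cfg!(target_feature = "sm_90") {
        "sm_90 path"
    } else if cfg!(target_feature = "sm_70") {
        "sm_70 path"
    } else {
        // Taken on non-NVPTX hosts and when neither feature is enabled.
        "baseline path"
    }
}

fn main() {
    println!("{}", selected_path());
}
```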