Constructive/Destructive Interference Size Padding #298
Comments
We discussed this in today's @rust-lang/libs-api meeting. We felt that this should not be exclusively a library value, because that wouldn't allow using it in |
This has been mentioned in several of the linked discussions/resources but for completeness I also want to spell it out here: these numbers can't be defined precisely at compile time, since CPU models that are otherwise functionally identical can have rather different cache systems. They can only be provided on a best-effort basis, using knowledge of currently existing microarchitectures. Of course to be useful for alignment/padding, there has to be some compile-time answer and an overly conservative choice is fine. But to remain useful, the value has to be updated when new CPU models with larger "interference size" are released. The C++ equivalent of this feature can't be updated easily because changing the value is generally ABI-breaking. In particular, the value can't vary based on flags like |
Only if types use the Wouldn't it then be sufficient to mark such types as not FFI-safe? |
I'm not quite sure. It's definitely not the case that full type layout is always stored in the defining crate's metadata because that depends on generic parameters which are only determined (sometimes repeatedly) in downstream crates. For example, today one can define something like this:
#[repr(align(64))]
pub struct Align64<T>(pub T);
... and the actual layout of
If every “cache line padded” type's alignment is determined by the flags the defining crate was compiled with, that would avoid the worst problems. But it still has some surprising consequences:
It may still be simpler to tell people "if you want to increase this constant, that's a target modifier" and direct them to -Zbuild-std or its eventual stable equivalent. |
The unsafe code would somehow have to know that those separate definitions in separate crates are identical, including all their attributes, but somehow be unaware of the weird align value.
We have those kinds of downsides with all other optimization-things in std too. But yeah, build-std would be great for this.
Cache-padded structures are usually some internal runtime thing, they aren't portable, so they shouldn't get serialized or anything in the bytemuck territory. |
It has to be aware that the In any case, I don't want to litigate every possible side effect this may have. My main point is that a thing that's mostly thought of as a compile time constant, but is occasionally different across crates that are linked together, would be really weird. Especially if there's also an accompanying constant in the standard library for it, which necessarily has to pick one value that every crate sees. None of that is a fatal problem for this proposal, but either we say "this is actually a constant and not modified by compiler flags" (with the downsides already discussed in the C++ context) or we'll add a new entry to the "I did not know Rust was weird like that" trivia rotation. |
Proposal
Problem statement
Currently, the standard library does not provide a built-in mechanism to add hardware constructive/destructive interference size padding to a struct to avoid false-sharing or promote true-sharing of instances of the same struct. These can be important tools in performance optimizations and are currently implemented independently in multiple crates.
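To make the false-sharing concern concrete, here is a minimal sketch of what crates do by hand today. The names Padded, Unpadded, and Separated are hypothetical, and the 64-byte alignment is an assumed cache line size, not a value provided by the standard library or guaranteed for any particular CPU.

```rust
use std::sync::atomic::AtomicU64;

/// Hypothetical wrapper that forces its contents onto a 64-byte boundary.
/// 64 is an assumed cache line size chosen for illustration only.
#[repr(align(64))]
pub struct Padded<T>(pub T);

/// Two counters that will usually share one cache line, so threads
/// updating `a` and `b` independently may suffer from false sharing.
pub struct Unpadded {
    pub a: AtomicU64,
    pub b: AtomicU64,
}

/// The same two counters, each placed in its own 64-byte block.
pub struct Separated {
    pub a: Padded<AtomicU64>,
    pub b: Padded<AtomicU64>,
}

/// Distance in bytes between the two fields of a `Separated` value.
/// The compiler may reorder the fields, so the absolute difference is used.
pub fn separated_field_distance(s: &Separated) -> usize {
    let a = &s.a as *const _ as usize;
    let b = &s.b as *const _ as usize;
    a.abs_diff(b)
}
```

With this layout the padded struct grows from 16 to 128 bytes, which is the cost paid for keeping the two counters on separate (assumed) cache lines.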
Motivating examples or use cases
As documented in rust issue #117470, aligning a struct to the cache line size can provide significant performance benefits. Multiple crates, including mpmc::utils in rustc, crossbeam-utils, and regex, implement their own version of cache padding, which would most likely be improved by std offering a built-in implementation of constructive/destructive interference size padding.
Solution sketch
A solution could look like a hard-coded value table like the implementation in crossbeam-utils, offered in rust pr #117519, although this approach is likely to be inaccurate in some cases and would require continuous updating of the values.
Other approaches are offered by the C++ implementation, linked below. Additionally, LLVM offers built-in methods for getting target arch cache line sizes.
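As a rough illustration of the hard-coded table approach, the sketch below selects an alignment per target architecture with cfg_attr. It is loosely modeled on the pattern used by crossbeam-utils and is not the code from the linked PR. The names CachePaddedSketch and ASSUMED_DESTRUCTIVE_SIZE are hypothetical, and the two table entries are illustrative assumptions rather than authoritative values.

```rust
use std::ops::{Deref, DerefMut};

/// Assumed destructive interference size for the current target.
/// Both entries are illustrative guesses, not an authoritative table.
#[cfg(any(target_arch = "x86_64", target_arch = "aarch64"))]
pub const ASSUMED_DESTRUCTIVE_SIZE: usize = 128;
#[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
pub const ASSUMED_DESTRUCTIVE_SIZE: usize = 64;

/// Hypothetical cache-padded wrapper. `repr(align)` only accepts an integer
/// literal, so the attribute and the constant above are kept in sync by hand.
#[cfg_attr(any(target_arch = "x86_64", target_arch = "aarch64"), repr(align(128)))]
#[cfg_attr(not(any(target_arch = "x86_64", target_arch = "aarch64")), repr(align(64)))]
pub struct CachePaddedSketch<T> {
    value: T,
}

impl<T> CachePaddedSketch<T> {
    /// Wraps a value so that it starts on its own assumed cache line.
    pub const fn new(value: T) -> Self {
        Self { value }
    }

    /// Returns the wrapped value, discarding the padding.
    pub fn into_inner(self) -> T {
        self.value
    }
}

impl<T> Deref for CachePaddedSketch<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.value
    }
}

impl<T> DerefMut for CachePaddedSketch<T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.value
    }
}
```

Every new architecture, or a new CPU generation with a larger line size, needs another cfg arm here, which is the maintenance burden mentioned above.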
Alternatives
Links and related work
Zulip discussion:
https://rust-lang.zulipchat.com/#narrow/stream/327149-t-libs-api.2Fapi-changes/topic/adding.20CachePadded.20to.20std
Existing implementations:
crossbeam-utils: https://docs.rs/crossbeam/latest/crossbeam/utils/struct.CachePadded.html
mpmc-utils: https://doc.rust-lang.org/beta/src/std/sync/mpmc/utils.rs.html
This feature exists in C++ as std::hardware_destructive_interference_size and std::hardware_constructive_interference_size: https://en.cppreference.com/w/cpp/thread/hardware_destructive_interference_size . The original paper describing the feature offers a few possible implementations: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0154r1.html .
LLVM GetCacheLineSize: https://llvm.org/doxygen/classllvm_1_1MCSubtargetInfo.html#ac4be4ef1a969f0da1aa2da9aa5ccfe45
What happens now?
This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.
Possible responses
The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):
Second, if there's a concrete solution: