-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid no-op unlink+link dances in incr comp #128320
Conversation
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
Avoid no-op unlink+link dances in incr comp Incremental compilation scales quite poorly with the number of CGUs. This PR improves one reason for that. The incr comp process hard-links all the files from an old session into a new one, then it runs the backend, which may just hard-link the new session files into the output directory. Then codegen hard-links all the output files back to the new session directory. This PR (perhaps unimaginatively) fixes the silliness that ensues in the last step. The old `link_or_copy` implementation would be passed pairs of paths which are already the same inode, then it would blindly delete the destination and re-create the hard-link that it just deleted. This PR lets us skip both those operations. We don't skip the other two hard-links. `cargo +stage1 b && touch crates/core/main.rs && strace -cfw -elink,linkat,unlink,unlinkat cargo +stage1 b` before and then after on `ripgrep-13.0.0`: ``` % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 52.56 0.024950 25 978 485 unlink 34.38 0.016318 22 727 linkat 13.06 0.006200 24 249 unlinkat ------ ----------- ----------- --------- --------- ---------------- 100.00 0.047467 24 1954 485 total ``` ``` % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 42.83 0.014521 57 252 unlink 38.41 0.013021 26 486 linkat 18.77 0.006362 25 249 unlinkat ------ ----------- ----------- --------- --------- ---------------- 100.00 0.033904 34 987 total ``` r? `@ghost`
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
💔 Test failed - checks-actions |
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
Avoid no-op unlink+link dances in incr comp Incremental compilation scales quite poorly with the number of CGUs. This PR improves one reason for that. The incr comp process hard-links all the files from an old session into a new one, then it runs the backend, which may just hard-link the new session files into the output directory. Then codegen hard-links all the output files back to the new session directory. This PR (perhaps unimaginatively) fixes the silliness that ensues in the last step. The old `link_or_copy` implementation would be passed pairs of paths which are already the same inode, then it would blindly delete the destination and re-create the hard-link that it just deleted. This PR lets us skip both those operations. We don't skip the other two hard-links. `cargo +stage1 b && touch crates/core/main.rs && strace -cfw -elink,linkat,unlink,unlinkat cargo +stage1 b` before and then after on `ripgrep-13.0.0`: ``` % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 52.56 0.024950 25 978 485 unlink 34.38 0.016318 22 727 linkat 13.06 0.006200 24 249 unlinkat ------ ----------- ----------- --------- --------- ---------------- 100.00 0.047467 24 1954 485 total ``` ``` % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 42.83 0.014521 57 252 unlink 38.41 0.013021 26 486 linkat 18.77 0.006362 25 249 unlinkat ------ ----------- ----------- --------- --------- ---------------- 100.00 0.033904 34 987 total ``` r? `@ghost`
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (a712ff3): comparison URL. Overall result: no relevant changes - no action neededBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Instruction countThis benchmark run did not return any relevant results for this metric. Max RSS (memory usage)This benchmark run did not return any relevant results for this metric. CyclesResults (primary 1.0%, secondary -3.7%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 767.825s -> 768.215s (0.05%) |
26319ff
to
3d55891
Compare
3d55891
to
6f5792c
Compare
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
Some changes occurred in compiler/rustc_codegen_cranelift cc @bjorn3 |
r? compiler |
Another option would be comparing the dev/ino (or the windows equivalents) of the source and destination file and skip the linking if they're already the same. |
The implementation in this PR strictly reduces the amount we interact with the filesystem by using information that's already trivially available to the compiler. Apart from redesigning the incremental compilation CGU reuse system, I don't see any alternative to this optimization that can compete with it for efficiency. It might be good to also not re-hardlink files that are already hardlinked, but on Linux I'm not sure that is an improvement. |
Yeah I was thinking of the linked issues that were about hardlink slowness on macos and windows. |
6244f8d
to
cae7c76
Compare
Somebody drive-by reviewed this and made a suggestion that seems to work but now I don't even know how to respond to them. Please. Do not delete your own comments. |
@@ -487,6 +503,7 @@ fn reuse_workproduct_for_cgu( | |||
bytecode: None, | |||
assembly: None, | |||
llvm_ir: None, | |||
links_from_incr_cache: links_from_incr_cache.clone(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think both CompiledModule
should use a separate links_from_incr_cache
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh 🤦 I see now
1932ee5
to
05cf391
Compare
This comment has been minimized.
This comment has been minimized.
05cf391
to
e21502c
Compare
cg_clif changes LGTM |
@bors r+ |
☀️ Test successful - checks-actions |
Finished benchmarking commit (48b36c9): comparison URL. Overall result: no relevant changes - no action needed@rustbot label: -perf-regression Instruction countThis benchmark run did not return any relevant results for this metric. Max RSS (memory usage)Results (primary -7.5%, secondary 3.9%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResults (secondary 7.1%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 775.146s -> 775.347s (0.03%) |
Incremental compilation scales quite poorly with the number of CGUs. This PR improves one reason for that.
The incr comp process hard-links all the files from an old session into a new one, then it runs the backend, which may just hard-link the new session files into the output directory. Then codegen hard-links all the output files back to the new session directory.
This PR (perhaps unimaginatively) fixes the silliness that ensues in the last step. The old
link_or_copy
implementation would be passed pairs of paths which are already the same inode, then it would blindly delete the destination and re-create the hard-link that it just deleted. This PR lets us skip both those operations. We don't skip the other two hard-links.cargo +stage1 b && touch crates/core/main.rs && strace -cfw -elink,linkat,unlink,unlinkat cargo +stage1 b
before and then after onripgrep-13.0.0
:This reduces the number of hard-links that are causing perf troubles, noted in #64291 and #137560