Optimize hash map operations in the query system #115747

Open: Zoxc wants to merge 2 commits into master

Zoxc
Contributor

@Zoxc Zoxc commented Sep 11, 2023

This optimizes hash map operations in the query system by explicitly passing hashes and using more efficient operations. `find_or_find_insert_slot` in particular saves a hash table lookup compared to `entry`. It's not yet available in a safe API, but will be in rust-lang/hashbrown#466.
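
A minimal sketch of the idea, written against std only. It is a toy open-addressing table, not hashbrown's implementation or API: `ToyTable`, `Probe`, `hash_of`, and `get_or_insert_with` are hypothetical names for illustration, and only the method name `find_or_find_insert_slot` is borrowed from the description above. It shows the two points being made: the caller computes the hash once and passes it in, and a single probe sequence answers both "is the key present?" and "where would it go?".

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy fixed-capacity table with linear probing (no resizing, no deletion).
/// Assumption: this only illustrates the technique, it is not hashbrown.
struct ToyTable<K, V> {
    slots: Vec<Option<(u64, K, V)>>,
}

/// Result of one probe sequence.
enum Probe {
    /// Index of the slot that already holds the key.
    Found(usize),
    /// Index of the empty slot where the key could be inserted.
    Vacant(usize),
}

impl<K: Eq, V> ToyTable<K, V> {
    fn with_capacity(capacity: usize) -> Self {
        assert!(capacity.is_power_of_two());
        Self { slots: (0..capacity).map(|_| None).collect() }
    }

    /// Walks the probe sequence once. The hash is supplied by the caller,
    /// so the key is never re-hashed here.
    fn find_or_find_insert_slot(&self, hash: u64, key: &K) -> Probe {
        let mask = self.slots.len() - 1;
        let mut index = (hash as usize) & mask;
        for _ in 0..self.slots.len() {
            match &self.slots[index] {
                Some((h, k, _)) if *h == hash && k == key => return Probe::Found(index),
                Some(_) => index = (index + 1) & mask,
                None => return Probe::Vacant(index),
            }
        }
        panic!("toy table is full");
    }

    /// Returns the cached value, computing and storing it on a miss.
    /// A miss reuses the slot found by the lookup instead of probing again.
    fn get_or_insert_with(&mut self, hash: u64, key: K, make: impl FnOnce() -> V) -> &V {
        let index = match self.find_or_find_insert_slot(hash, &key) {
            Probe::Found(index) => index,
            Probe::Vacant(index) => {
                self.slots[index] = Some((hash, key, make()));
                index
            }
        };
        &self.slots[index].as_ref().unwrap().2
    }
}

/// Hashes a key once; the result can be reused for every table operation.
fn hash_of<K: Hash>(key: &K) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}
```

A caller would compute `let h = hash_of(&key);` once and then call `table.get_or_insert_with(h, key, || compute())`. A separate lookup followed by an insert would walk the probe sequence twice on a miss; the combined operation walks it once.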

| Benchmark | Before (time) | After (time) | Change (%) |
|---|---:|---:|---:|
| 🟣 clap:check | 1.6189s | 1.6129s | -0.37% |
| 🟣 hyper:check | 0.2353s | 0.2337s | -0.67% |
| 🟣 regex:check | 0.9344s | 0.9289s | -0.59% |
| 🟣 syn:check | 1.4693s | 1.4652s | -0.28% |
| 🟣 syntex_syntax:check | 5.6606s | 5.6439s | -0.30% |
| Total | 9.9185s | 9.8846s | -0.34% |
| Summary | 1.0000s | 0.9956s | -0.44% |

r? @cjgillot

@rustbot rustbot added A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 11, 2023
@rustbot
Collaborator

rustbot commented Sep 11, 2023

These commits modify the Cargo.lock file. Unintentional changes to Cargo.lock can be introduced when switching branches and rebasing PRs.

If this was unintentional then you should revert the changes before this PR is merged.
Otherwise, you can ignore this comment.

@cjgillot
Contributor

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 21, 2023
@bors
Contributor

bors commented Sep 21, 2023

⌛ Trying commit 936d8ad with merge d898dc6...

bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 21, 2023
Optimize hash map operations in the query system

@bors
Contributor

bors commented Sep 21, 2023

☀️ Try build successful - checks-actions
Build commit: d898dc6 (d898dc6b90c747dee52de63fccc7f77f86171136)

@rust-timer
Collaborator

Finished benchmarking commit (d898dc6): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---:|---:|---:|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 0.8% | [0.7%, 0.9%] | 2 |
| Improvements ✅ (primary) | -0.3% | [-0.5%, -0.3%] | 14 |
| Improvements ✅ (secondary) | -0.5% | [-0.9%, -0.2%] | 25 |
| All ❌✅ (primary) | -0.3% | [-0.5%, -0.3%] | 14 |

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---:|---:|---:|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 3.2% | [3.1%, 3.5%] | 4 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 630.615s -> 639.436s (1.40%)
Artifact size: 317.63 MiB -> 317.09 MiB (-0.17%)
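
The percentages on the two summary lines above appear to be plain relative changes, (after - before) / before, so a positive value means slower or larger. A small sketch to check that reading against the reported numbers; `percent_change` and `format_change` are hypothetical helper names and are not taken from rust-timer.

```rust
/// Relative change from `before` to `after`, in percent.
/// Positive means the value grew (slower bootstrap, larger artifacts).
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

/// Formats the change with two decimals, matching the style of the summary lines.
fn format_change(before: f64, after: f64) -> String {
    format!("{:.2}%", percent_change(before, after))
}
```

For example, `format_change(630.615, 639.436)` reproduces the bootstrap figure and `format_change(317.63, 317.09)` the artifact-size figure.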

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 21, 2023
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 23, 2023
[TEST] Optimize hash map operations in the query system

This is a variant of rust-lang#115747 without using the `hashbrown` crate to see if that changes the bootstrap impact.

r? `@cjgillot`
@Zoxc
Contributor Author

Zoxc commented Sep 24, 2023

This could use another perf run to see if using the sysroot hashbrown crate helps with bootstrap times.

@lqd
Member

lqd commented Sep 25, 2023

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 25, 2023
@bors
Contributor

bors commented Sep 25, 2023

⌛ Trying commit 54d9510 with merge 2a31958...

bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 25, 2023
Optimize hash map operations in the query system

@rust-timer
Collaborator

Finished benchmarking commit (9652a29): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

| | mean | range | count |
|---|---:|---:|---:|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 0.2% | [0.2%, 0.2%] | 1 |
| Improvements ✅ (primary) | -0.2% | [-0.3%, -0.1%] | 35 |
| Improvements ✅ (secondary) | -0.2% | [-0.4%, -0.1%] | 46 |
| All ❌✅ (primary) | -0.2% | [-0.3%, -0.1%] | 35 |

Max RSS (memory usage)

Results (primary -1.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---:|---:|---:|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -1.2% | [-1.2%, -1.2%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -1.2% | [-1.2%, -1.2%] | 1 |

Cycles

Results (primary -1.4%, secondary -3.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---:|---:|---:|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -1.4% | [-2.1%, -1.0%] | 8 |
| Improvements ✅ (secondary) | -3.0% | [-3.9%, -1.9%] | 4 |
| All ❌✅ (primary) | -1.4% | [-2.1%, -1.0%] | 8 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 773.717s -> 774.106s (0.05%)
Artifact size: 365.52 MiB -> 365.55 MiB (0.01%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 19, 2025
@Zoxc
Contributor Author

Zoxc commented Mar 19, 2025

Perf looks good and there's no longer a bootstrap regression.

@oli-obk
Contributor

oli-obk commented Mar 20, 2025

@bors r+

@bors
Contributor

bors commented Mar 20, 2025

📌 Commit ebaedae has been approved by oli-obk

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 20, 2025
Comment on lines 32 to 36
[dependencies.hashbrown]
version = "0.15.2"
default-features = false
features = ["nightly"] # for may_dangle

Contributor

Why remove this? As I understand it, the code now relies on whatever hashbrown version and features happen to be pulled in, so the measured improvements may come from that version/feature combination rather than from the actual code changes, which makes perf fragile.

Contributor

$ cargo tree -p hashbrown@0.15 -e features -i --depth 2
hashbrown v0.15.2
`-- indexmap v2.7.0
    |-- object v0.36.7
    |-- wasmparser v0.219.1
    |-- wasmparser v0.223.0
    `-- wit-component v0.223.0
    |-- indexmap feature "default"
    |-- indexmap feature "serde"
    `-- indexmap feature "std"
|-- hashbrown feature "default-hasher"
|   |-- object v0.36.7 (*)
|   `-- wasmparser v0.223.0 (*)
|-- hashbrown feature "nightly"
|   `-- rustc_data_structures v0.0.0
`-- hashbrown feature "serde"
    `-- wasmparser feature "serde"

This was the only user of the `nightly` feature. For comparison, version 0.14 pulls in a different set of features:

$ cargo tree -p hashbrown@0.14 -e features -i --depth 2
hashbrown v0.14.5
|-- hashbrown feature "ahash"
|   `-- hashbrown feature "default"
|-- hashbrown feature "allocator-api2"
|   `-- hashbrown feature "default" (*)
|-- hashbrown feature "default" (*)
`-- hashbrown feature "inline-more"
    `-- hashbrown feature "default" (*)

Contributor Author

It ensures the HashMap and HashTable share the same code instances, which could help with compile time. It also avoids compiling an extra hashbrown copy. Not sure how it makes perf fragile?

Contributor

@klensy klensy Mar 20, 2025

> It ensures the HashMap and HashTable share the same code instances, which could help with compile time.

Ok.

> It also avoids compiling an extra hashbrown copy.

Isn't hashbrown 0.15 still pulled in via other crates?

> Not sure how it makes perf fragile?

This relies on an unspecified hashbrown version: updating some other dependency could in the future pull in a different hashbrown and affect this code without anyone noticing. Can you put back whichever hashbrown configuration works now, so that the exact config is tracked (yes, there is feature merging, but still)?

Contributor Author

We could do a new perf run with the crates version.

Contributor

It would be nice to just add `inline-more` to hashbrown 0.15 and look at the results independently of this PR. For example, when the next version of https://github.com/rust-lang/thorin is released, it will remove the remaining hashbrown 0.14 and merge the `inline-more` feature with the other features of 0.15.

@Zoxc
Contributor Author

Zoxc commented Mar 20, 2025

@bors r-

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Mar 20, 2025
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 21, 2025
Try using std's hashbrown copy

Just testing if this affects performance. It may relate to rust-lang#138708, rust-lang#137701 and rust-lang#115747.
@oli-obk
Contributor

oli-obk commented Mar 21, 2025

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 21, 2025
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 21, 2025
Optimize hash map operations in the query system

@bors
Contributor

bors commented Mar 21, 2025

⌛ Trying commit 93bfe39 with merge f0ac6ea...

@bors
Contributor

bors commented Mar 21, 2025

☀️ Try build successful - checks-actions
Build commit: f0ac6ea (f0ac6ea50f6a5303bd2a0a3857fe0b901ecb33d6)

@rust-timer
Collaborator

Finished benchmarking commit (f0ac6ea): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

| | mean | range | count |
|---|---:|---:|---:|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.2% | [-0.2%, -0.1%] | 14 |
| Improvements ✅ (secondary) | -0.2% | [-0.4%, -0.1%] | 33 |
| All ❌✅ (primary) | -0.2% | [-0.2%, -0.1%] | 14 |

Max RSS (memory usage)

Results (primary 2.5%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---:|---:|---:|
| Regressions ❌ (primary) | 2.5% | [2.5%, 2.5%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 2.5% | [2.5%, 2.5%] | 1 |

Cycles

Results (secondary 3.9%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---:|---:|---:|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 3.9% | [3.7%, 4.0%] | 3 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 774.966s -> 774.408s (-0.07%)
Artifact size: 365.53 MiB -> 365.60 MiB (0.02%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 21, 2025
@Zoxc
Contributor Author

Zoxc commented Mar 21, 2025

The cycle / wall-times are worse, but that may just be noise. We can probably just stick with the crates version.

@Zoxc
Contributor Author

Zoxc commented Mar 22, 2025

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 22, 2025
Labels
A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.