Optimize hash map operations in the query system #115747

Zoxc · 2023-09-11T06:21:16Z

This optimizes hash map operations in the query system by explicitly passing hashes and using more efficient operations. `find_or_find_insert_slot` in particular saves a hash table lookup compared to `entry`. It's not yet available in a safe API, but will be in rust-lang/hashbrown#466.
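As a rough illustration of the idea, here is a minimal sketch in plain std Rust. It is not hashbrown's implementation and not the rustc query code: `ToyTable`, `Probe`, `hash_of`, `get_or_insert_with` and `lookup_twice` are hypothetical names, the method named `find_or_find_insert_slot` only borrows its name from the description above (its signature here is invented), and the table is a fixed-capacity linear-probing toy. It only shows how hashing the key once, then doing a single probe that yields either the existing entry or the slot to insert into, avoids a second lookup.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy open-addressing table: linear probing, fixed capacity, no deletion.
struct ToyTable<K, V> {
    slots: Vec<Option<(u64, K, V)>>,
    probes: usize, // number of slots inspected so far, for illustration only
}

/// Outcome of one probe sequence: the key's slot, or the empty slot where it would go.
enum Probe {
    Found(usize),
    Vacant(usize),
}

/// Hash a key once so the caller can pass the hash explicitly.
fn hash_of<K: Hash>(key: &K) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

impl<K: Eq, V> ToyTable<K, V> {
    fn with_capacity(cap: usize) -> Self {
        ToyTable { slots: (0..cap).map(|_| None).collect(), probes: 0 }
    }

    /// A single walk answers both "is the key present?" and "where would it be inserted?".
    fn find_or_find_insert_slot(&mut self, hash: u64, key: &K) -> Probe {
        let cap = self.slots.len();
        let start = (hash % cap as u64) as usize;
        for i in 0..cap {
            let idx = (start + i) % cap;
            self.probes += 1;
            match &self.slots[idx] {
                Some((h, k, _)) if *h == hash && k == key => return Probe::Found(idx),
                Some(_) => continue, // occupied by another key, keep probing
                None => return Probe::Vacant(idx),
            }
        }
        panic!("toy table is full");
    }

    /// Uses the caller-supplied hash and the slot found above; no second lookup on a miss.
    fn get_or_insert_with(&mut self, hash: u64, key: K, make: impl FnOnce() -> V) -> &V {
        let idx = match self.find_or_find_insert_slot(hash, &key) {
            Probe::Found(idx) => idx,
            Probe::Vacant(idx) => {
                self.slots[idx] = Some((hash, key, make()));
                idx
            }
        };
        &self.slots[idx].as_ref().unwrap().2
    }
}

/// Looks up the same key twice; returns (first value, second value, slots inspected).
fn lookup_twice() -> (usize, usize, usize) {
    let mut table: ToyTable<&str, usize> = ToyTable::with_capacity(8);
    let key = "type_of";
    let hash = hash_of(&key); // hashed once, reused for both calls
    let first = *table.get_or_insert_with(hash, key, || 1); // miss: inserts 1
    let second = *table.get_or_insert_with(hash, key, || 2); // hit: closure is not run
    (first, second, table.probes)
}
```

In this toy, `lookup_twice()` returns `(1, 1, 2)`: the miss and the hit each inspect exactly one slot, and the key is hashed only once.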

Benchmark	Before (time)	After (time)	Change (%)
🟣 clap:check	1.6189s	1.6129s	-0.37%
🟣 hyper:check	0.2353s	0.2337s	-0.67%
🟣 regex:check	0.9344s	0.9289s	-0.59%
🟣 syn:check	1.4693s	1.4652s	-0.28%
🟣 syntex_syntax:check	5.6606s	5.6439s	-0.30%
Total	9.9185s	9.8846s	-0.34%
Summary	1.0000s	0.9956s	-0.44%
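The percentages in the table above can be reproduced with a short calculation. The sketch below is only an illustration: `percent_change`, `mean_ratio` and `ROWS` are hypothetical names, and reading the Summary row as the arithmetic mean of the per-benchmark After/Before ratios is an inference that happens to match the numbers, not something stated in this thread.

```rust
/// Relative change from `before` to `after`, in percent.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

/// Arithmetic mean of the per-benchmark `after / before` ratios.
/// Assumption: this is how the Summary row is derived.
fn mean_ratio(rows: &[(f64, f64)]) -> f64 {
    rows.iter().map(|(before, after)| after / before).sum::<f64>() / rows.len() as f64
}

/// (before, after) check times in seconds, copied from the table.
const ROWS: [(f64, f64); 5] = [
    (1.6189, 1.6129), // clap:check
    (0.2353, 0.2337), // hyper:check
    (0.9344, 0.9289), // regex:check
    (1.4693, 1.4652), // syn:check
    (5.6606, 5.6439), // syntex_syntax:check
];

/// Sums of the before and after columns, for the Total row.
fn totals(rows: &[(f64, f64)]) -> (f64, f64) {
    rows.iter().fold((0.0, 0.0), |(b, a), (before, after)| (b + before, a + after))
}
```

With these inputs, `percent_change(1.6189, 1.6129)` rounds to -0.37, the change between the two totals rounds to -0.34, and `mean_ratio(&ROWS)` rounds to 0.9956, in line with the table.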

r? @cjgillot

rustbot · 2023-09-11T06:21:26Z

These commits modify the Cargo.lock file. Unintentional changes to Cargo.lock can be introduced when switching branches and rebasing PRs.

If this was unintentional then you should revert the changes before this PR is merged.
Otherwise, you can ignore this comment.

cjgillot · 2023-09-21T17:03:12Z

@bors try @rust-timer queue

bors · 2023-09-21T17:03:23Z

⌛ Trying commit 936d8ad with merge d898dc6...

bors · 2023-09-21T18:15:03Z

☀️ Try build successful - checks-actions
Build commit: d898dc6 (d898dc6b90c747dee52de63fccc7f77f86171136)

rust-timer · 2023-09-21T19:38:13Z

Finished benchmarking commit (d898dc6): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.8%	[0.7%, 0.9%]	2
Improvements ✅ (primary)	-0.3%	[-0.5%, -0.3%]	14
Improvements ✅ (secondary)	-0.5%	[-0.9%, -0.2%]	25
All ❌✅ (primary)	-0.3%	[-0.5%, -0.3%]	14

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	3.2%	[3.1%, 3.5%]	4
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 630.615s -> 639.436s (1.40%)
Artifact size: 317.63 MiB -> 317.09 MiB (-0.17%)

[TEST] Optimize hash map operations in the query system

This is a variant of rust-lang#115747 without using the `hashbrown` crate, to see if that changes the bootstrap impact. r? `@cjgillot`

Zoxc · 2023-09-24T23:03:24Z

This could use another perf run to see if using the sysroot hashbrown crate helps with bootstrap times.

lqd · 2023-09-25T05:44:01Z

@bors try @rust-timer queue

bors · 2023-09-25T05:45:10Z

⌛ Trying commit 54d9510 with merge 2a31958...

rust-timer · 2025-03-19T23:26:04Z

Finished benchmarking commit (9652a29): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.2%	[0.2%, 0.2%]	1
Improvements ✅ (primary)	-0.2%	[-0.3%, -0.1%]	35
Improvements ✅ (secondary)	-0.2%	[-0.4%, -0.1%]	46
All ❌✅ (primary)	-0.2%	[-0.3%, -0.1%]	35

Max RSS (memory usage)

Results (primary -1.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.2%	[-1.2%, -1.2%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-1.2%	[-1.2%, -1.2%]	1

Cycles

Results (primary -1.4%, secondary -3.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.4%	[-2.1%, -1.0%]	8
Improvements ✅ (secondary)	-3.0%	[-3.9%, -1.9%]	4
All ❌✅ (primary)	-1.4%	[-2.1%, -1.0%]	8

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 773.717s -> 774.106s (0.05%)
Artifact size: 365.52 MiB -> 365.55 MiB (0.01%)

Zoxc · 2025-03-19T23:52:15Z

Perf looks good and there's no longer a bootstrap regression.

oli-obk · 2025-03-20T15:08:26Z

@bors r+

bors · 2025-03-20T15:08:29Z

📌 Commit ebaedae has been approved by oli-obk

It is now in the queue for this repository.

klensy · 2025-03-20T16:16:53Z

compiler/rustc_data_structures/Cargo.toml

-[dependencies.hashbrown]
-version = "0.15.2"
-default-features = false
-features = ["nightly"] # for may_dangle
-


Why remove this? As I understand it, the code now relies on whatever hashbrown version/features happen to be pulled in, so the measured improvements may come from that version/feature combination rather than from the actual code changes, which makes perf fragile.

$ cargo tree -p hashbrown@0.15 -e features -i --depth 2
hashbrown v0.15.2
`-- indexmap v2.7.0
|-- object v0.36.7
|-- wasmparser v0.219.1
|-- wasmparser v0.223.0
`-- wit-component v0.223.0
|-- indexmap feature "default"
|-- indexmap feature "serde"
`-- indexmap feature "std"
|-- hashbrown feature "default-hasher"
| |-- object v0.36.7 (*)
| `-- wasmparser v0.223.0 (*)
|-- hashbrown feature "nightly"
| `-- rustc_data_structures v0.0.0
`-- hashbrown feature "serde"
`-- wasmparser feature "serde"

This was the only user of the nightly feature. For comparison, version 0.14 pulls in a different set of features:

$ cargo tree -p hashbrown@0.14 -e features -i --depth 2
hashbrown v0.14.5
|-- hashbrown feature "ahash"
| `-- hashbrown feature "default"
|-- hashbrown feature "allocator-api2"
| `-- hashbrown feature "default" (*)
|-- hashbrown feature "default" (*)
`-- hashbrown feature "inline-more"
`-- hashbrown feature "default" (*)

It ensures that HashMap and HashTable share the same code instances, which could help with compile time. It also avoids compiling an extra hashbrown copy. Not sure how it makes perf fragile?

> It ensures that HashMap and HashTable share the same code instances, which could help with compile time.

Ok.

> It also avoids compiling an extra hashbrown copy.

Isn't hashbrown 0.15 still pulled in via other crates?

> Not sure how it makes perf fragile?

This relies on an unspecified hashbrown: updating some other dependency could in the future pull in a different hashbrown and affect this code without anyone noticing. Can you put back whatever hashbrown configuration works now, so that the exact config is tracked (yes, there is feature merging, but still)?

We could do a new perf run with the crates version.

It would be nice to just add inline-more to hashbrown 0.15 and see the results independently of this PR. For example, when the next version of https://github.com/rust-lang/thorin is released, it will remove the remaining hashbrown 0.14 and merge the inline-more feature with the other features of 0.15.

Posted rust-lang/thorin#39

Zoxc · 2025-03-20T17:15:33Z

@bors r-

Try using std's hashbrown copy

Just testing if this affects performance. It may relate to rust-lang#138708, rust-lang#137701 and rust-lang#115747.

oli-obk · 2025-03-21T08:05:30Z

@bors try @rust-timer queue

bors · 2025-03-21T08:07:51Z

⌛ Trying commit 93bfe39 with merge f0ac6ea...

bors · 2025-03-21T10:13:14Z

☀️ Try build successful - checks-actions
Build commit: f0ac6ea (f0ac6ea50f6a5303bd2a0a3857fe0b901ecb33d6)

rust-timer · 2025-03-21T11:54:24Z

Finished benchmarking commit (f0ac6ea): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.2%	[-0.2%, -0.1%]	14
Improvements ✅ (secondary)	-0.2%	[-0.4%, -0.1%]	33
All ❌✅ (primary)	-0.2%	[-0.2%, -0.1%]	14

Max RSS (memory usage)

Results (primary 2.5%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.5%	[2.5%, 2.5%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	2.5%	[2.5%, 2.5%]	1

Cycles

Results (secondary 3.9%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	3.9%	[3.7%, 4.0%]	3
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 774.966s -> 774.408s (-0.07%)
Artifact size: 365.53 MiB -> 365.60 MiB (0.02%)

Zoxc · 2025-03-21T16:29:29Z

The cycle / wall-times are worse, but that may just be noise. We can probably just stick with the crates version.

Zoxc · 2025-03-22T23:13:49Z

@rustbot ready

rustbot assigned cjgillot Sep 11, 2023

rustbot added A-query-system S-waiting-on-review T-compiler labels Sep 11, 2023

rustbot added the S-waiting-on-perf label Sep 21, 2023

rustbot removed the S-waiting-on-perf label Sep 21, 2023

Zoxc force-pushed the query-hashes branch from 936d8ad to a5c8caa Compare September 22, 2023 20:26

Zoxc mentioned this pull request Sep 23, 2023

[TEST] Optimize hash map operations in the query system #116105

Closed

Zoxc force-pushed the query-hashes branch from a5c8caa to b388e29 Compare September 24, 2023 20:50

Zoxc force-pushed the query-hashes branch from b388e29 to bda5318 Compare September 24, 2023 21:02

Zoxc force-pushed the query-hashes branch from bda5318 to 45c8171 Compare September 24, 2023 21:08

Zoxc force-pushed the query-hashes branch from 45c8171 to 54d9510 Compare September 24, 2023 21:26

rustbot added the S-waiting-on-perf label Sep 25, 2023

rustbot removed the S-waiting-on-perf label Mar 19, 2025

bors added S-waiting-on-bors and removed S-waiting-on-author labels Mar 20, 2025

klensy reviewed Mar 20, 2025

bors added S-waiting-on-author and removed S-waiting-on-bors labels Mar 20, 2025

Zoxc mentioned this pull request Mar 21, 2025

Try using std's hashbrown copy #138769

Closed

Zoxc added 2 commits March 21, 2025 07:51

Optimize hash map operations in the query system

fcd3349

Use hashbrown from crates.io

93bfe39

Zoxc force-pushed the query-hashes branch from ebaedae to 93bfe39 Compare March 21, 2025 06:56

rustbot added the S-waiting-on-perf label Mar 21, 2025

rustbot removed the S-waiting-on-perf label Mar 21, 2025

rustbot added S-waiting-on-review and removed S-waiting-on-author labels Mar 22, 2025
