Parallel ledger hashing #16053

Open
georgeee opened this issue Sep 16, 2024 · 1 comment

@georgeee
Member

PR #15980 introduces a neat trick: ledger hashes for a mask (i.e. a block) are computed in a single function call, with hashing executed layer by layer.
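To illustrate why layer-by-layer hashing lends itself to batching, here is a minimal sketch. It is not the code from PR #15980: `toy_merge`, `hash_layer` and `merkle_root` are hypothetical names, the tree is a plain complete binary tree rather than a mask, and the 2-to-1 hash is a non-cryptographic stand-in for the Poseidon merge hash. The point is only that every hash within one layer depends on the layer below and on nothing else in its own layer.

```rust
// Toy 2-to-1 hash standing in for the Poseidon merge hash (NOT cryptographic).
fn toy_merge(left: u64, right: u64) -> u64 {
    left.rotate_left(5) ^ right.wrapping_mul(0x9E37_79B9_7F4A_7C15)
}

// Compute the next layer up. Each parent depends only on its two children,
// so all hashes produced here are independent of one another.
fn hash_layer(children: &[u64]) -> Vec<u64> {
    children
        .chunks(2)
        .map(|pair| toy_merge(pair[0], pair[1]))
        .collect()
}

// Reduce a power-of-two number of leaf hashes to the root, one layer per step.
fn merkle_root(leaves: &[u64]) -> u64 {
    assert!(leaves.len().is_power_of_two());
    let mut layer = leaves.to_vec();
    while layer.len() > 1 {
        layer = hash_layer(&layer);
    }
    layer[0]
}
```

Calling `merkle_root(&[1, 2, 3, 4])` performs two calls to `hash_layer`; each of those calls is a natural unit to hand to a batch hash function.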

We could go one step further and come up with a batch version of the hash function.

At the moment, we rely on block_cipher exposed from the Rust Poseidon implementation, while the sponge construction as a whole is implemented in snarky. What could be done is the following:

  1. Start using Rust's implementation of the Poseidon hash (including the sponge construction) // maybe padding/block splitting could be left in OCaml for convenience
  2. Add a new function to the Rust interface that computes a batch of hashes (instead of an individual hash)
  3. Use rayon's par_iter in the implementation of the batch hash function to utilize multicore capabilities
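The shape of such a batch function could look roughly like the sketch below. Everything in it is an assumption made for illustration: `hash_batch` and `toy_hash` are hypothetical names, the hash is a non-cryptographic stand-in for Poseidon, and scoped threads from the standard library are used in place of rayon so that the example has no external dependencies.

```rust
use std::thread;

// Toy stand-in for a full Poseidon hash of one input (NOT cryptographic).
fn toy_hash(input: &[u64]) -> u64 {
    input.iter().fold(0xCBF2_9CE4_8422_2325u64, |acc, &x| {
        (acc ^ x).wrapping_mul(0x0000_0100_0000_01B3)
    })
}

// Hypothetical batch entry point: hashes every input and returns the results
// in input order. The batch is split into contiguous chunks, one per thread.
fn hash_batch(inputs: &[Vec<u64>], workers: usize) -> Vec<u64> {
    if inputs.is_empty() {
        return Vec::new();
    }
    let workers = workers.max(1);
    // Ceiling division, so that at most `workers` chunks are produced.
    let chunk_len = (inputs.len() + workers - 1) / workers;
    let mut out = vec![0u64; inputs.len()];
    thread::scope(|s| {
        // Pair each input chunk with the output chunk at the same offset.
        for (ins, outs) in inputs.chunks(chunk_len).zip(out.chunks_mut(chunk_len)) {
            s.spawn(move || {
                for (input, slot) in ins.iter().zip(outs.iter_mut()) {
                    *slot = toy_hash(input);
                }
            });
        }
    });
    // All threads have been joined once `thread::scope` returns.
    out
}
```

With rayon, the manual chunking and spawning would presumably collapse to something like `inputs.par_iter().map(|i| toy_hash(i)).collect()`, with the work-stealing pool choosing the split. Either way the result must match the sequential one element for element, which is the property worth testing at the FFI boundary.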

Motivation

At this stage (after closing #14752, to be precise), the cost of staged ledger diff application is dominated by computing various hashes:

  • Hash of an account (bottom layer of ledger)
  • Merge hash (non-bottom layers of ledger)
  • Receipt chain hash
  • Account's actions state hash

The overwhelming majority of the cost (>80%) of processing a max-account-update block comes from merge and account hashes. As measured on a server, it is around 1.2s for a max-account-update block, and it uses a single thread. When executed on a 12-core server, it could potentially be reduced to ~300ms (napkin math™).
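One way to reproduce a figure of this order is Amdahl's law, under two assumptions that the measurement above does not state explicitly: that 1.2 s is the total single-threaded time $T_1$, and that exactly $p = 0.8$ of it (the lower bound of the hashing share) parallelizes perfectly across $n = 12$ cores.

```math
T_n = T_1\left((1 - p) + \frac{p}{n}\right)
    = 1.2\,\text{s} \times \left(0.2 + \frac{0.8}{12}\right)
    \approx 1.2\,\text{s} \times 0.267
    \approx 0.32\,\text{s}
```

This ignores thread scheduling and FFI overhead, so it is an optimistic estimate rather than a prediction.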

georgeee self-assigned this Sep 16, 2024
georgeee changed the title from "Parallelize ledger hashing" to "Parallel ledger hashing" Sep 16, 2024
@georgeee
Member Author

Additional motivation to work on this item: it could significantly speed up catchup.

Assuming network communication is perfect (and no supercatchup bugs are triggered), the bottom line for catchup is building breadcrumbs, which comes down almost entirely to hashing as part of staged ledger diff application. The same holds for loading a ledger from persistence (although in this case we could store mask hashes in persistence to avoid the need for re-hashing).

By another estimation (napkin math™), we could improve catchup's bottom line by 50% on servers with >= 4 CPU cores.
