Resume one waiter at once in deadlock handler #137731

Merged (1 commit) on Mar 5, 2025
17 changes: 15 additions & 2 deletions compiler/rustc_query_system/src/query/job.rs
@@ -477,8 +477,8 @@ fn remove_cycle(
 /// Detects query cycles by using depth first search over all active query jobs.
 /// If a query cycle is found it will break the cycle by finding an edge which
 /// uses a query latch and then resuming that waiter.
-/// There may be multiple cycles involved in a deadlock, so this searches
-/// all active queries for cycles before finally resuming all the waiters at once.
+/// There may be multiple cycles involved in a deadlock, but we only search
+/// one cycle at a call and resume one waiter at once. See `FIXME` below.
 pub fn break_query_cycles(query_map: QueryMap, registry: &rayon_core::Registry) {
     let mut wakelist = Vec::new();
     let mut jobs: Vec<QueryJobId> = query_map.keys().cloned().collect();
@@ -488,6 +488,19 @@ pub fn break_query_cycles(query_map: QueryMap, registry: &rayon_core::Registry)
     while jobs.len() > 0 {
         if remove_cycle(&query_map, &mut jobs, &mut wakelist) {
             found_cycle = true;
Contributor


I'd try to put a `break` below `found_cycle = true;`, with a FIXME to remove it later, once the real cause of the deadlocks is fixed. All other changes in the PR could be removed.

Member Author


Ah, that sounds good. So `remove_cycle` doesn't always return true when there are deadlocks.

Member Author


Thanks! Done.


+            // FIXME(#137731): Resuming all the waiters at once may cause deadlocks,
+            // so we resume one waiter per call for now. It's still unclear whether
+            // this is due to possible issues in rustc-rayon or in the handling
+            // of query cycles.
+            // This seems to appear only when multiple query cycle errors
+            // are involved, so this reduction in parallelism, while suboptimal, is not
+            // universal and only the deadlock handler will encounter these cases.
+            // The workaround gives up some potential gains, but there are still big
+            // improvements in the common case, and no regressions compared to the
+            // single-threaded case. More investigation is still needed, and once fixed,
+            // we can go back to waking up all the waiters at once.
+            break;
         }
     }
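
For readers skimming the change, here is a minimal, self-contained sketch of the control-flow difference the added `break` makes. It is not the rustc code: `remove_cycle_stub` and `break_cycles_sketch` are hypothetical names, jobs are plain `u32` ids, and the stub simply treats even ids as "part of a cycle" so that it sometimes returns `false`, much as the real `remove_cycle` does not always return `true`.

```rust
/// Hypothetical stand-in for `remove_cycle`: takes one job off the list and,
/// if it counts as part of a cycle (here: an even id), queues it for wake-up.
fn remove_cycle_stub(jobs: &mut Vec<u32>, wakelist: &mut Vec<u32>) -> bool {
    match jobs.pop() {
        Some(job) if job % 2 == 0 => {
            wakelist.push(job);
            true
        }
        _ => false,
    }
}

/// Returns the waiters that would be resumed by one call of the handler.
/// With `one_at_a_time` set, the loop stops after the first cycle it finds,
/// which mirrors the `break` added after `found_cycle = true;`.
fn break_cycles_sketch(mut jobs: Vec<u32>, one_at_a_time: bool) -> Vec<u32> {
    let mut wakelist = Vec::new();
    while !jobs.is_empty() {
        if remove_cycle_stub(&mut jobs, &mut wakelist) && one_at_a_time {
            break;
        }
    }
    wakelist
}

fn main() {
    let jobs = vec![1, 2, 3, 4];
    // Old behavior: keep searching and collect every waiter before resuming.
    println!("all at once: {:?}", break_cycles_sketch(jobs.clone(), false));
    // New behavior: stop at the first cycle, so one waiter is resumed per call.
    println!("one per call: {:?}", break_cycles_sketch(jobs, true));
}
```

In this sketch any remaining cycles are simply left for a later call, which is the trade-off the FIXME describes: less work per invocation of the deadlock handler in exchange for avoiding the deadlocks seen when several waiters are resumed together.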
