std: use a queue-based Condvar on NetBSD and other platforms #127578

Open · wants to merge 1 commit into base: master
2 changes: 0 additions & 2 deletions library/std/src/sys/pal/windows/c.rs
@@ -52,8 +52,6 @@ pub const INVALID_HANDLE_VALUE: HANDLE = ::core::ptr::without_provenance_mut(-1i
pub const EXIT_SUCCESS: u32 = 0;
pub const EXIT_FAILURE: u32 = 1;

#[cfg(target_vendor = "win7")]
pub const CONDITION_VARIABLE_INIT: CONDITION_VARIABLE = CONDITION_VARIABLE { Ptr: ptr::null_mut() };
#[cfg(target_vendor = "win7")]
pub const SRWLOCK_INIT: SRWLOCK = SRWLOCK { Ptr: ptr::null_mut() };
#[cfg(not(target_thread_local))]
21 changes: 9 additions & 12 deletions library/std/src/sys/sync/condvar/mod.rs
@@ -12,24 +12,21 @@ cfg_if::cfg_if! {
))] {
mod futex;
pub use futex::Condvar;
} else if #[cfg(any(
all(target_os = "windows", target_vendor = "win7"),
target_os = "netbsd",
all(target_vendor = "fortanix", target_env = "sgx"),
target_os = "teeos",
target_os = "xous",
))] {
mod queue;
pub use queue::Condvar;
} else if #[cfg(target_family = "unix")] {
mod pthread;
pub use pthread::Condvar;
} else if #[cfg(all(target_os = "windows", target_vendor = "win7"))] {
mod windows7;
pub use windows7::Condvar;
} else if #[cfg(all(target_vendor = "fortanix", target_env = "sgx"))] {
mod sgx;
pub use sgx::Condvar;
} else if #[cfg(target_os = "solid_asp3")] {
mod itron;
pub use itron::Condvar;
} else if #[cfg(target_os = "teeos")] {
mod teeos;
pub use teeos::Condvar;
} else if #[cfg(target_os = "xous")] {
mod xous;
pub use xous::Condvar;
} else {
mod no_threads;
pub use no_threads::Condvar;
380 changes: 380 additions & 0 deletions library/std/src/sys/sync/condvar/queue.rs
@@ -0,0 +1,380 @@
//! A generic `Condvar` implementation based on thread parking and a lockless
//! queue of threads.
//!
//! Not all platforms provide an efficient `Condvar` implementation: the UNIX
//! `pthread_cond_t` needs memory allocation, while SGX doesn't have
//! synchronization primitives at all. Therefore, we implement our own.
Member:
Do we think this is the right tradeoff? It looks like the affected platforms are fairly "rare", and I'm not sure it makes sense for std to be maintaining relatively complex data structures for such platforms, especially if they're not even getting lightly tested in our CI.

Member Author:
The problem is that every couple of months someone comes along and adds a completely new platform to std. Understandably, not every target maintainer is well-versed in writing synchronization primitives, leading to bugs like #114581 on SGX. Actually, SGX is an interesting case, because the environment only provides thread parking, so there'll always be the need for some queue like this one. The idea behind this PR is that this implementation will be used as the default on every new target, unless they have a good reason not to, so that we only have to maintain one admittedly complex implementation instead of five.

There are simpler implementations that I could switch to, if desired (the NetBSD pthread_condvar for example uses a stack of waiters and spins on contended access to it). It's just that making the shared implementation really good seemed like a good idea to me (it also might make it easier to convince people that they really shouldn't bring their own).
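
A rough sketch of that simpler waiter-stack approach might look like this (illustrative only — it uses a `std::sync::Mutex` where NetBSD spins on a flag, and `unlock`/`relock` stand in for the user-level mutex operations):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};
use std::thread::{self, Thread};

pub struct StackCondvar {
    waiters: Mutex<Vec<(Thread, Arc<AtomicBool>)>>,
}

impl StackCondvar {
    pub const fn new() -> Self {
        StackCondvar { waiters: Mutex::new(Vec::new()) }
    }

    pub fn wait(&self, unlock: impl FnOnce(), relock: impl FnOnce()) {
        let notified = Arc::new(AtomicBool::new(false));
        // Enqueue before releasing the user lock so a concurrent
        // notification cannot be lost.
        self.waiters.lock().unwrap().push((thread::current(), Arc::clone(&notified)));
        unlock();
        // `park` may wake spuriously, so re-check the flag in a loop.
        while !notified.load(Ordering::Acquire) {
            thread::park();
        }
        relock();
    }

    pub fn notify_one(&self) {
        if let Some((thread, notified)) = self.waiters.lock().unwrap().pop() {
            notified.store(true, Ordering::Release);
            thread.unpark();
        }
    }
}
```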

Member:
I see. Yeah, if this is intended to be the fallback queue, absent platform-specific built-in primitives that are a better fit, then it makes more sense to me.

//!
//! To do so, we keep a list of the [`Thread`]s waiting on the `Condvar` and
//! wake them up as needed. Access to the list is controlled by an atomic
//! counter. To notify a waiter, the counter is incremented. If the counter
//! was previously zero, the notifying thread has control over the list and
//! will wake up threads until the number of threads it has woken up equals
//! the counter value. Therefore, other threads do not need to wait for control
//! over the list because the controlling thread will take over their notification.
//!
//! This counter is embedded into the lower bits of a pointer to the list head.
//! As that limits the number of in-flight notifications, the counter increments
//! are saturated to a maximum value ([`ALL`]) that causes all threads to be woken
//! up, leading to a spurious wakeup for some threads. The API of `Condvar` permits
//! this however. Timeouts employ the same method to make sure that the current
//! thread handle is removed from the list.
//!
//! The list itself has the same structure as the one used by the queue-based
//! `RwLock` implementation, see its documentation for more information. This
Member:
Is there any way to share the code for that queue, rather than having duplicate code with the same structure?

Member Author:
Yes, but it would be a bit complicated, since the queue-based RwLock is used on more platforms. Also, since the nodes aren't accessed concurrently, they don't need atomic pointers here, but for RwLock that's not the case (there can be multiple reader threads that traverse the queue).
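
To make the contrast concrete, a sketch of the two node layouts (field names illustrative, not the actual std definitions):

```rust
use std::cell::UnsafeCell;
use std::ptr::NonNull;
use std::sync::atomic::AtomicPtr;

// Condvar queue: only the single thread that took over notification duty
// follows the links, so plain `UnsafeCell`s suffice.
struct CondvarNode {
    next: UnsafeCell<Option<NonNull<CondvarNode>>>,
}

// RwLock queue: multiple reader threads may traverse the list concurrently,
// so the links must be atomic.
struct RwLockNode {
    next: AtomicPtr<RwLockNode>,
}
```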

//! enables the lockless enqueuing of threads and results in `Condvar` being
//! only a pointer in size.
//!
//! This implementation is loosely based upon the lockless `Condvar` in
//! [`usync`](https://github.com/kprotty/usync/blob/8937bb77963f6bf9068e56ad46133e933eb79974/src/condvar.rs).
Member:
This worries me, since that repository is only MIT licensed -- and std needs to also be Apache licensed.

Member Author:
This implementation shares no code with usync and contains significant alterations to the algorithm, so I don't think that this is a problem. Maybe "inspired by" is a better way to put it.

CC @kprotty what do you think?

#![forbid(unsafe_op_in_unsafe_fn)]

use crate::cell::UnsafeCell;
use crate::mem::forget;
use crate::ptr::{self, NonNull};
use crate::sync::atomic::Ordering::{AcqRel, Acquire, Relaxed, Release};
use crate::sync::atomic::{AtomicBool, AtomicPtr};
use crate::sys::sync::Mutex;
use crate::thread::{self, Thread};
use crate::time::{Duration, Instant};

type State = *mut ();

const EMPTY: State = ptr::null_mut();
const ALL: usize = 0b1111;
const MASK: usize = !ALL;

fn count(state: State) -> usize {
state.addr() & ALL
}

unsafe fn to_node(state: State) -> NonNull<Node> {
unsafe { NonNull::new_unchecked(state.mask(MASK)).cast() }
}

struct PanicGuard;
impl Drop for PanicGuard {
fn drop(&mut self) {
rtabort!("tried to drop node in intrusive list.");
}
}

// The notification counter lives in the lower bits of the state pointer,
// so `Node` must be aligned to at least 16 bytes to keep the `ALL` bits
// of any node pointer zero.
#[repr(align(16))]
Member:
Can we add a comment here on why this is 16, and ideally also an assert somewhere that ensures our alignment is sufficient that MASK bits are never set?

struct Node {
// Accesses to these `UnsafeCell`s may only be made from the thread that
// first increments the wakeup count.
next: UnsafeCell<Option<NonNull<Node>>>,
prev: UnsafeCell<Option<NonNull<Node>>>,
tail: UnsafeCell<Option<NonNull<Node>>>,
notified: AtomicBool,
thread: Thread,
}
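
// A compile-time check along the lines requested above (editor's sketch,
// not part of the original diff): the alignment of `Node` must keep the
// counter bits of any node pointer clear.
const _: () = assert!(crate::mem::align_of::<Node>() > ALL);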

impl Node {
unsafe fn notify(node: NonNull<Node>) {
// The waiting thread may return and free the node as soon as `notified`
// is set, so clone the handle first and do not touch the node after the store.
let thread = unsafe { node.as_ref().thread.clone() };
Member:
Why do we need to clone the thread here?

Also, can we add safety requirements + justifications on the unsafe functions and blocks added by this PR?

unsafe {
node.as_ref().notified.store(true, Release);
}
thread.unpark();
}
}

/// Scan through the list until the `next` pointer of the current node equals
/// `known`, then return that node. Add backlinks to all encountered nodes.
unsafe fn scan_until_known(mut scan: NonNull<Node>, known: NonNull<Node>) -> NonNull<Node> {
loop {
let next = unsafe { scan.as_ref().next.get().read().unwrap_unchecked() };
if next != known {
unsafe {
next.as_ref().prev.get().write(Some(scan));
scan = next;
}
} else {
return scan;
}
}
}

/// Scan until encountering a node with a non-empty `tail` field, then return
/// the value of that field. Add backlinks to all encountered nodes.
unsafe fn scan_until_tail(mut scan: NonNull<Node>) -> NonNull<Node> {
loop {
let s = unsafe { scan.as_ref() };
match unsafe { s.tail.get().read() } {
Some(tail) => return tail,
None => unsafe {
let next = s.next.get().read().unwrap_unchecked();
next.as_ref().prev.get().write(Some(scan));
scan = next;
},
}
}
}

/// Notify all nodes, going backwards starting with `tail`.
unsafe fn notify_all(mut tail: NonNull<Node>) {
loop {
let prev = unsafe { tail.as_ref().prev.get().read() };
unsafe {
Node::notify(tail);
}
match prev {
Some(prev) => tail = prev,
None => return,
}
}
}

pub struct Condvar {
state: AtomicPtr<()>,
}

impl Condvar {
#[inline]
pub const fn new() -> Condvar {
Condvar { state: AtomicPtr::new(ptr::null_mut()) }
}

pub unsafe fn wait(&self, mutex: &Mutex) {
unsafe {
self.wait_optional_timeout(mutex, None);
}
}

pub unsafe fn wait_timeout(&self, mutex: &Mutex, timeout: Duration) -> bool {
let timeout = Instant::now().checked_add(timeout);
unsafe { self.wait_optional_timeout(mutex, timeout) }
}

unsafe fn wait_optional_timeout(&self, mutex: &Mutex, timeout: Option<Instant>) -> bool {
let node = &Node {
next: UnsafeCell::new(None),
prev: UnsafeCell::new(None),
tail: UnsafeCell::new(None),
notified: AtomicBool::new(false),
thread: thread::try_current().unwrap_or_else(|| Thread::new_unnamed()),
};

// Enqueue the node.
let mut state = self.state.load(Relaxed);
loop {
unsafe {
node.next.get().write(NonNull::new(state.mask(MASK).cast()));
node.tail.get().write(if state == EMPTY {
Some(NonNull::from(node).cast())
} else {
None
});
}

// Preserve the notification counter from the previous state in the low
// bits of the new head pointer.
let next = ptr::from_ref(node).wrapping_byte_add(count(state)) as State;
match self.state.compare_exchange_weak(state, next, AcqRel, Relaxed) {
Ok(_) => break,
Err(new) => state = new,
}
}

// The node is registered, so the structure must not be
// mutably accessed or destroyed while other threads may
// be accessing it. Guard against unwinds using a panic
// guard that aborts when dropped.
let guard = PanicGuard;
Member:
It seems like this should get moved up to before we register the node, no?


unsafe {
mutex.unlock();
}

let mut timed_out = false;
if let Some(timeout) = timeout {
// While we haven't timed out or been notified, keep parking this thread.
while !node.notified.load(Acquire) {
if let Some(remaining) = timeout.checked_duration_since(Instant::now()) {
unsafe {
node.thread.park_timeout(remaining);
}
} else {
timed_out = true;
break;
}
}

if timed_out {
// The node is still in the queue. Wake up all threads so that
// it is removed.
self.notify_all();
} else {
// The node was marked as notified, so it is no longer part of
// the queue. Relock the mutex and return.
forget(guard);
mutex.lock();
return true;
}
}

// Park the thread until we are notified.
while !node.notified.load(Acquire) {
unsafe {
node.thread.park();
}
}

// The node was marked as notified, so it is no longer part of
// the queue. Relock the mutex and return.
forget(guard);
mutex.lock();
!timed_out
}

pub fn notify_one(&self) {
// Try to increase the notification counter.
let mut state = self.state.load(Relaxed);
loop {
if state == EMPTY {
return;
}

if count(state) == ALL {
// All threads are being notified, so we don't need to do another
// notification.
return;
} else if count(state) != 0 {
// Another thread is handling notifications, tell it to notify
// one more thread.
let next = state.wrapping_byte_add(1);
match self.state.compare_exchange_weak(state, next, Relaxed, Relaxed) {
Ok(_) => return,
Err(new) => state = new,
}
} else {
// No notifications are in progress, we should take responsibility
// for waking up threads. Increase the notification counter to do so.
let next = state.wrapping_byte_add(1);
match self.state.compare_exchange_weak(state, next, Acquire, Relaxed) {
Ok(_) => {
state = next;
break;
}
Err(new) => state = new,
}
}
}

// At this point, we took responsibility for notifying threads, meaning
// we have exclusive access to the queue. Wake up threads as long as there
// are threads to notify and notifications requested.

// Keep track of how many threads we notified already.
let mut notified = 0;
// This is the node that will be woken up next.
let mut tail = unsafe { scan_until_tail(to_node(state)) };

while count(state) != ALL {
if notified != count(state) {
// We haven't notified enough threads, so wake up `tail`.

let prev = unsafe { tail.as_ref().prev.get().read() };

unsafe {
Node::notify(tail);
}

notified += 1;

if let Some(prev) = prev {
tail = prev;
} else {
// We notified all threads in the queue. As long as no new
// nodes have been added, clear the state.
loop {
match self.state.compare_exchange_weak(state, EMPTY, Release, Acquire) {
Ok(_) => return,
Err(new) => state = new,
}

let head = unsafe { to_node(state) };
if head != tail {
// `head` has already been woken up, so we may not
// access it. Simply continue the main loop with
// the last new node.
tail = unsafe { scan_until_known(head, tail) };
break;
}
}
}
} else {
// We notified enough threads. Try clearing the counter.

let head = unsafe { to_node(state) };
unsafe {
head.as_ref().tail.get().write(Some(tail));
}

match self.state.compare_exchange_weak(state, state.mask(MASK), Release, Acquire) {
Ok(_) => return,
Err(new) => state = new,
}

let scan = unsafe { to_node(state) };
if scan != head {
// New nodes have been added to the queue. Link the new part
// of the queue to the old one.
let scan = unsafe { scan_until_known(scan, head) };
unsafe {
head.as_ref().prev.get().write(Some(scan));
}
}
}
}

// We need to wake up all threads in the queue.
// Use a swap to reset the state so that we do not endlessly retry if
// new nodes are constantly being added.

let new = self.state.swap(EMPTY, Acquire);
let head = unsafe { to_node(state) };
let scan = unsafe { to_node(new) };
if head != scan {
// New nodes have been added to the queue. Link the new part
// of the queue to the old one.
let scan = unsafe { scan_until_known(scan, head) };
unsafe {
head.as_ref().prev.get().write(Some(scan));
}
}

unsafe { notify_all(tail) }
}

pub fn notify_all(&self) {
let mut state = self.state.load(Relaxed);
loop {
if state == EMPTY {
return;
}

if count(state) == ALL {
// All threads are already being notified.
return;
} else if count(state) != 0 {
// Another thread is handling notifications, tell it to notify
// all threads.
let next = state.map_addr(|state| state | ALL);
match self.state.compare_exchange_weak(state, next, Relaxed, Relaxed) {
Ok(_) => return,
Err(new) => state = new,
}
} else {
// Take the whole queue and wake it up.
match self.state.compare_exchange_weak(state, EMPTY, Acquire, Relaxed) {
Ok(_) => break,
Err(new) => state = new,
}
}
}

let tail = unsafe { scan_until_tail(to_node(state)) };
unsafe { notify_all(tail) }
}
}
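
For context, this module backs the public `std::sync::Condvar`; its standard usage pattern (ordinary user code, not part of the diff) looks like this:

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

fn main() {
    let pair = Arc::new((Mutex::new(false), Condvar::new()));
    let pair2 = Arc::clone(&pair);

    // Notifier: with this implementation, `notify_one` increments the counter
    // packed into the queue head pointer and unparks one waiting thread.
    thread::spawn(move || {
        let (lock, cvar) = &*pair2;
        *lock.lock().unwrap() = true;
        cvar.notify_one();
    });

    // Waiter: loop around `wait`, since this implementation is allowed to
    // produce spurious wakeups (e.g. when the counter saturates at `ALL`).
    let (lock, cvar) = &*pair;
    let mut ready = lock.lock().unwrap();
    while !*ready {
        ready = cvar.wait(ready).unwrap();
    }
}
```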
45 changes: 0 additions & 45 deletions library/std/src/sys/sync/condvar/sgx.rs

This file was deleted.

100 changes: 0 additions & 100 deletions library/std/src/sys/sync/condvar/teeos.rs

This file was deleted.

50 changes: 0 additions & 50 deletions library/std/src/sys/sync/condvar/windows7.rs

This file was deleted.

148 changes: 0 additions & 148 deletions library/std/src/sys/sync/condvar/xous.rs

This file was deleted.

3 changes: 2 additions & 1 deletion library/std/src/sys/sync/mutex/mod.rs
@@ -19,10 +19,11 @@ cfg_if::cfg_if! {
target_os = "teeos",
))] {
mod pthread;
#[allow(unused_imports)]
pub use pthread::{Mutex, raw};
} else if #[cfg(all(target_os = "windows", target_vendor = "win7"))] {
mod windows7;
pub use windows7::{Mutex, raw};
pub use windows7::Mutex;
} else if #[cfg(all(target_vendor = "fortanix", target_env = "sgx"))] {
mod sgx;
pub use sgx::Mutex;
8 changes: 4 additions & 4 deletions library/std/src/sys/sync/mutex/pthread.rs
@@ -103,15 +103,15 @@ impl Mutex {
}

#[inline]
pub unsafe fn lock(&self) {
pub fn lock(&self) {
#[cold]
#[inline(never)]
fn fail(r: i32) -> ! {
let error = Error::from_raw_os_error(r);
panic!("failed to lock mutex: {error}");
}

let r = libc::pthread_mutex_lock(raw(self));
let r = unsafe { libc::pthread_mutex_lock(raw(self)) };
// As we set the mutex type to `PTHREAD_MUTEX_NORMAL` above, we expect
// the lock call to never fail. Unfortunately however, some platforms
// (Solaris) do not conform to the standard, and instead always provide
@@ -131,8 +131,8 @@ impl Mutex {
}

#[inline]
pub unsafe fn try_lock(&self) -> bool {
libc::pthread_mutex_trylock(raw(self)) == 0
pub fn try_lock(&self) -> bool {
unsafe { libc::pthread_mutex_trylock(raw(self)) == 0 }
}
}

2 changes: 1 addition & 1 deletion library/std/src/sys/sync/mutex/xous.rs
@@ -36,7 +36,7 @@ impl Mutex {
}

#[inline]
pub unsafe fn lock(&self) {
pub fn lock(&self) {
// Try multiple times to acquire the lock without resorting to the ticktimer
// server. For locks that are held for a short amount of time, this will
// result in the ticktimer server never getting invoked. The `locked` value
13 changes: 11 additions & 2 deletions library/std/src/thread/mod.rs
@@ -1099,7 +1099,7 @@ impl Drop for PanicGuard {
#[stable(feature = "rust1", since = "1.0.0")]
pub fn park() {
let guard = PanicGuard;
// SAFETY: park_timeout is called on the parker owned by this thread.
// SAFETY: park is called on the parker owned by this thread.
unsafe {
current().park();
}
@@ -1168,7 +1168,7 @@ pub fn park_timeout(dur: Duration) {
let guard = PanicGuard;
// SAFETY: park_timeout is called on the parker owned by this thread.
unsafe {
current().inner.as_ref().parker().park_timeout(dur);
current().park_timeout(dur);
}
// No panic occurred, do not abort.
forget(guard);
@@ -1361,6 +1361,15 @@ impl Thread {
unsafe { self.inner.as_ref().parker().park() }
}

/// Like the public [`park_timeout`], but callable on any handle. This is used to
/// allow parking in TLS destructors.
///
/// # Safety
/// May only be called from the thread to which this handle belongs.
pub(crate) unsafe fn park_timeout(&self, dur: Duration) {
unsafe { self.inner.as_ref().parker().park_timeout(dur) }
Member:
The public park_timeout has a PanicGuard -- should this also have that?

It looks like the ~only advantage to this is that we avoid re-fetching the handle from thread local storage, is that just a performance optimization?

}

/// Atomically makes the handle's token available if it is not already.
///
/// Every thread is equipped with some basic low-level blocking support, via