-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
std::process::exit
is not thread-safe in combination with C code calling exit
#126600
Comments
This has been discussed previously in #83994. There the decision was to not do anything about the concerns with Given that I just spend the better part of a weekend debugging a segfault that was caused by concurrent |
On that thread joshtriplett notes:
We do now mark |
Linux libc documentation does claim that |
Doesn't solve the problem with exit getting called from non-rust code. And no, an atexit handler won't help since it leaves a race-window and if you're in a situation where threads concurrently call exit you're already racing. https://github.com/MaterializeInc/database-issues/issues/6528 sounds like it's either a libc bug or the assessment from the previous thread that libc implementations provide the desired behavior - even if the standard language doesn't - needs to be revised. |
I'd guess the safe thing to do is kill all other threads then call |
Not sure if serious... Killing threads would make anything that accesses their stacks UB. Freezing them would work but that's difficult to implement reliably. Libc would be in a position to do it since they control pthreads. But then they might as well fix their locking. |
Adding 32 |
I'd say it makes sense to introduce a lock on the Rust side, since pure-Rust code shouldn't be able to cause UB by calling libc's This also means that Rust cdylibs mustn't call |
Based on how the glibc code is set up it very much looks like I think the previous assumption that glibc's |
That's explained by the fact that one block in the list of exit handlers contains 32 entries, and the last block is never freed. So to trigger the use-after-free you need at least two blocks in the list. glibc installs one exit handler on its own, so 32 more need to be installed to add a free-able block to the list. |
Triage: marking this as |
This comment from the main author of musl was pointed out in the libs-api meeting:
I looked up the spec to confirm and it seems to be correct. https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_09_01
|
According to the C11 standard, calling C11 7.22.4.4p2
Calling it from two threads at the same time is calling it twice. It is thus explicitly thread-unsafe in C11. |
The POSIX description also contains that sentence. https://pubs.opengroup.org/onlinepubs/009695299/functions/exit.html
So the POSIX spec is kind of in conflict with itself? |
@tbu- See the latter half of #83994 (comment) |
It makes sense to fix this on the C side, too. Currently, we have |
This should probably be reported as a bug to glibc, along with a C program that reproduces the issue. |
Here is a mostly-one-to-one C11 translation of the program in the OP that reproduces a segfault on my machine (Ubuntu 24.04, gcc 13.2.0-23ubuntu4, glibc 2.39-0ubuntu8.2), if someone wants to use it in a bug report. code#include <stdlib.h>
#include <threads.h>
void exit_handler(void) {
thrd_sleep(&(struct timespec){.tv_sec = 1}, NULL);
}
int thread_function(void *arg) {
(void)arg;
exit(0);
}
int main(void) {
for (int i = 0; i < 32; ++i) {
atexit(exit_handler);
}
for (int i = 0; i < 2; ++i) {
thrd_t thread;
thrd_create(&thread, thread_function, NULL);
}
} |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
I don't think it's exactly a bug in glibc, though, and reporting it that way might get the issue bounced. The Linux manual page for
So glibc is at least conforming to its own documentation. As @richfelker said in #83994 (comment), I do think it's worth trying to get this fixed over in the POSIX/C world, but I think that looks like engaging the folks who work on the POSIX and C standards to get this updated in the next version of those standards, and looping the glibc maintainers into those conversations for feedback. |
Then they can be marked non-unloadable so that |
If I can mark my libraries as non-unloadable in an even vaguely portable fashion, then I would like to sign up immediately. I don't think that's a common platform facility. |
For ELF-based platforms,
Libtool may have its own abstraction for how to mark libraries non-unloadable; I'm not sure. None of this is perfectly portable but I think it meets your requirement for "even vaguely portable". At worst you may need to add special cases for Windows and Mac. |
On Windows I’m not sure if making a library non-unloadable is an always an option, as it leaks the DLL’s handles. Fortunately, |
On macOS using thread local storage effectively blocks dlclose. (There are several issues on the rust issue tracker about rust dylibs not getting unloaded when using dlclose. This due to rust's standard library using TLS internally and it by default getting statically linked into dylibs.) |
Indeed. But on the other hand there doesn't seem to be any explicit "never unload" flag for macOS. No nice solution. |
So it seems what you need is a function that is called on Or alternatively, if supporting I am not sure what is the best platform to request features like that -- it'd have to come from libcs, right? |
Yes. My main concern is that the best use of this issue is, I think, continuing to coordinate these efforts. A concern that goes far enough afield, like advocacy for Rust adding an explicit API for exit handlers, is better taken to the actual decision-makers. For solving the issue of dynamic libraries having problems with dlclose, it sounds like we would want either something like "if this magic symbol is included in your dylib, it shouldn't be actually-unloaded at |
For ELF based platforms there is already |
Doesn't that require the person driving the linker to do it? It's not something the author can assure. And ELF is not under the purview of the Austin Group, I think. |
Fair. I feel like coordinating those efforts is out of scope of Rust, but it'd make sense for us to track progress here, and maybe remove our own Discussing how to write C libraries that can tolerate their exit handlers to be invoked at any time while also not leaking resources on |
exit: explain our expectations for the exit handlers registered in a Rust program This documents the position of `@Amanieu` and others in rust-lang#126600: a library with an atexit handler that destroys state that other threads could still be working on is buggy. We do not consider it acceptable for a library to say "you must call the following cleanup function before exiting from `main` or calling `exit`". I don't know if this is established `@rust-lang/libs-api` consensus so I presume this will have to go through FCP. Given that Rust supports concurrency, I don't think there is any way to write a sound Rust wrapper around a library that has such a required cleanup function: even if we made `exit` unsafe, and the Rust wrapper used the scope-with-callback approach to ensure it can run cleanup code before returning from the wrapper (like `thread::scope`), one could still call this wrapper in a second thread and then return from `main` while the wrapper runs. Making this sound would require `std` to provide a way to "block" returning from `main`, so that while the wrapper runs returning from `main` waits until the wrapper is done... that just doesn't seem feasible. The `exit` docs do not seem like the best place to document this, but I also couldn't think of a better one.
exit: explain our expectations for the exit handlers registered in a Rust program This documents the position of ``@Amanieu`` and others in rust-lang#126600: a library with an atexit handler that destroys state that other threads could still be working on is buggy. We do not consider it acceptable for a library to say "you must call the following cleanup function before exiting from `main` or calling `exit`". I don't know if this is established ``@rust-lang/libs-api`` consensus so I presume this will have to go through FCP. Given that Rust supports concurrency, I don't think there is any way to write a sound Rust wrapper around a library that has such a required cleanup function: even if we made `exit` unsafe, and the Rust wrapper used the scope-with-callback approach to ensure it can run cleanup code before returning from the wrapper (like `thread::scope`), one could still call this wrapper in a second thread and then return from `main` while the wrapper runs. Making this sound would require `std` to provide a way to "block" returning from `main`, so that while the wrapper runs returning from `main` waits until the wrapper is done... that just doesn't seem feasible. The `exit` docs do not seem like the best place to document this, but I also couldn't think of a better one.
Rollup merge of rust-lang#129581 - RalfJung:exit, r=joshtriplett exit: explain our expectations for the exit handlers registered in a Rust program This documents the position of ``@Amanieu`` and others in rust-lang#126600: a library with an atexit handler that destroys state that other threads could still be working on is buggy. We do not consider it acceptable for a library to say "you must call the following cleanup function before exiting from `main` or calling `exit`". I don't know if this is established ``@rust-lang/libs-api`` consensus so I presume this will have to go through FCP. Given that Rust supports concurrency, I don't think there is any way to write a sound Rust wrapper around a library that has such a required cleanup function: even if we made `exit` unsafe, and the Rust wrapper used the scope-with-callback approach to ensure it can run cleanup code before returning from the wrapper (like `thread::scope`), one could still call this wrapper in a second thread and then return from `main` while the wrapper runs. Making this sound would require `std` to provide a way to "block" returning from `main`, so that while the wrapper runs returning from `main` waits until the wrapper is done... that just doesn't seem feasible. The `exit` docs do not seem like the best place to document this, but I also couldn't think of a better one.
Even if C/POSIX standard states that exit is not formally thread-unsafe, calling it more than once is UB. The glibc already supports it for the single-thread, and both elf/nodelete2.c and tst-rseq-disable.c call exit from a DSO destructor (which is called by _dl_fini, registered at program startup with __cxa_atexit). However, there are still race issues when it is called more than once concurrently by multiple threads. A recent Rust PR triggered this issue [1], which resulted in an Austin Group ask for clarification [2]. Besides it, there is a discussion to make concurrent calling not UB [3], wtih a defined semantic where any remaining callers block until the first call to exit has finished (reentrant calls, leaving through longjmp, and exceptions are still undefined). For glibc, at least reentrant calls are required to be supported to avoid changing the current behaviour. This requires locking using a recursive lock, where any exit called by atexit() handlers resumes at the point of the current handler (thus avoiding calling the current handle multiple times). Checked on x86_64-linux-gnu and aarch64-linux-gnu. [1] rust-lang/rust#126600 [2] https://austingroupbugs.net/view.php?id=1845 [3] https://www.openwall.com/lists/libc-coord/2024/07/24/4 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This is a rather long thread, can we get a summary of the current status added to the original issue body? What are actionable steps that can be taken to resolve this unsound issue? AFAICT the status appears to be "wait until the rest of the world is fixed", which IMO is tantamount to closing this issue as WONTFIX, given the prospects of that ever happening. And I'm not entirely satisfied by the dismissal of ameliorations on the Rust side (either adding a lock or marking as |
The pure Rust fix would be to have some Rust API for adding on exit functions. This could side-step the issue entirely so long as you ignore C. Note that we already have a lock to mitigate this issue. However it's by no means a fix. |
Things are looking rosier than that. Posix, musl and glibc agreed to improve things and if I read the comments on the posix issue correctly they'll also bring it to the appropriate C committee. |
I have gathered a short summary and a few links from other comments into the top comment. Edit away if there's anything to correct. |
std::process::exit
is not thread-safestd::process::exit
is not thread-safe in combination with C code calling exit
Please, document in https://doc.rust-lang.org/nightly/std/process/fn.exit.html , on which platforms Also, as well as I understand, musl is shipped with Rust itself. This means nothing prevents us from patching it! So, I suggest patching all remaining similarly unsafe C functions ( Also, we can create alternative target with static glibc. We will patch that glibc however we want, we will fix all |
That is really something the respective platforms need to document. We shouldn't be in the business of providing a central repository of all the ways in which various C implementations promise more than the C standard. We track this for our own needs here, but that is not the same as making it part of the official docs. And we are also not interested in maintaining a patched fork of musl or glibc. |
What we should do, however, is document that it is UB to call this |
exit: document interaction with C Cc rust-lang#126600
exit: document interaction with C Cc rust-lang#126600
exit: document interaction with C Cc rust-lang#126600
exit: document interaction with C Cc rust-lang#126600
Calling
exit
concurrently from Rust and C (or from 2 different copies of the Rust runtime in the same binary) is UB. This is permitted by the C standard, and can actually cause trouble in practice since not all versions of all libc implementations do proper locking when traversing the list ofatexit
functions. For programs only using Rust, we mitigated this by adding a lock, but this only helps when all calls toexit
come from Rust (and they must come from the same Rust runtime). Note that returning frommain
is equivalent to callingexit
so it can also cause this issue.Current status
Mitigation on our end (only covers pure Rust programs): #126606
The current status is that the specification language stems from a time when C did not specify multithreading and the intent was to forbid reentrancy. External discussion indicates that there's intent to fix it both in libc implementations and the specs:
Original bugreport (partially outdated)
The current implementation of
std::process::exit
is unsound on Linux and possibly related platforms where it defers to libc'sexit()
function.exit()
is documented to be not thread-safe, and hencestd::process::exit
is not thread-safe either, despite being a non-unsafe
function.To show that this isn't just a theoretical problem, here is a minimal example that segfaults on my machine (Ubuntu with glibc 2.37):
The example contains
unsafe
code, but only to install exit handlers. AFAICT nothing about thelibc::atexit
call is unsafe. The UB is introduced by callingstd::process::exit
concurrently afterwards.If you are curious, https://github.com/MaterializeInc/database-issues/issues/6528 lays out what causes the segfault (it's a use-after-free). That's not terribly relevant though, given that glibc has no obligation to ensure
exit()
is thread safe when it's clearly documented not to be. Instead, Rust should either markstd::process::exit
asunsafe
(which is something it could do for the 2024 edition), or introduce locking to ensure only a single thread gets to callexit()
at a time.The text was updated successfully, but these errors were encountered: