data races in frame inspection and tracebacks #128421
Comments
Hmmm... this looks like a TSan report during a crash. Faulthandler's dumping of threads isn't thread-safe and we should deal with that (most likely by only printing the stack trace of the faulting thread), but the important thing is that |
✦ ❯ env TSAN_OPTIONS=suppressions={$PWD}/Tools/tsan/suppressions_free_threading.txt ./python -m test test_asyncio.test_free_threading -j4 -F
Using random seed: 3369581851
0:00:00 load avg: 6.42 Run tests in parallel using 4 worker processes
0:00:24 load avg: 7.00 [ 1] test_asyncio.test_free_threading passed
0:00:24 load avg: 7.00 [ 2] test_asyncio.test_free_threading passed
0:00:25 load avg: 7.00 [ 3] test_asyncio.test_free_threading passed
0:00:25 load avg: 7.00 [ 4] test_asyncio.test_free_threading passed
0:00:50 load avg: 8.56 [ 5] test_asyncio.test_free_threading passed
0:00:50 load avg: 8.56 [ 6] test_asyncio.test_free_threading passed
0:00:51 load avg: 8.56 [ 7] test_asyncio.test_free_threading passed
0:01:18 load avg: 7.61 [ 8] test_asyncio.test_free_threading passed -- running (1): test_asyncio.test_free_threading (53.0 sec)
0:01:18 load avg: 7.61 [ 9] test_asyncio.test_free_threading passed -- running (1): test_asyncio.test_free_threading (53.8 sec)
0:01:19 load avg: 7.61 [ 10] test_asyncio.test_free_threading passed -- running (1): test_asyncio.test_free_threading (54.0 sec)
0:01:25 load avg: 7.56 [ 11/1] test_asyncio.test_free_threading worker non-zero exit code (Exit code -6 (SIGABRT))
Debug memory block at address p=0x7f6a1e112930: API 'o'
176 bytes originally requested
The 7 pad bytes at p-7 are FORBIDDENBYTE, as expected.
The 8 pad bytes at tail=0x7f6a1e1129e0 are not all FORBIDDENBYTE (0xfd):
at tail+0: 0x0c *** OUCH
at tail+1: 0xfd
at tail+2: 0xfd
at tail+3: 0xfd
at tail+4: 0xfd
at tail+5: 0xfd
at tail+6: 0xfd
at tail+7: 0xfd
Data at p: 00 00 00 00 00 00 00 00 ... 0a 0b 08 09 07 04 03 02
Enable tracemalloc to get the memory block allocation traceback
Fatal Python error: _PyMem_DebugRawFree: bad trailing pad byte
Python runtime state: initialized
Thread 0x00007f6a02bfe6c0 (most recent call first):
File "/home/realkumaraditya/cpython/Lib/asyncio/futures.py", line 183 in done
python: ./Include/internal/pycore_frame.h:88: _PyFrame_GetCode: Assertion `PyCode_Check(executable)' failed.
Fatal Python error: Aborted
<Cannot show all threads while the GIL is disabled>
Stack (most recent call first):
File "/home/realkumaraditya/cpython/Lib/asyncio/runners.py", line 208 in _cancel_all_tasks
File "/home/realkumaraditya/cpython/Lib/asyncio/runners.py", line 71 in close
File "/home/realkumaraditya/cpython/Lib/asyncio/runners.py", line 63 in __exit__
File "/home/realkumaraditya/cpython/Lib/test/test_asyncio/test_free_threading.py", line 46 in runner
File "/home/realkumaraditya/cpython/Lib/threading.py", line 996 in run
File "/home/realkumaraditya/cpython/Lib/threading.py", line 1054 in _bootstrap_inner
File "/home/realkumaraditya/cpython/Lib/threading.py", line 1016 in _bootstrap
Kill <WorkerThread #2 running test=test_asyncio.test_free_threading pid=11853 time=6.8 sec> process group
Kill <WorkerThread #3 running test=test_asyncio.test_free_threading pid=11860 time=6.5 sec> process group
Kill <WorkerThread #4 running test=test_asyncio.test_free_threading pid=11852 time=7.5 sec> process group
== Tests result: FAILURE ==
1 test failed:
test_asyncio.test_free_threading
10 tests OK.
Total duration: 1 min 25 sec
Total tests: run=160
Total test files: run=11 failed=1
Result: FAILURE
@colesbury I wrote more thread-safety tests for asyncio in #128480 and I see this crash sometimes. I'll try to write more tests to find these.
This makes more operations on frame objects thread-safe in the free-threaded build, which fixes some data races that occurred when passing exceptions between threads. However, accessing a thread's local variables from another thread while it's running is still not thread-safe and may crash the interpreter.
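As a rough illustration of "passing exceptions between threads", the sketch below raises an exception in a worker thread and walks its traceback frames from the main thread. It is not a reproducer for the race: the names `worker`, `frame_summary` and `box` are hypothetical, and the main thread only inspects the frames after `join()`, when the worker's frame is no longer executing. It reads only the code name and line number, never the frame's local variables.

```python
import threading


def worker(box):
    # Raise and capture an exception inside the worker thread.
    try:
        raise ValueError("raised in worker")
    except ValueError as exc:
        box.append(exc)


def frame_summary(exc):
    # Walk the traceback chain and record (function name, line number)
    # for each entry. Local variables are deliberately not accessed.
    entries = []
    tb = exc.__traceback__
    while tb is not None:
        entries.append((tb.tb_frame.f_code.co_name, tb.tb_lineno))
        tb = tb.tb_next
    return entries


box = []
t = threading.Thread(target=worker, args=(box,))
t.start()
t.join()  # after join(), the worker's frame has finished executing

summary = frame_summary(box[0])
print(summary)
```

Inspecting the same traceback while the worker is still running is the situation the change above is concerned with.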
This tells TSAN not to sanitize `PyUnstable_InterpreterFrame_GetLine()`. There's a possible data race on the access to the frame's `instr_ptr` if the frame is currently executing. We don't really care about the race. In theory, we could use relaxed atomics for every access to `instr_ptr`, but that would create more code churn and current compilers are overly conservative with optimizations around relaxed atomic accesses. We also don't sanitize `_PyFrame_IsIncomplete()` because it accesses `instr_ptr` and is called from assertions within `PyFrame_GetCode()`.
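From Python, the corresponding operation is reading the line number of a frame that belongs to another thread. The sketch below is a minimal example under stated assumptions: `worker` and `sample_position` are hypothetical names, and the worker is parked in a blocking C call so that the sample is stable. With a busy worker the line number would be only a snapshot of wherever the thread happened to be. The example reads the code name and line number only and leaves `f_locals` alone.

```python
import sys
import threading


def worker(started, release):
    started.release()  # signal the main thread that the worker is running
    release.acquire()  # block in C code; this stays the topmost Python frame


def sample_position(thread_id):
    # Look up the current top frame of another thread and read only its
    # code name and line number; local variables are not touched.
    frame = sys._current_frames().get(thread_id)
    if frame is None:
        return None
    return frame.f_code.co_name, frame.f_lineno


started = threading.Lock()
release = threading.Lock()
started.acquire()
release.acquire()

t = threading.Thread(target=worker, args=(started, release))
t.start()
started.acquire()  # wait until the worker has started
position = sample_position(t.ident)
release.release()  # let the worker finish
t.join()
print(position)
```

Raw `threading.Lock` objects are used instead of `threading.Event` so that the worker never enters another Python-level function, which keeps `worker` as the sampled frame.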
The recent PR gh-131479 added locking to `take_ownership` in the free threading build. The cost is not really the locking (that path isn't taken frequently), but the inlined code causes extra register spills and slows down `RETURN_VALUE`, even when the path isn't taken. Mark `take_ownership` as `Py_NO_INLINE` to avoid the regression. Also limit locking in `PyFrameObject` to Python functions, not the C API.
Run the tests with #128147 and tsan enabled:
env TSAN_OPTIONS=suppressions={$PWD}/Tools/tsan/suppressions_free_threading.txt ./python -m test test_asyncio -F
TSAN Warnings:
Linked PRs
BaseException thread safe #128728
traceback.tb_next #131322