Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use tagged pointers on the stack in the default build. #127705

Open
markshannon opened this issue Dec 6, 2024 · 11 comments
Open

Use tagged pointers on the stack in the default build. #127705

markshannon opened this issue Dec 6, 2024 · 11 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@markshannon
Copy link
Member

markshannon commented Dec 6, 2024

Currently all references to objects in frameobjects use _PyStackRef instead of PyObject *.
This is necessary for the free-threaded build to support deferred references.

For the default build _PyStackRef is just an alias for PyObject *.
We should change _PyStackRef to use proper tagged pointers in the default build for two important reasons:

  • It will reduce the maintenance burden of using tagged pointers if they were the same in both builds
  • It offers a lot of optimization potential. The overhead of reference counting operations is large, and tagged pointers will allow us to reduce that overhead considerably.

My initial implementation is 0.8% slower, although I'd like to get that closer to 0 before merging anything. There is some speedup in the GC due to streamlined immortality checks, and some slowdown due to increased overhead of turning new PyObject * references into _PyStackRefs.

This small slowdown will allow us a large speedup (maybe more than 5%) as we can do the following:

  • Reduce the overhead of refcount operations by using tagged references for the majority of LOAD_ instructions in the interpreter.
  • Completely eliminate many decref operations by tracking which references are tagged in the JIT.

The tagging scheme:

Tag Meaning
00 Normal pointers
01 Pointers with embedded reference count
10 Unused
11 Pointer to immortal object1 (including NULL)

This tagging scheme is chosen as it provides the best performance for the most common operations:

  • PyStackRef_DUP: Can check to see if the object's reference count needs updating with a single check and no memory read: ptr & 1
  • PyStackRef_CLOSE: As for PyStackRef_DUP, only a single bit check is needed
  • PyStackRef_XCLOSE: Since NULL is treated as immortal and tagged, this is the same as PyStackRef_CLOSE.

Maintaining the invariant that tag 11 is used for all immortal objects is a bit expensive, but can be mitigated by pushing the conversion from PyObject * to _PyStackRef down to a point where it is known whether an object is newly created or not.
For newly created objects PyStackRef_FromPyObjectStealMortal can be used which performs no immortality check.


  1. Actually, any object that was immortal when the reference was created. References to objects that are made immortal after the reference is created would have the low bits set to 00, or 01. This is OK as immortal refcounts have a huge margin of error and the number of possible references to one of these immortal objects is very small.

Linked PRs

@picnixz picnixz added type-feature A feature request or enhancement interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Dec 6, 2024
@markshannon
Copy link
Member Author

Stackrefs on the heap.

We use _PyStackRef for all locals variable and a couple of other frame fields, and frames can be present in paused generators.
This means that we can have _PyStackRefs in the heap. This is unsafe in general, but we can make stack refs safe to be stored in the heap provided that they do not have an embedded reference count, i.e. tag 01.

The free-threaded build allows any _PyStackRefs in the heap, but needs support from the cyclic GC and memory allocator to do so. For now, we make sure that no heap allocated _PyStackRef can be tagged 01.

@colesbury
Copy link
Contributor

Maintaining the invariant that tag 11 is used for all immortal objects is a bit expensive...

One challenge here is that objects can become immortal dynamically, such as by reference count overflow. That makes relying on bits in the _PyStackRef to determine immortality more difficult.

@markshannon
Copy link
Member Author

That isn't a problem. The implication tag == 0b11 => is_immortal(obj) is still true.
We don't rely on is_immortal(obj) => tag == 0b11 anywhere.

@colesbury
Copy link
Contributor

is_immortal(obj) => tag == 0b11 and tag != 0b11 => !is_immortal(obj) are logically equivalent.

The unsafe assumption is:

tag == 0 => !is_immortal(obj)

static inline void
PyStackRef_CLOSE(_PyStackRef ref)
{
    assert(!PyStackRef_IsNull(ref));
    if (PyStackRef_IsUncountedMortal(ref)) {  // (ref.bits & Py_TAG_BITS) == 0
        Py_DECREF_MORTAL(BITS_TO_PTR(ref));
    }
}
static inline void Py_DECREF_MORTAL(const char *filename, int lineno, PyObject *op)
{
    if (op->ob_refcnt <= 0) {
        _Py_NegativeRefcount(filename, lineno, op);
    }
    _Py_DECREF_STAT_INC();
    assert(!_Py_IsStaticImmortal(op));
    _Py_DECREF_DecRefTotal();
    if (--op->ob_refcnt == 0) { // not safe !!!
        _Py_Dealloc(op);
    }
}

markshannon added a commit that referenced this issue Mar 5, 2025
* Add location information when accessing already closed stackref

* Add #def option to track closed stackrefs to provide precise information for use after free and double frees.
@markshannon
Copy link
Member Author

You say that if (--op->ob_refcnt == 0) is unsafe, but it is safe if the object is mortal. The function is called Py_DECREF_MORTAL, so should only be only applied to mortal objects.

The problem is when objects become immortal by overflowing their refcount. These objects have never been declared immortal, just happen to have a reference count that looks like they are immortal. The problem is not the decref alone, but the decref combined with the missing increfs due to an object becoming temporarily immortal.
I've re-opened #125174 for this.

@encukou
Copy link
Member

encukou commented Mar 11, 2025

bigmem buildbots are still failing:

$ ./python -m test test_bigmem -M 60G -v -m '*test_append*'
[...]
0:00:00 load avg: 5.93 [1/1] test_bigmem
test_append (test.test_bigmem.ListTest.test_append) ... 
 ... expected peak memory use: 18.0G
 ... process data size: 16.0G
 ... process data size: 16.0G
 ... process data size: 16.0G
 ... process data size: 16.0G
 ... process data size: 16.0G
 ... process data size: 16.0G
 ... process data size: 18.0G
 ... process data size: 18.0G
 ... process data size: 18.0G
 ... process data size: 18.0G
./Include/refcount.h:564: _Py_NegativeRefcount: Assertion failed: object has negative ref count
<object at 0x7fc098777370 is freed>
Fatal Python error: _PyObject_AssertFailed: _PyObject_AssertFailed
Python runtime state: initialized

Current thread 0x00007fc0a65ed740 (most recent call first):
  File "/home/encukou/dev/cpython/Lib/test/support/__init__.py", line 1149 in wrapper
  File "/home/encukou/dev/cpython/Lib/unittest/case.py", line 606 in _callTestMethod
  File "/home/encukou/dev/cpython/Lib/unittest/case.py", line 660 in run
  File "/home/encukou/dev/cpython/Lib/unittest/case.py", line 716 in __call__
  File "/home/encukou/dev/cpython/Lib/unittest/suite.py", line 122 in run
  File "/home/encukou/dev/cpython/Lib/unittest/suite.py", line 84 in __call__
  File "/home/encukou/dev/cpython/Lib/unittest/suite.py", line 122 in run
  File "/home/encukou/dev/cpython/Lib/unittest/suite.py", line 84 in __call__
  File "/home/encukou/dev/cpython/Lib/unittest/runner.py", line 259 in run
  File "/home/encukou/dev/cpython/Lib/test/libregrtest/single.py", line 84 in _run_suite
  File "/home/encukou/dev/cpython/Lib/test/libregrtest/single.py", line 42 in run_unittest
  File "/home/encukou/dev/cpython/Lib/test/libregrtest/single.py", line 162 in test_func
  File "/home/encukou/dev/cpython/Lib/test/libregrtest/single.py", line 118 in regrtest_runner
  File "/home/encukou/dev/cpython/Lib/test/libregrtest/single.py", line 165 in _load_run_test
  File "/home/encukou/dev/cpython/Lib/test/libregrtest/single.py", line 210 in _runtest_env_changed_exc
  File "/home/encukou/dev/cpython/Lib/test/libregrtest/single.py", line 319 in _runtest
  File "/home/encukou/dev/cpython/Lib/test/libregrtest/single.py", line 348 in run_single_test
  File "/home/encukou/dev/cpython/Lib/test/libregrtest/main.py", line 378 in run_test
  File "/home/encukou/dev/cpython/Lib/test/libregrtest/main.py", line 408 in run_tests_sequentially
  File "/home/encukou/dev/cpython/Lib/test/libregrtest/main.py", line 550 in _run_tests
  File "/home/encukou/dev/cpython/Lib/test/libregrtest/main.py", line 585 in run_tests
  File "/home/encukou/dev/cpython/Lib/test/libregrtest/main.py", line 757 in main
  File "/home/encukou/dev/cpython/Lib/test/libregrtest/main.py", line 765 in main
  File "/home/encukou/dev/cpython/Lib/test/__main__.py", line 2 in <module>
  File "/home/encukou/dev/cpython/Lib/runpy.py", line 88 in _run_code
  File "/home/encukou/dev/cpython/Lib/runpy.py", line 198 in _run_module_as_main

Extension modules: _testinternalcapi (total: 1)

@markshannon
Copy link
Member Author

I'm looking into it.

@markshannon
Copy link
Member Author

The additional cost of testing the object for immortality in the few cases where we need to distinguish between a reference to an immortal object and a reference with an embedded reference count turns out to be cheap.
This frees up the second bit for tagged integers.

The tagging scheme:

Tag Meaning
00 Normal pointers
01 Pointers with embedded reference count, including immortal objects and NULL
10 Unused
11 Tagged integers

@encukou
Copy link
Member

encukou commented Mar 13, 2025

Now, only the tracerefs build is still failing.

encukou pushed a commit that referenced this issue Mar 14, 2025
This is missing `_PyReftracerTrack` calls, see gh-131238.
Merging as-is for the 3.14.0a6 release.
markshannon added a commit that referenced this issue Mar 17, 2025
plashchynski pushed a commit to plashchynski/cpython that referenced this issue Mar 17, 2025
…1198)

This is missing `_PyReftracerTrack` calls, see pythongh-131238.
Merging as-is for the 3.14.0a6 release.
plashchynski pushed a commit to plashchynski/cpython that referenced this issue Mar 17, 2025
@vstinner
Copy link
Member

The change commit fd545d7 broke Linux TraceRefs buildbot: https://buildbot.python.org/#/builders/484/builds/6705

commit fd545d735d5f9c048f99767c700f72853a9b7acd (HEAD)
Author: Mark Shannon <mark@hotpy.org>
Date:   Mon Mar 17 17:23:50 2025 +0000

    GH-127705: Move mortal decrefs to internal header and make sure _PyReftracerTrack is called

@vstinner
Copy link
Member

The change commit fd545d7 broke Linux TraceRefs buildbot: https://buildbot.python.org/#/builders/484/builds/6705

Fixed by 684a759.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

5 participants