We need unambiguous, meaningful naming of stack reference operations dependent on how counting is done #130789
Comments
We also need names for the use and lifetime of references.

#130708 introduces the concept of references that depend, for correctness, on another reference outliving them. With that in mind, we can use the term "deferred" for the embedded reference count when it relies on the GC to prevent premature reclamation, and "scoped" for the embedded reference count when it depends on the scope of the reference to prevent premature reclamation.

Note: there are no "deferred references", only deferred reference counts.

We should never use the term "borrowed", as that is used for

The term "scoped" was suggested by @nascheme.
Names for the kinds of counts: The problem with "embedded" and "immediate" is that they do not form a pair, making them harder to remember. Maybe "external" and "internal" are more appropriate:
Thanks for pushing for consistency in how we talk about these concepts. I think that sticking with existing names will be clearer than introducing new ones:

A stackref whose lifetime must not exceed another stackref's lifetime is borrowed and is created by calling

A stackref that does not update the reference count of the referenced object on creation/destruction defers the reference count update. Such stackrefs are deferred. The use of "defer" here is consistent with our use in "deferred reference counting," in that the application of the reference count updates has been deferred.

A minimal interface using these names might look like:

// Create a new stackref
_PyStackRef PyStackRef_FromPyObjectNew(PyObject *obj);
// Create a copy of a stackref
_PyStackRef PyStackRef_DUP(_PyStackRef stackref);
// Retrieve the PyObject * without changing the reference count on the object
PyObject *PyStackRef_AsPyObjectBorrow(_PyStackRef stackref);
// Create a stackref whose lifetime must not exceed that of `stackref`
_PyStackRef PyStackRef_Borrow(_PyStackRef stackref);
// Has the reference count update on the referenced object been deferred?
bool PyStackRef_IsDeferred(_PyStackRef stackref);
// A deferred stackref is heap safe only if the object itself uses deferred
// reference counting or is immortal; any other stackref is always heap safe
bool PyStackRef_IsHeapSafe(_PyStackRef stackref) {
    if (PyStackRef_IsDeferred(stackref)) {
        PyObject *obj = PyStackRef_AsPyObjectBorrow(stackref);
        return PyObject_HasDeferredRefcount(obj) || _Py_IsImmortal(obj);
    }
    return true;
}
// Return a stackref that is safe to store in the heap
_PyStackRef PyStackRef_MakeHeapSafe(_PyStackRef stackref) {
    if (PyStackRef_IsHeapSafe(stackref)) {
        return stackref;
    }
    return PyStackRef_FromPyObjectNew(PyStackRef_AsPyObjectBorrow(stackref));
}
void PyStackRef_CLOSE(_PyStackRef stackref);

Tagging a few other folks who have expressed opinions (sorry!): @nascheme @brandtbucher @Yhg1s @colesbury
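The rule that a borrowed stackref must not outlive the stackref it was borrowed from can be checked mechanically in a debug build. The following is a rough sketch of that idea only. `ToyObject`, `ToyRef`, and the `toy_` functions are hypothetical names made up for this illustration, they take pointers rather than values to keep the bookkeeping simple, and none of this is how `Py_STACKREF_DEBUG` is actually implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object with only a reference count. */
typedef struct { long refcount; } ToyObject;

/* Hypothetical debug-style reference that tracks its live borrows. */
typedef struct ToyRef {
    ToyObject *obj;
    struct ToyRef *lender;  /* NULL for an owning reference */
    int live_borrows;       /* borrows taken from this reference and not yet closed */
} ToyRef;

/* Create an owning reference: updates the object's count. */
static ToyRef toy_new(ToyObject *obj) {
    obj->refcount++;
    return (ToyRef){ obj, NULL, 0 };
}

/* Create a borrowed reference: the object's count is left untouched. */
static ToyRef toy_borrow(ToyRef *lender) {
    assert(lender->obj != NULL);  /* cannot borrow from a closed reference */
    lender->live_borrows++;
    return (ToyRef){ lender->obj, lender, 0 };
}

/* Close a reference. Returns false, and changes nothing, if a borrow taken
 * from this reference is still live, i.e. the borrow would outlive it. */
static bool toy_close(ToyRef *ref) {
    if (ref->live_borrows != 0) {
        return false;
    }
    if (ref->lender != NULL) {
        ref->lender->live_borrows--;
    } else {
        ref->obj->refcount--;
    }
    ref->obj = NULL;
    return true;
}
```

In this sketch, closing the owner before its borrow is reported as an error, while closing the borrow first and then the owner succeeds and restores the original count.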
FWIW, coming at this as someone who wasn't exposed to most of the new interpreter design until recently, "borrow" and "borrowed references" are immediately clear to me, and they behave exactly as I expected. "Deferred references" took a bit of understanding of how the bookkeeping was handled, but the concept was pretty apparent from the name. ("Immediate" references are fine as well, although my immediate reaction was "why does that need a name, that's just references".)

"Embedded" and "virtual" references are meaningless to me. We already use "embedded" in other contexts in Python, and "virtual" is so overloaded in computing in general that it's not a good term for anything anymore. But all things considered, they're not worse than any other non-obvious name.

"Borrowed" and "deferred" definitely have my vote.
faster-cpython/ideas#700 describes three ways to count references:

Virtual references are references that are known to the relevant code generator to exist, but are elided at runtime, so no API is needed for them.

Embedded references are marked by bit(s) in the reference and not in the `ob_refcount` field of the object.

Immediate references are counted in the `ob_refcount` (or free-threading equivalent) field of the object.

To this we should add uncounted references, which are references to immortal objects (including `NULL`).

Note that it is possible to have embedded or immediate references to immortal objects if the object was mortal when the reference, or the reference it was created from, was created.
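To make the distinction between the kinds of count concrete, here is a toy model in C. It is a sketch under stated assumptions, not CPython's stackref layout: `ToyObject`, `ToyRef`, `TOY_TAG_EMBEDDED`, and every `toy_` function are hypothetical names invented for illustration. Virtual references have no runtime representation, so they do not appear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical object header with only the fields this sketch needs. */
typedef struct {
    long refcount;  /* stands in for ob_refcount */
    bool immortal;  /* stands in for an immortality check */
} ToyObject;

/* Hypothetical reference: an object pointer plus one low tag bit. */
typedef struct { uintptr_t bits; } ToyRef;

#define TOY_TAG_EMBEDDED ((uintptr_t)1)

static ToyObject *toy_object(ToyRef ref) {
    return (ToyObject *)(ref.bits & ~TOY_TAG_EMBEDDED);
}

/* Immediate: the count is recorded in the object's refcount field. */
static ToyRef toy_new_immediate(ToyObject *obj) {
    obj->refcount++;
    return (ToyRef){ (uintptr_t)obj };
}

/* Embedded: the count is recorded as a bit in the reference itself. */
static ToyRef toy_new_embedded(ToyObject *obj) {
    return (ToyRef){ (uintptr_t)obj | TOY_TAG_EMBEDDED };
}

/* Uncounted: only valid for NULL or for objects already known to be immortal. */
static ToyRef toy_new_uncounted(ToyObject *obj) {
    assert(obj == NULL || obj->immortal);
    return (ToyRef){ (uintptr_t)obj };
}

/* Closing a reference only touches the object for a live immediate count. */
static void toy_close(ToyRef ref) {
    ToyObject *obj = toy_object(ref);
    if (obj == NULL || obj->immortal) {
        return;  /* uncounted, or the object became immortal after creation */
    }
    if (ref.bits & TOY_TAG_EMBEDDED) {
        return;  /* embedded: nothing was recorded in the object */
    }
    obj->refcount--;  /* immediate */
}
```

The immortality check in `toy_close` also covers the case described in the note above: a reference created as embedded or immediate while the object was mortal is still closed correctly once the object has become immortal.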
Why this matters
The use of references must be understandable without referring to the implementation. We have multiple implementations of stackrefs, so the interface needs to be clear.
Multiple implementations
Even when we merge the free-threading and default implementations of stackrefs, we will still have the `Py_STACKREF_DEBUG` implementation, which is very different and vital to finding reference errors.

Examples:

When creating an embedded stackref from another stackref, we should use `PyStackRef_DUP_Embedded`, which has the same semantics as `PyStackRef_DUP` but creates an embedded reference if the implementation supports it.

There are circumstances when a method of counting is not safe. E.g. using embedded references in the heap is not safe. For that we will want to physically transform a reference without a logical change in ownership.
E.g. `PyStackRef_ToNonEmbedded`. In terms of ownership, this is a no-op: `PyStackRef_ToNonEmbedded(ref)` is equivalent to `ref`, but ensures that any embedded count is turned into an immediate count.

We should probably only use "uncounted" when referring to references in docs and comments. As we already have `PyStackRef_FromPyObjectImmortal`, there is no need for `PyStackRef_FromPyObjectUncounted` as well.
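The two proposed operations, `PyStackRef_DUP_Embedded` and `PyStackRef_ToNonEmbedded`, can be sketched on a toy tagged-pointer model. This is only an illustration of the intended semantics: `ToyObject`, `ToyRef`, `toy_supports_embedded`, and the `toy_` functions are hypothetical, references are assumed to be non-NULL, and the real implementations may differ substantially.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical object and tagged reference, as in a toy interpreter. */
typedef struct { long refcount; } ToyObject;
typedef struct { uintptr_t bits; } ToyRef;

#define TOY_TAG_EMBEDDED ((uintptr_t)1)

/* Assumption: whether this toy implementation supports embedded counts. */
static const bool toy_supports_embedded = true;

static ToyObject *toy_object(ToyRef ref) {
    return (ToyObject *)(ref.bits & ~TOY_TAG_EMBEDDED);
}

static bool toy_is_embedded(ToyRef ref) {
    return (ref.bits & TOY_TAG_EMBEDDED) != 0;
}

/* Plain copy: an immediate reference bumps the object's count. */
static ToyRef toy_dup(ToyRef ref) {
    if (!toy_is_embedded(ref)) {
        toy_object(ref)->refcount++;
    }
    return ref;
}

/* Same ownership semantics as toy_dup, but embedded where supported. */
static ToyRef toy_dup_embedded(ToyRef ref) {
    if (toy_supports_embedded) {
        return (ToyRef){ ref.bits | TOY_TAG_EMBEDDED };
    }
    return toy_dup(ref);
}

/* Ownership no-op: turns an embedded count into an immediate count. */
static ToyRef toy_to_non_embedded(ToyRef ref) {
    if (!toy_is_embedded(ref)) {
        return ref;
    }
    ToyObject *obj = toy_object(ref);
    obj->refcount++;
    return (ToyRef){ (uintptr_t)obj };
}

/* Release a reference; only an immediate count is stored in the object. */
static void toy_close(ToyRef ref) {
    if (!toy_is_embedded(ref)) {
        toy_object(ref)->refcount--;
    }
}
```

In both cases the caller holds exactly one reference before and after the call and closes it once. Only the place where the count is recorded changes, which is what makes the converted reference safe to store in the heap in this model.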