gh-131492: gh-131461: handle exceptions in GzipFile constructor while owning resources #131462

Merged
vstinner merged 20 commits into python:main from fix-resource-warning-in-gzipfile-ctor
Mar 20, 2025

Conversation

graingert
Contributor

@graingert graingert commented Mar 19, 2025

@graingert graingert marked this pull request as ready for review March 19, 2025 14:28
@graingert graingert requested a review from vstinner March 19, 2025 14:38
@vstinner
Member

@cmaloney: Would you mind reviewing this fix?

Member

@vstinner vstinner left a comment

Would it be possible to write a test with such a broken file object?

@graingert graingert requested a review from ethanfurman as a code owner March 19, 2025 16:31
@graingert
Contributor Author

Would it be possible to write a test with such a broken file object?

There's already a test in test_tarfile: test_open_nonwritable_fileobj. The failure will be enforced by #128973.

I added check_no_resource_warning to make the assert hold until then.

graingert and others added 2 commits March 19, 2025 17:10
Co-authored-by: Victor Stinner <vstinner@python.org>
Member

@vstinner vstinner left a comment

LGTM

@vstinner
Member

The CI fails:

TypeError: check_no_resource_warning() missing 1 required positional argument: 'testcase'

Co-authored-by: Victor Stinner <vstinner@python.org>
@graingert
Contributor Author

The CI fails:

TypeError: check_no_resource_warning() missing 1 required positional argument: 'testcase'

🤦

    self.fileobj = fileobj
    try:
        if fileobj is None:
            fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
Contributor Author

@graingert graingert Mar 19, 2025

I pushed a more thorough change inspired by cpython/Lib/ssl.py, lines 1014 to 1015 in a4832f6:

    # Now SSLSocket is responsible for closing the file descriptor.
    try:

Previously it was possible for self.myfileobj to be left open if any part of the constructor failed.
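
(Illustrative sketch, not code from the PR.) The pattern described in this comment could look roughly like the following: once the constructor has opened a file itself, the rest of initialization runs inside try/except, and the owned file is closed before the exception propagates. `OwningWrapper`, its `opener` and `fail` parameters, and the `_close` helper are hypothetical names chosen for illustration; this is not the actual GzipFile implementation.

```python
import builtins
import io
import os
import tempfile


class OwningWrapper:
    """Hypothetical wrapper that may open (and then own) its underlying file."""

    def __init__(self, filename=None, mode="rb", fileobj=None,
                 opener=builtins.open, fail=False):
        self.fileobj = None
        self.myfileobj = None
        try:
            if fileobj is None:
                # From here on, this object is responsible for closing the file.
                fileobj = self.myfileobj = opener(filename, mode)
            if fail:
                # Stand-in for any later constructor step that can raise.
                raise OSError("simulated constructor failure")
            self.fileobj = fileobj
        except BaseException:
            # Close only what we opened; a caller-supplied fileobj stays open.
            self._close()
            raise

    def _close(self):
        self.fileobj = None
        myfileobj, self.myfileobj = self.myfileobj, None
        if myfileobj is not None:
            myfileobj.close()
```

With this shape the owned file is closed by the time the caller sees the exception, regardless of how long the exception object or its traceback is kept around. A file object passed in by the caller is left open, since the caller still holds a reference and can close it.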

Contributor

Rather than the try/except wrapping, why not add a flag that is set to False at the beginning of the constructor and True at the end, and check that flag in __del__ (which is what emits the resource warning)? If the object isn't fully constructed, then don't emit the resource warning.

In the I/O stack (GzipFile inherits from IOBase), IOBase.__del__ always calls close() if self.closed is False, so there shouldn't be a need to manually add a special case here. GzipFile.closed is:

    @property
    def closed(self):
        return self.fileobj is None

(so if there is a fileobj set, close should always be called).
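
(Illustrative sketch, not code from the PR.) The finalizer behaviour described in this comment can be observed on CPython with a small stand-in class. `FakeGzip` and its `log` argument are hypothetical; only the `closed` property is copied from the snippet above.

```python
import gc
import io


class FakeGzip(io.IOBase):
    """Hypothetical stand-in that mirrors only GzipFile's `closed` property."""

    def __init__(self, fileobj, log):
        self.log = log
        self.fileobj = fileobj

    @property
    def closed(self):
        return self.fileobj is None

    def close(self):
        # Record the call so the finalizer's behaviour is visible.
        self.log.append("close called")
        self.fileobj = None


log = []
obj = FakeGzip(io.BytesIO(), log)
del obj        # drop the last reference without calling close() ourselves
gc.collect()   # not required on CPython for this case; harmless elsewhere
```

After the object is finalized, `log` contains one `"close called"` entry, because the inherited IOBase finalizer saw `closed` evaluate to False and called `close()`.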

Contributor Author

This wouldn't help with closing myfileobj.

Contributor

How about using self.fileobj is not None as the "fully initialized" flag (it is already always set to None, and only set to non-None at the end of __init__), and checking for self.fileobj is None and self.myfileobj is not None in __del__?

It feels a lot like this is adding more cases / code around closing. I'm trying to find a way to make this change smaller / more minimal to reduce risk. IO stack construction, dealloc warnings, and close + destruction order are already really intricate and hard to understand in full.

Contributor Author

@graingert graingert Mar 19, 2025

We can't rely on self.myfileobj being closed in __del__, and we don't return an instance of GzipFile for the caller to close. So self.myfileobj needs to be closed at the end of __init__ if there's an exception, which is why we need the try/except.

Contributor Author

Right, but this is closing objects that are inaccessible to the caller.

Contributor

The GzipFile / self is equally inaccessible, and is cleaned up by its __del__ / finalizer, tp_finalize (or tp_dealloc). __del__ is guaranteed to be called and is able to release any held resources (such as self.myfileobj); see PEP 442.

I'm okay with adding a cleanup for the case where self.fileobj is None and self.myfileobj is not None inside __del__. That matches the behavior of other io objects and is a much smaller diff.
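
(Illustrative sketch, not code from the PR.) The finalizer-based alternative proposed in this comment might look roughly like this. `DelCleanup` and its `raw` and `fail` parameters are hypothetical, and the check in `__del__` is one possible reading of "self.fileobj is None and self.myfileobj is not None".

```python
import gc
import io


class DelCleanup:
    """Hypothetical class using the finalizer-based cleanup proposed above."""

    def __init__(self, raw, fail=False):
        self.fileobj = None      # stays None until construction succeeds
        self.myfileobj = raw     # resource this object now owns
        if fail:
            raise OSError("simulated constructor failure")
        self.fileobj = raw

    def __del__(self):
        # Partially constructed: fileobj was never set, but we own myfileobj.
        if getattr(self, "fileobj", None) is None:
            myfileobj = getattr(self, "myfileobj", None)
            if myfileobj is not None:
                myfileobj.close()


raw = io.BytesIO()
try:
    DelCleanup(raw, fail=True)
except OSError:
    pass          # the exception and its traceback are dropped after this block
gc.collect()
closed_after_except = raw.closed
```

In this sketch the cleanup happens when the half-built object is finalized, so its timing depends on when the last reference to that object goes away rather than on when the constructor fails.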

Contributor Author

This is not correct: you cannot rely on __del__ being called when an exception is raised, because the exception traceback holds a reference to self.

Contributor

@cmaloney cmaloney Mar 20, 2025

That statement seems to contradict the Predictability section of PEP 442 to me...

Following this scheme, an object’s finalizer is always called exactly once, even if it was resurrected afterwards.
For CI objects, the order in which finalizers are called (step 2 above) is undefined.

Even if there is an exception traceback that keeps the reference to self, eventually that traceback will go out of scope and the file will be closed / not leaked...
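
(Illustrative sketch, not code from the PR.) The timing question in this exchange can be observed on CPython with a small experiment. `FailsInInit` is a hypothetical class whose constructor always raises; the sketch only shows when its finalizer runs relative to the lifetime of the saved exception.

```python
import gc


class FailsInInit:
    finalized = 0

    def __init__(self):
        raise RuntimeError("simulated constructor failure")

    def __del__(self):
        FailsInInit.finalized += 1


try:
    FailsInInit()
except RuntimeError as exc:
    saved = exc   # traceback -> __init__ frame -> self stays reachable

gc.collect()
finalized_while_saved = FailsInInit.finalized

del saved         # release the exception and, with it, the traceback
gc.collect()
finalized_after_release = FailsInInit.finalized
```

While the exception is kept alive, the finalizer has not run; once the exception and its traceback are released, it runs. Both observations above are visible here: the finalizer is eventually called, but not at the moment the constructor fails.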

Contributor

I'll defer to @vstinner and @serhiy-storchaka for approval. From my perspective, this is a much more complex bugfix than required, and it doesn't match the stated description or the arguments around self.myfileobj. I'm happy to do quick fixes, and can produce a much smaller patch here with less behavior change if desired.

@graingert graingert changed the title gh-131461: fix ResourceWarning when writing a unwritable gzipfile gh-131492: gh-131461: fix ResourceWarning when writing a unwritable gzipfile Mar 20, 2025
@graingert graingert added needs backport to 3.12 bug and security fixes needs backport to 3.13 bugs and security fixes labels Mar 20, 2025
@graingert graingert changed the title gh-131492: gh-131461: fix ResourceWarning when writing a unwritable gzipfile gh-131492: gh-131461: handle exceptions in GzipFile constructor while owning resources Mar 20, 2025
Member

@serhiy-storchaka serhiy-storchaka left a comment

LGTM.

I would add support.gc_collect() in tests.
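
(Illustrative sketch, not code from the PR.) The reason for forcing a collection inside such a test is that a ResourceWarning emitted from a finalizer only gets recorded if the finalizer runs while warnings are still being captured. The sketch below uses only `warnings.catch_warnings` and `gc.collect()` from the standard library as stand-ins for the test helpers mentioned in this thread. `BadFile` and `resource_warnings_from_failed_open` are hypothetical names, and whether the returned list is empty depends on the Python version in use.

```python
import gc
import gzip
import io
import warnings


class BadFile(io.BytesIO):
    """Hypothetical file object whose writes always fail."""

    def write(self, data):
        raise OSError("simulated write failure")


def resource_warnings_from_failed_open():
    f = BadFile()
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always", ResourceWarning)
        try:
            # Writing the gzip header fails, so the constructor raises.
            gzip.GzipFile(fileobj=f, mode="wb")
        except OSError:
            pass
        # Force pending finalizers to run while warnings are still recorded.
        gc.collect()
    leaked = [w for w in caught if issubclass(w.category, ResourceWarning)]
    return f, leaked
```

A test built on this would assert that `leaked` is empty and that the caller-supplied `f` is still open.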

graingert and others added 2 commits March 20, 2025 16:40
…aC2cA.rst

Co-authored-by: Victor Stinner <vstinner@python.org>
…aC2cA.rst

Co-authored-by: Victor Stinner <vstinner@python.org>
@vstinner vstinner enabled auto-merge (squash) March 20, 2025 16:49
Member

@vstinner vstinner left a comment

LGTM

@vstinner vstinner merged commit ce79274 into python:main Mar 20, 2025
39 checks passed
@miss-islington-app

Thanks @graingert for the PR, and @vstinner for merging it 🌮🎉.. I'm working now to backport this PR to: 3.12, 3.13.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Mar 20, 2025
…ructor while owning resources (pythonGH-131462)

(cherry picked from commit ce79274)

Co-authored-by: Thomas Grainger <tagrain@gmail.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
@bedevere-app

bedevere-app bot commented Mar 20, 2025

GH-131518 is a backport of this pull request to the 3.13 branch.

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Mar 20, 2025
…ructor while owning resources (pythonGH-131462)

(cherry picked from commit ce79274)

Co-authored-by: Thomas Grainger <tagrain@gmail.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
@bedevere-app bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label Mar 20, 2025
@bedevere-app

bedevere-app bot commented Mar 20, 2025

GH-131519 is a backport of this pull request to the 3.12 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.12 bug and security fixes label Mar 20, 2025
@vstinner
Member

Hum, the backports mention ResourceWarning, whereas this ResourceWarning only exists in Python 3.14. If we want to backport this change to 3.13 and 3.12, maybe references to ResourceWarning should be removed?

What do you think @graingert?

@graingert
Contributor Author

graingert commented Mar 20, 2025

Oh yeah, nix the 3.14-specific bits of the changelog from the backports.

@graingert graingert deleted the fix-resource-warning-in-gzipfile-ctor branch March 20, 2025 17:35
@graingert
Contributor Author

I've removed the references to a ResourceWarning in the backports

vstinner added a commit that referenced this pull request Mar 21, 2025
…r while owning resources (GH-131462) (#131518)

(cherry picked from commit ce79274)

Co-authored-by: Thomas Grainger <tagrain@gmail.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
vstinner added a commit that referenced this pull request Mar 21, 2025
…r while owning resources (GH-131462) (#131519)

(cherry picked from commit ce79274)

Co-authored-by: Thomas Grainger <tagrain@gmail.com>
Co-authored-by: Victor Stinner <vstinner@python.org>