multiprocessing: the Resource Tracker process is never reaped #88887
The multiprocessing.resource_tracker instance is never reaped, leaving zombie processes. There is a waitpid() call for the ResourceTracker's pid, but it lives in a private method _stop() which seems to be called only from some test modules. Usually the environment has something that handles zombies, but if Python is the "main" process in a container, for example, and runs another Python instance that leaks a ResourceTracker process, zombies start to accumulate.

This is easily reproducible with a couple of small Python programs, as long as they are not run from a shell or another parent process that takes care of forgotten children. It was originally discovered in a Docker container that has a Python program as its entry point (a Celery worker in an Airflow container) running other Python programs (dbt). The minimal code is available on GitHub here: https://github.com/viktorvia/python-multi-issue

The attached multi.py leaks resource tracker processes, but just running it from a full-fledged development environment will not show the issue. Instead, run it via another Python program from a Docker container:

Dockerfile:

```dockerfile
WORKDIR /usr/src/multi
COPY . ./
CMD ["python", "main.py"]
```

main.py:

```python
from subprocess import run
from time import sleep

while True:
    result = run(["python", "multi.py"], capture_output=True)
    print(result.stdout.decode('utf-8'))
    result = run(["ps", "-ef", "--forest"], capture_output=True)
    print(result.stdout.decode('utf-8'), flush=True)
    sleep(1)
```

When the program is run it accumulates one zombie on each iteration:

```
$ docker run -it multi python main.py
[1, 4, 9]
UID  PID  PPID  C  STIME  TTY  TIME  CMD
[1, 4, 9]
UID  PID  PPID  C  STIME  TTY  TIME  CMD
[1, 4, 9]
UID  PID  PPID  C  STIME  TTY  TIME  CMD
[1, 4, 9]
UID  PID  PPID  C  STIME  TTY  TIME  CMD
```

Running it from a shell script, or from another Python program that handles SIGCHLD by calling wait(), takes care of the zombies.
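multi.py itself is only in the linked repository, so here is a minimal sketch of what it could look like (the names and the use of a Pool are assumptions; any multiprocessing use that starts the resource tracker reproduces the leak). With the "spawn" start method, creating a Pool is enough, because the named semaphores backing it are registered with the tracker, and the output matches the `[1, 4, 9]` lines above:

```python
# Hypothetical multi.py: a minimal multiprocessing program whose resource
# tracker child is never waitpid()'d by the interpreter on exit.
import multiprocessing as mp

def square(x):
    return x * x

if __name__ == "__main__":
    # "spawn" guarantees the resource tracker process is started
    # (the Pool's named semaphores are registered with it).
    mp.set_start_method("spawn")
    with mp.Pool(2) as pool:
        print(pool.map(square, [1, 2, 3]))
    # On exit, nothing reaps the tracker's pid, so it is left as a zombie
    # for a parent that never calls wait().
```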
The same happens for …
We've worked around this issue by installing a SIGCHLD handler in our code:

```python
import os
import signal
from types import FrameType
from typing import Optional

def reap_children(signum: int, frame: Optional[FrameType]) -> None:
    try:
        # -1 means wait for any child process, see https://linux.die.net/man/2/waitpid
        # WNOHANG makes waitpid() return immediately instead of blocking when
        # no child process is ready to be reaped, see
        # https://www.gnu.org/software/libc/manual/html_node/Process-Completion.html#index-WNOHANG
        while os.waitpid(-1, os.WNOHANG)[0] > 0:
            pass
    except ChildProcessError:
        pass

# SIGCHLD is sent to a process when a child process stops or terminates:
# https://man7.org/linux/man-pages/man7/signal.7.html
signal.signal(signal.SIGCHLD, reap_children)
```
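As the report notes, the waitpid() call already exists in the tracker's private _stop() method, so another workaround is to invoke it explicitly before the program exits. A sketch relying on that private API (it may change between Python versions):

```python
# Explicitly stop (and thereby reap) the resource tracker child.
# _resource_tracker and _stop() are CPython internals, not public API.
from multiprocessing import resource_tracker

resource_tracker._resource_tracker._stop()
```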
gh-88887: Cleanup `multiprocessing.resource_tracker.ResourceTracker` upon deletion (pythonGH-130429)
(cherry picked from commit f53e7de)
Co-authored-by: luccabb <32229669+luccabb@users.noreply.github.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Gregory P. Smith <greg@krypto.org>

gh-88887: Cleanup `multiprocessing.resource_tracker.ResourceTracker` upon deletion (GH-130429) (#131516)
(cherry picked from commit f53e7de)
Co-authored-by: luccabb <32229669+luccabb@users.noreply.github.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Fixed by f53e7de.
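Per the PR title, the fix cleans the tracker up upon deletion. Conceptually it amounts to something like the sketch below (an illustration of the idea, not the literal CPython patch):

```python
# Sketch: reap the tracker child when the ResourceTracker object is
# finalized, instead of relying on callers to invoke _stop() themselves.
import os

class ResourceTracker:
    def __init__(self):
        self._pid = None  # pid of the tracker child once it has been started

    def _stop(self):
        if self._pid is not None:
            os.waitpid(self._pid, 0)  # reap the child so it cannot linger as a zombie
            self._pid = None

    def __del__(self):
        # Runs at garbage collection / interpreter shutdown.
        self._stop()
```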
Linked PRs

- gh-88887: Cleanup `multiprocessing.resource_tracker.ResourceTracker` upon deletion (#130429)
- gh-88887: Cleanup `multiprocessing.resource_tracker.ResourceTracker` upon deletion (GH-130429) (#131516)
- gh-88887: Cleanup `multiprocessing.resource_tracker.ResourceTracker` upon deletion (GH-130429) (#131530)