Message 356070 - Python tracker

Message356070

Author jaketesler
Recipients jaketesler, vstinner
Date 2019-11-05.22:39:45
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1572993586.01.0.941321705815.issue38707@roundup.psfhosted.org>
In-reply-to
Content
I have encountered a minor bug with the new `threading.get_native_id()` featureset in Python 3.8. The bug occurs when creating a new multiprocessing.Process object on Unix (or on any platform where the multiprocessing start_method is 'fork' or 'forkserver').

When creating a new process via fork, the Native ID in the new MainThread is incorrect. The new forked process' threading.MainThread object inherits the Native ID from the parent process' MainThread instead of capturing/updating its own (new) Native ID.

See the following snippet:

>>> import threading, multiprocessing
>>> multiprocessing.set_start_method('fork') # or 'forkserver'
>>> def proc(): print(threading.get_native_id(), threading.main_thread().native_id) # get_native_id(), mainthread.native_id
>>> proc()
22605 22605 # get_native_id(), mainthread.native_id
>>> p = multiprocessing.Process(target=proc)
>>> p.start()
22648 22605 # get_native_id(), mainthread.native_id
>>>
>>> def update(): threading.main_thread()._set_native_id()
>>> def print_and_update(): proc(); update(); proc()
>>> print_and_update()
22605 22605 # get_native_id(), mainthread.native_id
22605 22605 
>>> p2=multiprocessing.Process(target=print_and_update); p2.start()
22724 22605 # get_native_id(), mainthread.native_id
22724 22724
>>> print_and_update()
22605 22605 # get_native_id(), mainthread.native_id
22605 22605

As you can see, the new Process object's MainThread.native_id attribute matches that of the MainThread of its parent process. 

Unfortunately, I'm not too familiar with the underlying mechanisms that Multiprocessing uses to create forked processes. 
I believe this behavior occurs because (AFAIK) a forked multiprocessing.Process copies the MainThread object from its parent process, rather than reinitializing a new one. Looking further into the multiprocessing code, it appears the right spot to fix this would be in the multiprocessing.Process.bootstrap() function. 

I've created a branch containing a working fix - I'm also open to suggestions of how a fix might otherwise be implemented. 
If it looks correct I'll create a PR against the CPython 3.8 branch. 

See the branch here: https://github.com/jaketesler/cpython/tree/fix-mp-native-id

Thanks all!
-Jake
History
Date User Action Args
2019-11-05 22:39:46jaketeslersetrecipients: + jaketesler, vstinner
2019-11-05 22:39:46jaketeslersetmessageid: <1572993586.01.0.941321705815.issue38707@roundup.psfhosted.org>
2019-11-05 22:39:45jaketeslerlinkissue38707 messages
2019-11-05 22:39:45jaketeslercreate