Issue26692
Created on 2016-04-05 00:46 by Satrajit Ghosh, last changed 2022-04-11 14:58 by admin.
| Messages (4) | | | |
|---|---|---|---|
| msg262881 - (view) | Author: Satrajit Ghosh (Satrajit Ghosh) | Date: 2016-04-05 00:46 | |
multiprocessing's cpu_count() returns the number of CPUs on the system as reported by /proc/cpuinfo. This is true even on machines where Linux kernel cgroups are being used to restrict CPU usage for a given process, which results in significant thread switching on systems with many cores. Some ideas for handling cgroups have been implemented in the following repos: https://github.com/peo3/cgroup-utils http://cpachecker.googlecode.com/svn-history/r12889/trunk/scripts/benchmark/runexecutor.py It would be nice if multiprocessing were a little more intelligent and queried the process's own characteristics. |
|||
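The cgroup CPU quota the reporter mentions can be read directly from the controller files. The sketch below is only a best-effort illustration, not part of multiprocessing: the paths are the standard cgroup v2 and v1 interfaces, and it assumes the process's cgroup is mounted at /sys/fs/cgroup (as it typically is inside a container); whether a quota is actually set depends on the system.

```python
import os

def cgroup_cpu_limit():
    """Best-effort read of the CPU quota for this process's cgroup.

    Returns a whole number of CPUs, or None if no quota applies or the
    cgroup files are not readable.
    """
    # cgroup v2: a single "cpu.max" file containing "<quota> <period>".
    try:
        with open("/sys/fs/cgroup/cpu.max") as f:
            quota, period = f.read().split()
        if quota != "max":
            return max(1, int(quota) // int(period))
    except OSError:
        pass
    # cgroup v1: separate quota/period files under the cpu controller.
    try:
        with open("/sys/fs/cgroup/cpu/cpu.cfs_quota_us") as f:
            quota = int(f.read())
        with open("/sys/fs/cgroup/cpu/cpu.cfs_period_us") as f:
            period = int(f.read())
        if quota > 0:
            return max(1, quota // period)
    except OSError:
        pass
    return None

limit = cgroup_cpu_limit()
workers = limit if limit is not None else os.cpu_count()
```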
| msg298893 - (view) | Author: Charles-François Natali (neologix) * | Date: 2017-07-23 07:56 | |
I'm not convinced. The reason is that using the number of CPU cores is just a heuristic for a *default value*: the API allows the user to specify the number of workers to use, so it's not really a limitation. The problem is that if you try to think about a more "correct" default value, it gets complicated: here, it's about cgroups, but for example: - What if there are multiple processes running on the same box? - What if the process is subject to CPU affinity? Currently, the CPU affinity mask is ignored. - What if the code being executed by the children is itself multi-threaded (maybe because it's using a numerical library built on BLAS, etc.)? - What about hyper-threading? If the code has a lot of cache misses, it would probably be a good idea to use one worker per logical thread, but if it's cache-friendly, probably not. - Etc. In other words, I think there is simply no reasonable default value for the number of workers to use, that any value will make some class of users/use-cases unhappy, and that it would add a lot of unnecessary complexity. Since the user can always specify the number of workers - if you find a place where it's not possible, then please report it - I really think we should leave the choice/burden up to the user. |
|||
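One point in the message above, that the CPU affinity mask is ignored, is easy to observe on Linux, and the suggested workaround of passing an explicit worker count is equally simple. A small illustrative snippet (os.sched_getaffinity is Linux-only):

```python
import os
import multiprocessing

if __name__ == "__main__":
    # os.cpu_count() reports every CPU in the machine, while the affinity
    # mask reflects the CPUs this process may actually run on.
    total = os.cpu_count()
    usable = len(os.sched_getaffinity(0))
    print(f"affinity allows {usable} of {total} CPUs")

    # The pool size can always be chosen explicitly instead of relying on
    # the default, which is what the message above recommends.
    with multiprocessing.Pool(processes=usable) as pool:
        print(pool.map(abs, range(-4, 4)))
```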
| msg298901 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2017-07-23 12:22 | |
Agreed that it is not possible for multiprocessing to choose an optimal default in all settings. However, making the default adequate for more use cases sounds like a reasonable goal. Currently, we are using `os.cpu_count()`. Ideally, we would have a second API `os.usable_cpu_count()` that would return the number of logical CPUs usable by the current process (taking into account affinity settings, cgroups, etc.). |
|||
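A usable_cpu_count() along these lines could combine the existing pieces. The sketch below is only an illustration of the proposal, not an existing os API; the function name and its composition are assumptions.

```python
import os

def usable_cpu_count():
    """Illustrative sketch of the proposed API: the number of logical CPUs
    the current process can actually use, rather than the number installed.
    """
    counts = [os.cpu_count() or 1]
    # Respect the CPU affinity mask where the platform exposes it (Linux).
    if hasattr(os, "sched_getaffinity"):
        counts.append(len(os.sched_getaffinity(0)))
    # A cgroup quota (see the sketch further up) could be folded in the
    # same way before taking the minimum.
    return min(counts)
```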
| msg310113 - (view) | Author: David Chin (hairygristle) | Date: 2018-01-16 20:06 | |
I would like to state strong support for os.get_usable_cpu_count(). I administer a typical HPC cluster, which may have multiple jobs scheduled on the same physical server. The fact that multiprocessing ignores cgroups leads to bad oversubscription. |
|||
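On a cluster like the one described, the per-job CPU allocation is usually exported by the scheduler, so the oversubscription can be avoided today by sizing the pool explicitly. The snippet below assumes Slurm's SLURM_CPUS_PER_TASK variable; other schedulers use different names.

```python
import os
import multiprocessing

if __name__ == "__main__":
    # Use the scheduler-allocated CPU count when available, falling back to
    # the whole machine only when running outside a batch job.
    allocated = int(os.environ.get("SLURM_CPUS_PER_TASK", os.cpu_count() or 1))

    with multiprocessing.Pool(processes=allocated) as pool:
        results = pool.map(str, range(16))
    print(allocated, results[:4])
```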
| History | | | |
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:29 | admin | set | github: 70879 |
| 2018-01-16 20:06:33 | hairygristle | set | nosy: + hairygristle; messages: + msg310113 |
| 2017-11-07 23:02:08 | mihaic | set | nosy: + mihaic |
| 2017-09-05 03:31:34 | giampaolo.rodola | set | nosy: + giampaolo.rodola |
| 2017-07-23 12:22:53 | pitrou | set | messages: + msg298901 |
| 2017-07-23 07:56:19 | neologix | set | messages: + msg298893 |
| 2017-07-22 21:59:53 | pitrou | set | stage: needs patch; type: behavior -> enhancement; versions: + Python 3.7, - Python 3.6 |
| 2017-07-22 21:59:46 | pitrou | set | nosy: + pitrou, neologix |
| 2016-04-05 06:18:39 | SilentGhost | set | nosy: + jnoller, sbt; versions: + Python 3.6 |
| 2016-04-05 00:46:11 | Satrajit Ghosh | create | |
