Issue23530
Created on 2015-02-26 14:30 by jtaylor, last changed 2022-04-11 14:58 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| 0001-Issue-23530-Update-documentation-clarify-relation-of.patch | jtaylor, 2015-03-07 11:03 | review | ||
| Messages (16) | |||
|---|---|---|---|
| msg236671 - (view) | Author: Julian Taylor (jtaylor) | Date: 2015-02-26 14:30 | |
multiprocessing.cpu_count and os.cpu_count which are often used to determine how many processes one can run in parallel do not respect the cpuset which may limit the process to only a subset of online cpus leading to heavy oversubscription in e.g. containerized environments: $ taskset -c 0 python3.4 -c 'import multiprocessing; print(multiprocessing.cpu_count())' 32 $ taskset -c 0 python3.4 -c 'import os; print(os.cpu_count())' 32 While the correct result here should be 1. This requires programs to have to use less portable methods like forking to gnu nproc or having to read the /proc filesystem. Having a keyword argument to switch between online and available cpus would be fine too. |
|||
| msg236690 - (view) | Author: Davin Potts (davin) * ![]() |
Date: 2015-02-26 17:23 | |
Detecting the number of processors available to a process is a distinct concept from reporting the number of processors present on a system. cpu_count is currently focused on the latter. Functionality to report the number of effectively-available processors is indeed valuable information to have in certain situations but it is also something very OS-dependent (I mean the support for it is not present in all OSes, let alone the complication of how to access such information differing across OSes). I think a strong case can be made that cpu_count's current functionality should not change -- the information it provides is very important to have access to. Cool tools like psutil (https://pypi.python.org/pypi/psutil) help cover a great range of genuine needs when making use of processor affinity. |
|||
| msg236695 - (view) | Author: Julian Taylor (jtaylor) | Date: 2015-02-26 17:59 | |
I do agree that its probably safer to not change the default return value. But adding a option (or new function) would still be good, the number of available cpus is more often the number you actually want in practice. To the very least the documentation should be improved to clearly state that this number does not guarantee that this amount of cpus are actually available to run on and you should use psutils instead. Code for getting this information for the major operating systems linux, bsd and windows is available in gnu coreutils. I can possibly work on a patch if it would get accepted but I can only actually test it linux. |
|||
| msg236781 - (view) | Author: Davin Potts (davin) * ![]() |
Date: 2015-02-27 16:31 | |
Adding an option does sound like a better possibility. Still, when I start looking through the examples that psutil provides, it reminds me how this is but one small piece of a much larger picture which psutil has done a nice, focused job of working to address. If the patch you create were to depend upon gnu coreutils, I do not think it can be accepted for licensing reasons. Interestingly psutil does not appear to depend upon that library. Regarding the docs, what text would you propose instead of what's currently there for describing cpu_count? |
|||
| msg236805 - (view) | Author: Julian Taylor (jtaylor) | Date: 2015-02-27 17:40 | |
certainly for anything that needs good control over affinity psutils is the best choice, but I'm not arguing to implement full process control in python. I only want python to provide the number of cores one can work on to make best use of the available resources. If you code search python files for cpu_count you find on github 18000 uses, randomly sampling a few every single one was to determine the number of cpus to start worker jobs to get best performance. Every one of these will oversubscribe a host that restricts the cpus a process can use. This is an issue especially for the increasingly popular use of containers instead of full virtual machines. as a documentation update I would like to have a note saying that this number is the number of (online) cpus in the system may not be the number of of cpus the process can actually use. Maybe with a link to len(psutils.Process.get_affinity()) as a reference on how to obtain that number. there would be no dependence on coreutils, I just mentioned it as you can look up the OS api you need to use to get the number there (e.g. sched_getaffinity). It is trivial API use and should not be a licensing issue, one could also look at the code from psutil which most likely looks very similar. |
|||
| msg236827 - (view) | Author: Charles-François Natali (neologix) * ![]() |
Date: 2015-02-27 20:08 | |
Well, we already expose CPU affinity:
>>> import os
>>> os.sched_getaffinity(0)
{0}
IMO the current implementation is sufficient (and talking about
overcommitting for CPU is a bit moot if you're using virtual machine
anyways).
The current documentation says:
Return the number of CPUs in the system. Returns None if undetermined.
Which to me is clear enough, although if you want to add an explicit
note that this doesn't take cpu affinity into account that wouldn't
hurt.
|
|||
| msg236828 - (view) | Author: Julian Taylor (jtaylor) | Date: 2015-02-27 20:13 | |
oh thats great so python already has what I want. Then just an small documentation update would be good, I'll have a go at a patch later. |
|||
| msg236867 - (view) | Author: Eryk Sun (eryksun) * ![]() |
Date: 2015-02-28 06:00 | |
> Well, we already expose CPU affinity:
>
> >>> import os
> >>> os.sched_getaffinity(0)
> {0}
os.sched_getaffinity only exists on some POSIX systems, such as Linux. For Windows, here's a ctypes version of sched_getaffinity and sched_setaffinity:
import sys
from ctypes import *
from ctypes.wintypes import *
__all__ = ['sched_getaffinity', 'sched_setaffinity']
kernel32 = WinDLL('kernel32')
DWORD_PTR = WPARAM
PDWORD_PTR = POINTER(DWORD_PTR)
GetCurrentProcess = kernel32.GetCurrentProcess
GetCurrentProcess.restype = HANDLE
OpenProcess = kernel32.OpenProcess
OpenProcess.restype = HANDLE
OpenProcess.argtypes = (DWORD, # dwDesiredAccess,_In_
BOOL, # bInheritHandle,_In_
DWORD) # dwProcessId, _In_
GetProcessAffinityMask = kernel32.GetProcessAffinityMask
GetProcessAffinityMask.argtypes = (
HANDLE, # hProcess, _In_
PDWORD_PTR, # lpProcessAffinityMask, _Out_
PDWORD_PTR) # lpSystemAffinityMask, _Out_
SetProcessAffinityMask = kernel32.SetProcessAffinityMask
SetProcessAffinityMask.argtypes = (
HANDLE, # hProcess, _In_
DWORD_PTR) # dwProcessAffinityMask, _In_
PROCESS_SET_INFORMATION = 0x0200
PROCESS_QUERY_INFORMATION = 0x0400
PROCESS_QUERY_LIMITED_INFORMATION = 0x1000
if sys.getwindowsversion().major < 6:
PROCESS_QUERY_LIMITED_INFORMATION = PROCESS_QUERY_INFORMATION
def sched_getaffinity(pid):
if pid == 0:
hProcess = GetCurrentProcess()
else:
hProcess = OpenProcess(PROCESS_QUERY_LIMITED_INFORMATION,
False, pid)
if not hProcess:
raise WinError()
lpProcessAffinityMask = DWORD_PTR()
lpSystemAffinityMask = DWORD_PTR()
if not GetProcessAffinityMask(hProcess,
byref(lpProcessAffinityMask),
byref(lpSystemAffinityMask)):
raise WinError()
mask = lpProcessAffinityMask.value
return {c for c in range(sizeof(DWORD_PTR) * 8) if (1 << c) & mask}
def sched_setaffinity(pid, mask):
if pid == 0:
hProcess = GetCurrentProcess()
else:
hProcess = OpenProcess(PROCESS_SET_INFORMATION,
False, pid)
if not hProcess:
raise WinError()
bitmask = 0
for cpu in mask:
if not isinstance(cpu, int):
raise TypeError('expected an iterator of ints, but '
'iterator yielded %r' % type(cpu))
if cpu < 0:
raise ValueError('negative CPU number')
if cpu >= sizeof(DWORD_PTR) * 8:
raise ValueError('CPU number too large')
bitmask |= 1 << cpu
if not SetProcessAffinityMask(hProcess, bitmask):
raise WinError()
|
|||
| msg237436 - (view) | Author: Julian Taylor (jtaylor) | Date: 2015-03-07 11:03 | |
attached documentation update patch. |
|||
| msg246524 - (view) | Author: Julian Taylor (jtaylor) | Date: 2015-07-09 20:10 | |
any comments on the doc changes? |
|||
| msg246698 - (view) | Author: Roundup Robot (python-dev) ![]() |
Date: 2015-07-13 20:02 | |
New changeset b9e3838e9664 by Charles-François Natali in branch 'default': Issue #23530: Improve os.cpu_count() description. https://hg.python.org/cpython/rev/b9e3838e9664 |
|||
| msg246699 - (view) | Author: Charles-François Natali (neologix) * ![]() |
Date: 2015-07-13 20:41 | |
Committed. Julian, thanks for the patch! |
|||
| msg246700 - (view) | Author: Berker Peksag (berker.peksag) * ![]() |
Date: 2015-07-13 20:57 | |
It would be nice to backport the patch to the 3.4 and 3.5 branches. |
|||
| msg246773 - (view) | Author: Charles-François Natali (neologix) * ![]() |
Date: 2015-07-15 18:32 | |
It's just a doc improvement, I'm not convinced it really needs backporting. |
|||
| msg248503 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * ![]() |
Date: 2015-08-13 08:12 | |
Argument Clinic code was not regenerated. Actually the commit breaks Argument Clinic. $ make clinic ./python -E ./Tools/clinic/clinic.py --make Error in file "./Modules/posixmodule.c" on line 11211: Docstring for os.cpu_count does not have a summary line! Every non-blank function docstring must start with a single line summary followed by an empty line. make: *** [clinic] Error 255 |
|||
| msg248538 - (view) | Author: Roundup Robot (python-dev) ![]() |
Date: 2015-08-13 19:37 | |
New changeset 23c6cc5d5b8f by Charles-François Natali in branch 'default': Issue #23530: fix clinic comment. https://hg.python.org/cpython/rev/23c6cc5d5b8f |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:13 | admin | set | github: 67718 |
| 2017-09-05 06:32:11 | giampaolo.rodola | set | nosy:
+ giampaolo.rodola |
| 2016-02-10 22:52:55 | neologix | set | status: open -> closed stage: needs patch -> resolved |
| 2015-08-13 19:37:30 | python-dev | set | messages: + msg248538 |
| 2015-08-13 08:12:12 | serhiy.storchaka | set | status: closed -> open nosy:
+ serhiy.storchaka resolution: fixed -> |
| 2015-07-15 18:32:33 | neologix | set | messages: + msg246773 |
| 2015-07-13 20:57:38 | berker.peksag | set | nosy:
+ berker.peksag messages: + msg246700 |
| 2015-07-13 20:41:41 | neologix | set | status: open -> closed resolution: fixed messages: + msg246699 stage: resolved |
| 2015-07-13 20:02:19 | python-dev | set | nosy:
+ python-dev messages: + msg246698 |
| 2015-07-09 20:10:08 | jtaylor | set | messages: + msg246524 |
| 2015-03-07 11:03:07 | jtaylor | set | files:
+ 0001-Issue-23530-Update-documentation-clarify-relation-of.patch keywords: + patch messages: + msg237436 |
| 2015-02-28 06:00:51 | eryksun | set | nosy:
+ eryksun messages: + msg236867 |
| 2015-02-27 20:13:42 | jtaylor | set | messages: + msg236828 |
| 2015-02-27 20:08:49 | neologix | set | messages: + msg236827 |
| 2015-02-27 17:40:44 | jtaylor | set | messages: + msg236805 |
| 2015-02-27 16:31:04 | davin | set | messages: + msg236781 |
| 2015-02-26 17:59:17 | jtaylor | set | messages: + msg236695 |
| 2015-02-26 17:23:10 | davin | set | nosy:
+ davin messages: + msg236690 |
| 2015-02-26 14:40:30 | serhiy.storchaka | set | nosy:
+ neologix |
| 2015-02-26 14:30:54 | jtaylor | create | |

