Issue34134
Created on 2018-07-17 04:03 by Windson Yang, last changed 2022-04-11 14:59 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| test.py | Windson Yang, 2018-07-17 04:03 | |||
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | Edit |
| PR 8324 | merged | Windson Yang, 2018-07-18 10:03 | |
| PR 11673 | merged | miss-islington, 2019-01-25 12:02 | |
| PR 11674 | closed | miss-islington, 2019-01-25 12:04 | |
| Messages (14) | |||
|---|---|---|---|
| msg321788 - (view) | Author: Windson Yang (Windson Yang) * | Date: 2018-07-17 04:03 | |
I'm using macOS and I see huge memory usage when using a generator with multiprocessing (see the attached file). I think this is because of the following code (https://github.com/python/cpython/blob/master/Lib/multiprocessing/pool.py#L383):
    if not hasattr(iterable, '__len__'):
        iterable = list(iterable)
    if chunksize is None:
        chunksize, extra = divmod(len(iterable), len(self._pool) * 4)
        if extra:
            chunksize += 1
When we convert an iterable with list(iterable), we lose the advantage of using a generator. I'm not sure how to fix it. Maybe we can set a default value for objects that don't have a '__len__' attribute. Any ideas? |
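The behaviour described in this message can be reproduced in isolation. The following is a minimal sketch that mirrors the quoted snippet; the helper names `materialize_if_unsized` and `default_chunksize` are hypothetical and are not part of `multiprocessing`.

```python
def materialize_if_unsized(iterable):
    """Mimic the quoted check: anything without __len__ is copied into a list."""
    if not hasattr(iterable, "__len__"):
        # A generator ends up here, so every item is held in memory at once.
        iterable = list(iterable)
    return iterable


def default_chunksize(n_items, n_workers):
    """Mimic the quoted default: roughly four chunks per worker, rounded up."""
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize


if __name__ == "__main__":
    items = materialize_if_unsized(i for i in range(1000))
    print(type(items).__name__, default_chunksize(len(items), 4))
```

The point of the sketch is that the default chunk size needs a length, which is why an unsized iterable is fully consumed before any work is dispatched.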
|||
| msg321791 - (view) | Author: Inada Naoki (methane) * | Date: 2018-07-17 05:16 | |
Did you try imap or imap_unordered? They are intended for use with iterators, including generators. |
|||
| msg321796 - (view) | Author: Windson Yang (Windson Yang) * | Date: 2018-07-17 06:43 | |
Thank you for the hint, INADA. I think we should add something like "if you are using a generator, consider using imap instead" in https://docs.python.org/3.4/library/multiprocessing.html?highlight=process#multiprocessing.pool.Pool.map |
|||
| msg321797 - (view) | Author: Inada Naoki (methane) * | Date: 2018-07-17 06:56 | |
> I think we should add something like "if you are using a generator, consider using imap instead"
I don't think that's a good hint. There are short generators, and there are long (or infinite) iterators other than generators too. Maybe: "if the iterable is not a sequence (e.g. a generator) and can be very big, consider using `imap` or `imap_unordered` with an explicit `chunksize` option for better efficiency." But I'm not good at writing English. Someone else can write a better paragraph. |
|||
| msg321800 - (view) | Author: Windson Yang (Windson Yang) * | Date: 2018-07-17 07:30 | |
Thank you, I will try to make a pull request and let others edit it. |
|||
| msg321836 - (view) | Author: Xiang Zhang (xiang.zhang) * | Date: 2018-07-17 15:02 | |
One thing worth trying here may be to replace `len` with `operator.length_hint`. But I am not sure it's a good idea; I'm just mentioning it here. |
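To illustrate what this suggestion might look like, here is a rough sketch that uses `operator.length_hint` to pick a chunk size without consuming the iterable. The function `estimated_chunksize` and its `fallback` parameter are hypothetical; this is not what `multiprocessing` does, and plain generators report no hint, so the fallback is what they would get.

```python
import operator


def estimated_chunksize(iterable, n_workers, fallback=1):
    """Guess a chunk size from a length hint instead of calling len()."""
    # length_hint returns the default (0 here) when no estimate is available,
    # which is the case for ordinary generators.
    hint = operator.length_hint(iterable, 0)
    if hint == 0:
        return fallback
    chunksize, extra = divmod(hint, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize


if __name__ == "__main__":
    print(estimated_chunksize(iter(range(100)), 4))         # iterator with a hint
    print(estimated_chunksize((i for i in range(100)), 4))  # generator, no hint
```

A hint of 0 is ambiguous (unknown or genuinely empty), which is one reason this approach would need care before being relied on.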
|||
| msg321837 - (view) | Author: Windson Yang (Windson Yang) * | Date: 2018-07-17 15:12 | |
Thank you, Xiang Zhang. I found that the code keeps hanging when I use imap. I will try to figure it out tomorrow. |
|||
| msg321856 - (view) | Author: Windson Yang (Windson Yang) * | Date: 2018-07-18 03:52 | |
The code didn't work with imap because imap creates a generator, so we can't access the result outside the with statement:
    with Pool(os.cpu_count()) as p:
        result = p.imap(clean_up, k, 50)
    # Iterating here, after the with block has exited, is what hangs.
    for r in result:
        print(r)
In https://docs.python.org/3.4/library/multiprocessing.html?highlight=process#using-a-pool-of-workers I found the correct example. I'm not sure whether we should add an example or a warning to the imap documentation.
|
|||
| msg321857 - (view) | Author: Xiang Zhang (xiang.zhang) * | Date: 2018-07-18 04:29 | |
Why access the result outside of the with block? The pool is terminated on exiting the block, before the work is done. |
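For comparison, a minimal sketch of consuming `imap` results while the pool is still alive is shown below. The names `square`, `numbers`, and `sum_squares` are illustrative stand-ins, not the `clean_up` function or data from the attached test.py, and the chunk size of 50 is only an example value.

```python
import multiprocessing


def square(x):
    return x * x


def numbers(n):
    """A generator standing in for a long, lazily produced input."""
    for i in range(n):
        yield i


def sum_squares(pool, n, chunksize=50):
    """Consume imap results while the caller's pool is still running."""
    total = 0
    # imap returns a lazy iterator, so it must be drained before the pool
    # is terminated, i.e. inside the caller's with block.
    for result in pool.imap(square, numbers(n), chunksize):
        total += result
    return total


if __name__ == "__main__":
    with multiprocessing.Pool() as pool:
        # The loop over the results runs inside the with block.
        print(sum_squares(pool, 1000))
```

Keeping the loop inside the `with` block means the workers are still available when each chunk of results is requested.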
|||
| msg321864 - (view) | Author: Windson Yang (Windson Yang) * | Date: 2018-07-18 07:43 | |
Yes, we should not. But we can do this when using the map function. The documentation gives a good example but doesn't say much about the real differences between map and imap. Maybe we should add some notes like INADA suggested: the map function converts the iterable to a list if it doesn't implement __len__, so if you are using a generator, you should consider using imap. We could also add a warning not to access the result outside the with statement. But if you think the docs are good enough, please close this issue. |
|||
| msg321867 - (view) | Author: Xiang Zhang (xiang.zhang) * | Date: 2018-07-18 08:15 | |
I'm +1 for INADA's change, but not for adding more examples that try to distinguish every detailed difference. |
|||
| msg334354 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2019-01-25 12:01 | |
New changeset 3bab40db96efda2e127ef84e6501fda0cdc4f5b8 by Antoine Pitrou (Windson yang) in branch 'master': bpo-34134: Advise to use imap or imap_unordered when handling long iterables. (gh-8324) https://github.com/python/cpython/commit/3bab40db96efda2e127ef84e6501fda0cdc4f5b8 |
|||
| msg334356 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2019-01-25 12:08 | |
New changeset c2674bf11036af1e06c1be739f0eebcc72dfbf7a by Antoine Pitrou (Miss Islington (bot)) in branch '3.7': bpo-34134: Advise to use imap or imap_unordered when handling long iterables. (gh-8324) (gh-11673) https://github.com/python/cpython/commit/c2674bf11036af1e06c1be739f0eebcc72dfbf7a |
|||
| msg334357 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2019-01-25 12:17 | |
This is basically fixed, except that I'll let the Release Manager choose whether 3.6 gets the fix as well. Thanks! |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:59:03 | admin | set | github: 78315 |
| 2019-01-25 12:17:21 | pitrou | set | status: open -> closed versions: + Python 3.7 messages: + msg334357 resolution: fixed |
| 2019-01-25 12:08:16 | pitrou | set | messages: + msg334356 |
| 2019-01-25 12:04:44 | miss-islington | set | pull_requests: + pull_request11491 |
| 2019-01-25 12:04:35 | miss-islington | set | pull_requests: + pull_request11490 |
| 2019-01-25 12:04:26 | miss-islington | set | pull_requests: + pull_request11489 |
| 2019-01-25 12:02:20 | miss-islington | set | pull_requests: + pull_request11488 |
| 2019-01-25 12:02:12 | miss-islington | set | pull_requests: + pull_request11487 |
| 2019-01-25 12:01:45 | pitrou | set | messages: + msg334354 |
| 2018-07-21 00:35:18 | terry.reedy | set | nosy: + pitrou |
| 2018-07-18 10:03:16 | Windson Yang | set | keywords: + patch stage: patch review pull_requests: + pull_request7859 |
| 2018-07-18 08:15:01 | xiang.zhang | set | messages: + msg321867 |
| 2018-07-18 07:43:49 | Windson Yang | set | messages: + msg321864 |
| 2018-07-18 04:29:03 | xiang.zhang | set | messages: + msg321857 |
| 2018-07-18 03:52:33 | Windson Yang | set | messages: + msg321856 |
| 2018-07-17 15:12:55 | Windson Yang | set | messages: + msg321837 |
| 2018-07-17 15:02:52 | xiang.zhang | set | nosy: + xiang.zhang messages: + msg321836 |
| 2018-07-17 07:30:46 | Windson Yang | set | messages: + msg321800 |
| 2018-07-17 06:56:04 | methane | set | messages: + msg321797 |
| 2018-07-17 06:43:59 | Windson Yang | set | messages: + msg321796 |
| 2018-07-17 05:16:04 | methane | set | nosy: + methane messages: + msg321791 |
| 2018-07-17 04:03:50 | Windson Yang | create | |
