Issue29202
Created on 2017-01-08 09:50 by rhettinger, last changed 2022-04-11 14:58 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| dict_shorter_iteration.diff | rhettinger, 2017-01-08 09:50 | First few cases | ||
| dict_shorter_iteration.diff | serhiy.storchaka, 2017-01-08 12:38 | Regenerated for review | review | |
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | Edit |
| PR 11900 | merged | cheryl.sabella, 2019-02-16 21:27 | |
| Messages (11) | |||
|---|---|---|---|
| msg284970 - (view) | Author: Raymond Hettinger (rhettinger) * ![]() |
Date: 2017-01-08 09:50 | |
I haven't had much of a chance to comb through all the code it detail, but there are least a few places that loop over the whole entry table (size is dk_nentries) when fewer iterations would suffice (stop once mp->ma_used entries have been seen). Since the table is usually dense packed to the left, this will give fewer loops and fewer memory accesses. |
|||
| msg284974 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * ![]() |
Date: 2017-01-08 10:40 | |
I'm wondering why new dict implementation don't keep the array of items compact? Original Raymond's implementation did not preserve the order after deletions, but saved items in an array without gaps. This could simplify and speed up an iteration (no need to check values for NULL, and needed to iterate fewer elements), and could get rid of reallocations in often mutated dicts. I haven't found clear explanation of this. |
|||
| msg284988 - (view) | Author: Inada Naoki (methane) * ![]() |
Date: 2017-01-08 12:24 | |
> I'm wondering why new dict implementation don't keep the array of items compact? Original Raymond's implementation did not preserve the order after deletions, but saved items in an array without gaps. This could simplify and speed up an iteration (no need to check values for NULL, and needed to iterate fewer elements), and could get rid of reallocations in often mutated dicts. I haven't found clear explanation of this. As far as I remember, I choose it because: 1. easy to explain, easy to discussion. "keep insertion order" is easier than "keep insertion order unless deletion". I want to start discussion based on most simple behavior. But my patch was reviewed by core developers in sprint, when right before 3.6b1. 2. I want to share same implementation with OrderedDict, like PyPy. 3. Make patch compact. If we compact dk_entries when deletion, we need another dk_indices compaction. Before sprint, my patch wasn't reviewed for months. I was afraid that my patch wasn't reviewed by 3.6b1. For now, "namespace dict keeps insertion order" is language spec. `del` shouldn't break insertion order. |
|||
| msg284989 - (view) | Author: Inada Naoki (methane) * ![]() |
Date: 2017-01-08 12:31 | |
patch LGTM. > Since the table is usually dense packed to the left, this will give fewer loops and fewer memory accesses. Since dk_nentries is "right end of active entries", this patch doesn't affect for "dence packed to the left" dict. But it can affect when dk_entries is sparce, and there are many deleted entries on the right end. |
|||
| msg284990 - (view) | Author: Inada Naoki (methane) * ![]() |
Date: 2017-01-08 12:45 | |
While this isn't bugfix, I'm +1 to commit this to 3.6 branch to reduce future maintenance cost. |
|||
| msg284991 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * ![]() |
Date: 2017-01-08 12:55 | |
In general the idea LGTM. It it is slightly dangerous if dict structure is broken. j is incremented only when value != NULL, and if ma_used is not correct, this can cause reading out of the end of an array. Of course this should never happen. But if there is a chance to add some assertions or additional checks in debug mode, it would be nice. Unless this complicate the code to much. The i counter is not used in some loops. It can be eliminated. |
|||
| msg304824 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * ![]() |
Date: 2017-10-23 17:06 | |
Do you mind to create a pull request Raymond? |
|||
| msg304855 - (view) | Author: Raymond Hettinger (rhettinger) * ![]() |
Date: 2017-10-24 03:20 | |
This doesn't meet our criteria for backports. |
|||
| msg335713 - (view) | Author: Cheryl Sabella (cheryl.sabella) * ![]() |
Date: 2019-02-16 20:13 | |
I can convert this to a pull request if there is still interest in the change. |
|||
| msg335716 - (view) | Author: Raymond Hettinger (rhettinger) * ![]() |
Date: 2019-02-16 20:46 | |
> I can convert this to a pull request if there is still interest in the change. Yes, please do. |
|||
| msg339488 - (view) | Author: Inada Naoki (methane) * ![]() |
Date: 2019-04-05 10:08 | |
New changeset f66e336f455b5a6bb0ca857d61c43be410d0df13 by Inada Naoki (Cheryl Sabella) in branch 'master': bpo-29202: improve dict iteration (GH-11900) https://github.com/python/cpython/commit/f66e336f455b5a6bb0ca857d61c43be410d0df13 |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:41 | admin | set | github: 73388 |
| 2019-04-05 10:09:00 | methane | set | status: open -> closed resolution: fixed stage: patch review -> resolved |
| 2019-04-05 10:08:52 | methane | set | messages: + msg339488 |
| 2019-02-16 21:27:18 | cheryl.sabella | set | pull_requests: + pull_request11927 |
| 2019-02-16 20:46:14 | rhettinger | set | messages: + msg335716 |
| 2019-02-16 20:13:01 | cheryl.sabella | set | nosy:
+ cheryl.sabella messages:
+ msg335713 |
| 2017-10-24 03:20:17 | rhettinger | set | assignee: rhettinger -> serhiy.storchaka messages: + msg304855 versions: - Python 3.6 |
| 2017-10-23 17:06:34 | serhiy.storchaka | set | messages: + msg304824 |
| 2017-01-08 12:55:55 | serhiy.storchaka | set | assignee: serhiy.storchaka -> rhettinger messages: + msg284991 stage: patch review |
| 2017-01-08 12:45:02 | methane | set | messages: + msg284990 |
| 2017-01-08 12:38:41 | serhiy.storchaka | set | files: + dict_shorter_iteration.diff |
| 2017-01-08 12:31:45 | methane | set | messages: + msg284989 |
| 2017-01-08 12:24:20 | methane | set | messages: + msg284988 |
| 2017-01-08 10:40:07 | serhiy.storchaka | set | messages: + msg284974 |
| 2017-01-08 09:50:48 | rhettinger | set | priority: normal -> low |
| 2017-01-08 09:50:36 | rhettinger | create | |
