I agree that a heuristic is needed to decide when a dict should be compacted.
> * When (dict size < dk_size/8), call insertion_resize()
In bpo-31179, I suggested that Yury use a 2/3 ratio... to avoid integer overflow :-) He first used 80%, but I dislike using the FPU in dictobject.c. I'm not sure of the cost of switching from integers to floats, and more generally I hate rounding issues, so I prefer to use regular integers ;-)
+ (3) if 'mp' is non-compact ('del' operation does not resize dicts),
+ do fast-copy only if it has at most 1/3 non-used keys.
+
+ The last condition (3) is important to guard against a pathological
+ case when a large dict is almost emptied with multiple del/pop
+ operations and copied after that. In cases like this, we defer to
+ PyDict_Merge, which produces a compacted copy.
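For illustration, here is a minimal sketch of how both ratios can be checked with integers only. The helper names `is_dense_enough` and `should_compact` are hypothetical and are not taken from dictobject.c; `used`, `nentries` and `dk_size` are assumed to be the number of live items, the number of entry slots consumed, and the size of the hash table.

```c
#include <stddef.h>

/* Hypothetical helper: condition (3) from the patch comment.
   Returns 1 when at most about 1/3 of the consumed entry slots are unused,
   i.e. when used >= 2/3 of nentries. Integer division only, no FPU.
   Assumption: nentries is small enough that nentries * 2 does not overflow. */
static inline int
is_dense_enough(ptrdiff_t used, ptrdiff_t nentries)
{
    return used >= (nentries * 2) / 3;
}

/* Hypothetical helper: the quoted "dict size < dk_size/8" rule.
   Returns 1 when the dict is sparse enough that compacting looks worthwhile. */
static inline int
should_compact(ptrdiff_t used, ptrdiff_t dk_size)
{
    return used < dk_size / 8;
}
```

Integer division truncates, so with 100 consumed slots the threshold is 66 live items rather than 66.67. That is the kind of rounding I would rather have explicit in integer code than hidden in a float comparison.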
By the way, if dict compacts itself automatically, do we still need Yury's test "is the dict compact"?