gh-119127: functools.partial placeholders by dg-pb · Pull Request #119827 · python/cpython
As I already had an implementation, I thought a PR might be helpful for others to see and evaluate.
From all the different extensions of functools.partial I think this one is the best. It is relatively simple and exposes all missing functionality. Other partial extensions that I have seen lack functionality and would not provide complete argument ordering capabilities and/or are too complicated in relation to what they offer.
Implementation can be summarised as follows:
a) Trailing placeholders are not allowed. (Makes things simpler)
b) Raises an exception if not all placeholders are filled on call.
c) Retains the optimization benefits of applying partial to other partial instances.
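The three rules above can be illustrated with a small pure-Python sketch. This is not the PR's implementation: SketchPartial and the PH sentinel are hypothetical stand-ins invented for this example (the PR itself exposes functools.Placeholder), and only the flattening check at the end uses the real functools.partial.

```python
import functools
import operator

PH = object()  # hypothetical sentinel standing in for functools.Placeholder


class SketchPartial:
    """Illustrative model of rules (a) and (b); not the PR's code."""

    def __init__(self, func, *args, **kwargs):
        # Rule (a): a trailing placeholder is rejected at construction time.
        if args and args[-1] is PH:
            raise TypeError("trailing placeholders are not allowed")
        self.func = func
        self.args = args
        self.kwargs = kwargs
        self.n_placeholders = sum(arg is PH for arg in args)

    def __call__(self, *call_args, **call_kwargs):
        # Rule (b): every placeholder must be filled by a positional argument.
        if len(call_args) < self.n_placeholders:
            raise TypeError("not enough positional arguments to fill placeholders")
        fill = iter(call_args)
        merged = [next(fill) if arg is PH else arg for arg in self.args]
        merged.extend(fill)  # leftover call arguments are appended as usual
        return self.func(*merged, **{**self.kwargs, **call_kwargs})


sub_one = SketchPartial(operator.sub, PH, 1)  # exposes the first argument
print(sub_one(10))  # calls operator.sub(10, 1)

# Rule (c), shown with the real functools.partial: wrapping a partial in
# another partial flattens into a single object around the original function.
inner = functools.partial(operator.sub, 10)
outer = functools.partial(inner, 1)
print(outer.func is operator.sub, outer.args)
```

The sketch resolves placeholders with a Python-level loop on every call, which is exactly the kind of overhead a C implementation can avoid.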
The performance penalty compared to the current functools.partial is minimal for the extension class: about 20-30 ns extra for initialisation and less than 4 ns extra per call, with or without placeholders.
To put it simply, the new functionality extends functools.partial so that it has the flexibility of the lambda / def approach (in terms of argument ordering), but with a call overhead that is 2x smaller.
The way I see it, this could only be justified if the extension provides completeness and no new functionality is going to be needed in the near future. I have thought about it and tried various alternatives, and I think there is a good chance that this is the case. Personally, I don't think I would ever need anything more from the partial class.
Current implementation functions reliably.
Benchmark
There is nothing new here in terms of performance. The performance after this PR will be (almost) the same as the performance of partial until now. Placeholders only provide flexibility for taking advantage of performance benefits where it is important.
So far I have identified 2 such cases:
- More flexible predicate construction for functions in the operator module. This allows for new strategies in making performant iterator recipes.
- Partializing the input target function. Examples of this are optimizers and similar, i.e. cases where the function will be called over and over within the routine with a number of arguments, but the input target function needs partial substitution for positionals and keywords.
Good example of this is scipy.optimize.minimize.
Its signature is: scipy.optimize.minimize(fun, x0, args=(), ...)
Note, it does not have kwds. Why? I don't know. But good reason for it could be:
fun = lambda x: f(x, **kwds)
will need to expand **kwds on every call (even if it is empty), while partial will make the most optimal call. (see benchmarks below). So the minimize function can leave out kwds given there is a good way to source callable with already substituted keywords.
This extension allows pre-substituting both positionals and keywords. This lets the optimizer signature leave out both kwds and args, resulting in a simpler interface, scipy.optimize.minimize(fun, x0, ...), and slightly better performance. Function calls are at the center of such problems after all.
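As a rough illustration of the simpler interface described above, the sketch below uses a toy optimizer whose signature takes only fun and x0. The names toy_minimize and objective are hypothetical and invented for this example; they are not SciPy APIs. The placeholder variant is guarded because functools.Placeholder is only present on interpreters that include this feature.

```python
from functools import partial

try:
    from functools import Placeholder  # present only where this feature exists
except ImportError:
    Placeholder = None


def objective(x, scale, offset=0.0):
    """Hypothetical target function: a parabola with its minimum at `offset`."""
    return scale * (x - offset) ** 2


def toy_minimize(fun, x0, step=1.0, tol=1e-6):
    """Hypothetical 1-D optimizer taking no `args` or `kwds` parameters."""
    x, fx = x0, fun(x0)
    while step > tol:
        for candidate in (x - step, x + step):
            f_candidate = fun(candidate)
            if f_candidate < fx:
                x, fx = candidate, f_candidate
                break
        else:
            step /= 2  # neither neighbour improved: refine the step
    return x


# Keywords are bound once here instead of being expanded on every call.
bound_by_keyword = partial(objective, scale=2.0, offset=1.0)
x_min = toy_minimize(bound_by_keyword, x0=0.0)

if Placeholder is not None:
    # Binding a positional that comes after the free argument needs a placeholder.
    bound_by_position = partial(objective, Placeholder, 2.0, offset=1.0)
    x_min_ph = toy_minimize(bound_by_position, x0=0.0)
```

In both variants the optimizer only ever calls fun(x), so it does not need to know how the remaining arguments were supplied.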
Benchmark Results for __call__
Code for Cases
import functools as ftl
import operator as opr

# Shared inputs
dct = {'a': 1}
kwds = {'c': 1, 'd': 2}
kwds_empty = {}
args1 = (1,)
args3 = (1, 2, 4)

# operator-based cases
opr_sub = opr.sub
opr_contains = opr.contains
opr_sub_lambda = lambda b: opr_sub(1, b)
opr_sub_partial = ftl.partial(opr_sub, 1)
opr_contains_lambda = lambda b: opr_contains(dct, b)
opr_contains_partial = ftl.partial(opr_contains, dct)

# Pure-Python target functions
def pos2(a, b):
    pass

def pos6(a, b, c, d, e, f):
    pass

def pos2kw2(a, b, c=1, d=2):
    pass

pos2_lambda = lambda b: pos2(1, b)
pos2_partial = ftl.partial(pos2, 1)
pos6_lambda = lambda b, c, d: pos6(1, 2, 3, b, c, d)
pos6_partial = ftl.partial(pos6, 1, 2, 3)
pos2kw2_kw_lambda = lambda b: pos2kw2(1, b, **kwds)
pos2kw2_kw_partial = ftl.partial(pos2kw2, 1, **kwds)
pos2kw2_kwe_lambda = lambda b: pos2kw2(1, b, **kwds_empty)
pos2kw2_kwe_partial = ftl.partial(pos2kw2, 1, **kwds_empty)

# Placeholder versions
from functools import Placeholder as PH

opr_sub_partial_ph = ftl.partial(opr_sub, PH, 1)
opr_contains_partial_ph = ftl.partial(opr_contains, PH, 'a')
pos2_partial_ph = ftl.partial(pos2, PH, 1)
pos6_partial_ph = ftl.partial(pos6, PH, 2, PH, 4, PH, 6)
pos2kw2_kw_partial_ph = ftl.partial(pos2kw2, PH, 1, **kwds)
pos2kw2_kwe_partial_ph = ftl.partial(pos2kw2, PH, 1, **kwds_empty)
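The timing harness behind the result tables is not included in this description. One plausible way to produce per-call "mean ± spread" figures in nanoseconds is timeit.repeat. In the sketch below, the helper name bench_ns and the use of the standard deviation as the spread are assumptions, not details taken from the PR.

```python
import functools
import operator
import statistics
import timeit


def bench_ns(func, arg, repeats=5, number=100_000):
    """Return (mean, stdev) of the per-call time of func(arg), in ns."""
    totals = timeit.repeat(
        "func(arg)",
        globals={"func": func, "arg": arg},
        repeat=repeats,
        number=number,
    )
    per_call_ns = [total / number * 1e9 for total in totals]
    return statistics.mean(per_call_ns), statistics.stdev(per_call_ns)


sub_lambda = lambda b: operator.sub(1, b)
sub_partial = functools.partial(operator.sub, 1)

for name, func in [("lambda", sub_lambda), ("partial", sub_partial)]:
    mean, spread = bench_ns(func, 2)
    print(f"{name:8s} {mean:6.0f} ± {spread:.0f} ns")
```

Absolute numbers from such a harness depend on the machine and the interpreter build, so only relative comparisons within one run are meaningful.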
CPython Results
C Implementation
----------------
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ 5 repeats, 1,000,000 times ┃
┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
┃ Units: ns lambda partial ┃
┃ ┏━━━━━━━━━━━━━━━━━━━┫
┃ opr_sub ┃ 50 ± 4 40 ± 2 ┃
┃ opr_contains ┃ 53 ± 3 43 ± 3 ┃
┃ pos2 ┃ 50 ± 1 64 ± 1 ┃
┃ pos2(*args1) ┃ 69 ± 5 73 ± 5 ┃
┃ pos6 ┃ 58 ± 1 103 ± 5 ┃
┃ pos6(*args3) ┃ 77 ± 3 99 ± 5 ┃
┃ pos2kw2_kw ┃ 240 ± 4 259 ± 7 ┃
┃ pos2kw2_kwe ┃ 134 ± 6 69 ± 3 ┃
┗━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━━┛
With Placeholders
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ 5 repeats, 1,000,000 times ┃
┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
┃ Units: ns lambda partial Placeholders ┃
┃ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
┃ opr_sub ┃ 50 ± 2 39 ± 1 44 ± 4 ┃
┃ opr_contains ┃ 61 ± 2 44 ± 2 49 ± 2 ┃
┃ pos2 ┃ 54 ± 2 58 ± 3 64 ± 2 ┃
┃ pos2(*args1) ┃ 67 ± 3 72 ± 9 69 ± 3 ┃
┃ pos6 ┃ 63 ± 3 102 ± 3 99 ± 2 ┃
┃ pos6(*args3) ┃ 75 ± 3 101 ± 2 94 ± 4 ┃
┃ pos2kw2_kw ┃ 242 ± 7 259 ± 10 260 ± 7 ┃
┃ pos2kw2_kwe ┃ 131 ± 4 64 ± 1 69 ± 2 ┃
┗━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
Python Implementation
---------------------
Current
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ 5 repeats, 1,000,000 times ┃
┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
┃ Units: ns lambda partial ┃
┃ ┏━━━━━━━━━━━━━━━━━━━━━┫
┃ opr_sub ┃ 48 ± 1 373 ± 13 ┃
┃ opr_contains ┃ 51 ± 1 377 ± 12 ┃
┃ pos2 ┃ 51 ± 4 378 ± 5 ┃
┃ pos2(*args1) ┃ 63 ± 5 354 ± 7 ┃
┃ pos6 ┃ 59 ± 1 437 ± 5 ┃
┃ pos6(*args3) ┃ 75 ± 2 410 ± 7 ┃
┃ pos2kw2_kw ┃ 239 ± 4 517 ± 5 ┃
┃ pos2kw2_kwe ┃ 133 ± 3 408 ± 49 ┃
┗━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━━━━┛
With Placeholders
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ 5 repeats, 1,000,000 times ┃
┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
┃ Units: ns lambda partial Placeholders ┃
┃ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
┃ opr_sub ┃ 49 ± 1 392 ± 13 547 ± 12 ┃
┃ opr_contains ┃ 54 ± 2 393 ± 9 605 ± 78 ┃
┃ pos2 ┃ 55 ± 9 398 ± 7 544 ± 5 ┃
┃ pos2(*args1) ┃ 66 ± 2 373 ± 5 533 ± 8 ┃
┃ pos6 ┃ 58 ± 5 462 ± 4 652 ± 3 ┃
┃ pos6(*args3) ┃ 74 ± 2 428 ± 11 635 ± 9 ┃
┃ pos2kw2_kw ┃ 240 ± 5 533 ± 4 696 ± 10 ┃
┃ pos2kw2_kwe ┃ 134 ± 2 406 ± 4 555 ± 3 ┃
┗━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
PyPy Results
PyPy
----
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ 5 repeats, 10,000 times ┃
┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
┃ Units: ns lambda partial ┃
┃ ┏━━━━━━━━━━━━━━━━━━━━━┫
┃ opr_sub ┃ 122 ± 15 266 ± 70 ┃
┃ opr_contains ┃ 147 ± 7 248 ± 64 ┃
┃ pos2 ┃ 114 ± 17 204 ± 49 ┃
┃ pos2(*args1) ┃ 156 ± 24 202 ± 28 ┃
┃ pos6 ┃ 124 ± 14 268 ± 39 ┃
┃ pos6(*args3) ┃ 147 ± 36 225 ± 21 ┃
┃ pos2kw2_kw ┃ 259 ± 17 436 ± 66 ┃
┃ pos2kw2_kwe ┃ 180 ± 14 243 ± 43 ┃
┗━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━━━━┛
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ 5 repeats, 1,000,000 times ┃
┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
┃ Units: ns lambda partial ┃
┃ ┏━━━━━━━━━━━━━━━━━━┫
┃ opr_sub ┃ 1 ± 0 3 ± 1 ┃
┃ opr_contains ┃ 13 ± 0 16 ± 2 ┃
┃ pos2 ┃ 1 ± 0 3 ± 1 ┃
┃ pos2(*args1) ┃ 2 ± 0 2 ± 0 ┃
┃ pos6 ┃ 1 ± 0 2 ± 0 ┃
┃ pos6(*args3) ┃ 2 ± 0 2 ± 0 ┃
┃ pos2kw2_kw ┃ 42 ± 1 72 ± 2 ┃
┃ pos2kw2_kwe ┃ 2 ± 0 2 ± 0 ┃
┗━━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━┛
PyPy Placeholder
----------------
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ 5 repeats, 10,000 times ┃
┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
┃ Units: ns lambda partial Placeholders ┃
┃ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
┃ opr_sub ┃ 114 ± 5 256 ± 82 719 ± 170 ┃
┃ opr_contains ┃ 142 ± 7 538 ± 536 787 ± 145 ┃
┃ pos2 ┃ 125 ± 19 239 ± 54 679 ± 116 ┃
┃ pos2(*args1) ┃ 130 ± 30 199 ± 17 638 ± 48 ┃
┃ pos6 ┃ 115 ± 16 237 ± 43 785 ± 176 ┃
┃ pos6(*args3) ┃ 138 ± 25 214 ± 14 703 ± 19 ┃
┃ pos2kw2_kw ┃ 260 ± 24 382 ± 67 850 ± 92 ┃
┃ pos2kw2_kwe ┃ 179 ± 28 223 ± 44 661 ± 32 ┃
┗━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ 5 repeats, 1,000,000 times ┃
┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
┃ Units: ns lambda partial Placeholders ┃
┃ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
┃ opr_sub ┃ 1 ± 0 3 ± 1 156 ± 4 ┃
┃ opr_contains ┃ 13 ± 0 15 ± 1 173 ± 3 ┃
┃ pos2 ┃ 2 ± 0 3 ± 1 154 ± 7 ┃
┃ pos2(*args1) ┃ 2 ± 0 2 ± 0 148 ± 3 ┃
┃ pos6 ┃ 2 ± 0 3 ± 1 200 ± 2 ┃
┃ pos6(*args3) ┃ 2 ± 0 3 ± 0 217 ± 39 ┃
┃ pos2kw2_kw ┃ 43 ± 1 71 ± 1 240 ± 2 ┃
┃ pos2kw2_kwe ┃ 2 ± 0 2 ± 0 149 ± 2 ┃
┗━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
Setup:
- The first 2 columns are identical calls, one using lambda and the other partial.
- The 3rd column uses a placeholder to expose the 1st argument as opposed to the 2nd (or different places for the 6-arg case).
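To make the column setup concrete, the sketch below shows which argument each variant leaves open for the opr_sub case. The import is guarded because functools.Placeholder exists only on interpreters that include this feature; the results dictionary is purely illustrative.

```python
import operator
from functools import partial

try:
    from functools import Placeholder  # present only where this feature exists
except ImportError:
    Placeholder = None

# Columns 1 and 2: the first argument is fixed to 1, the second stays open.
sub_lambda = lambda b: operator.sub(1, b)
sub_partial = partial(operator.sub, 1)

results = {
    "lambda": sub_lambda(10),    # operator.sub(1, 10)
    "partial": sub_partial(10),  # operator.sub(1, 10)
}

if Placeholder is not None:
    # Column 3: the second argument is fixed to 1, the first stays open.
    sub_placeholder = partial(operator.sub, Placeholder, 1)
    results["placeholder"] = sub_placeholder(10)  # operator.sub(10, 1)

print(results)
```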
CPython:
- There is negligible impact on __call__. Run times of the current version and the new version with Placeholders are very close.
- Run times are not impacted by placeholder usage in any significant way.
- pos2kw2_kwe (empty kwds) is much faster with the partial call than with lambda. pos2kw2_kw (non-empty kwds) is currently slower; however, "gh-119109: functool.partial vectorcall supports pto->kw & fallback to tp_call removed" (#120783) will likely improve its speed so that it outperforms lambda.
PyPy:
- Usage of Placeholders results in very poor performance. However, this has no material implication, as lambda is more performant than partial in all cases and is the optimal choice.
Benchmark Results for __new__
INIT='import functools as ftl; g = lambda a, b, c, d, e, f: (a, b, c, d, e, f);'
# CURRENT
$PY_MAIN -m timeit -s "$INIT" 'ftl.partial(g, 0, 1, 2)'              # 160
# PLACEHOLDERS
INIT2="$INIT PH=ftl.Placeholder;"
$PY_MAIN -m timeit -s "$INIT2" 'ftl.partial(g, 0, 1, 2)'             # 170
$PY_MAIN -m timeit -s "$INIT2" 'ftl.partial(g, 0, 1, 2, 3, 4, 5)'    # 190
$PY_MAIN -m timeit -s "$INIT2" 'ftl.partial(g, PH, 1, PH, 3, PH, 5)' # 200
- There is a small performance decrease for initialization without placeholders.
- Initializing it with placeholders is slower for the same number of arguments (excluding placeholders).
- But it is not much slower if placeholders are counted as arguments.
To sum up
This extension:
- extends the current performance benefits of partial to a few more important (at least from my POV) cases.
- seems to allow for certain simplifications by bringing it more in line with lambda / def behaviour. This allows partial to be used for partialmethod application, which in turn allows some simplifications in handling these in other parts of the library, i.e. inspect.
📚 Documentation preview 📚: https://cpython-previews--119827.org.readthedocs.build/