[Python-ideas] PEP 3156 feedback: wait_one vs par vs concurrent.futures.wait

Guido van Rossum guido at python.org
Sat Dec 22 07:20:12 CET 2012
On Fri, Dec 21, 2012 at 9:17 PM, Guido van Rossum <guido at python.org> wrote:
> On Fri, Dec 21, 2012 at 8:46 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> I figure python-ideas is still the best place for PEP 3156 feedback -
>> I think it's being revised too heavily for in-depth discussion on
>> python-dev to be a good idea, and I think spinning out a separate list
>> would lose too many people that are
>> interested-but-not-enough-to-subscribe-to-yet-another-mailing-list
>> (including me).
>>
>> The current draft of the PEP suggests the use of par() for the barrier
>> operation (waiting for all futures and coroutines in a collection to
>> be ready), while tentatively suggesting wait_one() as the API for
>> waiting for the first completed operation in a collection. That
>> inconsistency is questionable all by itself, but there's a greater
>> stdlib-level inconsistency that I find more concerning.
>>
>> The corresponding blocking API in concurrent.futures is the module
>> level "wait" function, which accepts a "return_when" parameter, with
>> the permitted values FIRST_COMPLETED, FIRST_EXCEPTION and
>> ALL_COMPLETED (the default). In the case where everything succeeds,
>> FIRST_EXCEPTION is the same as ALL_COMPLETED. This function also
>> accepts a timeout which allows the operation to finish early if the
>> operations take too long.
>>
>> This flexibility also leads to a difference in the structure of the
>> return type: concurrent.futures.wait always returns a pair of sets,
>> with the first set being those futures which completed, while the
>> second contains those which remained incomplete at the time the call
>> returned.
>>
>> It seems to me that this "wait" API can be applied directly to the
>> equivalent problems in the async space, and, accordingly, *should* be
>> applied so that the synchronous and asynchronous APIs remain as
>> consistent as possible.
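
For reference, the blocking behaviour described above can be sketched
with the standard library alone. This is only an illustration: the
work() helper, the delays and the 0.2 second timeout are made up for
the example, and a thread pool is assumed as the executor.

```python
import concurrent.futures as cf
import time


def work(delay):
    """Hypothetical stand-in for a blocking operation."""
    time.sleep(delay)
    return delay


def wait_demo():
    with cf.ThreadPoolExecutor(max_workers=2) as pool:
        fast = pool.submit(work, 0.01)
        slow = pool.submit(work, 0.5)

        # The timeout lets wait() return early; the result is always a
        # pair of sets: (done, not_done).
        done, not_done = cf.wait([fast, slow], timeout=0.2)
        early = (fast in done, slow in not_done)

        # Without a timeout, ALL_COMPLETED (the default) leaves nothing pending.
        done, not_done = cf.wait([fast, slow])
        results = sorted(f.result() for f in done)
        return early, results, len(not_done)


early, results, pending = wait_demo()
print(early, results, pending)
```

Passing return_when=cf.FIRST_COMPLETED or cf.FIRST_EXCEPTION changes
only when the call returns, not the shape of the result.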
>
> You've convinced me. I've never used the wait() and as_completed()
> APIs in c.f, but you're right that with the exception of requiring
> 'yield from' they can be carried over exactly, and given that we're
> doing the same thing with Future, this is eminently reasonable.
>
> I may not get to implementing these for two weeks (I'll be traveling
> without a computer) but they will not be forgotten.

I did update the PEP. There are some questions about details; e.g. I
think the 'fs' argument should allow a mixture of Futures and
coroutines (the latter will be wrapped in Tasks) and the sets returned by
wait() should contain Futures and Tasks. You propose that
as_completed() returns an iterator whose items are coroutines; why not
Futures? (They're more versatile even if slightly slower than
coroutines.) I can sort of see the reasoning but want to tease out
whether you meant it that way. Also, we can't have __next__() raise
TimeoutError, since it never blocks; it will have to be the coroutine
(or Future) returned by __next__().
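
To make the timeout point concrete, here is a minimal sketch. It
assumes today's asyncio module (the stdlib package that PEP 3156
eventually became) rather than the tulip draft discussed in this
thread, so it uses async/await instead of 'yield from'. The work()
helper, the delays and the 0.2 second timeout are invented for the
example.

```python
import asyncio


async def work(delay, value):
    """Hypothetical stand-in for an asynchronous operation."""
    await asyncio.sleep(delay)
    return value


async def main():
    tasks = [asyncio.ensure_future(work(0.01, "fast")),
             asyncio.ensure_future(work(5, "slow"))]
    results = []
    timed_out = False
    # Advancing the iterator never blocks; each item is an awaitable.
    for next_done in asyncio.as_completed(tasks, timeout=0.2):
        try:
            # The timeout surfaces here, when the item is awaited.
            results.append(await next_done)
        except asyncio.TimeoutError:
            timed_out = True
            break
    # as_completed() does not cancel anything itself, so clean up.
    for t in tasks:
        t.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return results, timed_out


results, timed_out = asyncio.run(main())
print(results, timed_out)
```

The iterator hands back awaitables immediately, and the TimeoutError
is raised only when one of them is awaited after the deadline.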

> --Guido
>
>> The low level equivalent to par() would be:
>>
>>     incomplete = <tasks, futures or coroutines>
>>     complete, incomplete = yield from tulip.wait(incomplete)
>>     assert not incomplete # Without a timeout, everything should complete
>>     for f in complete:
>>         # Handle the completed operations
>>
>> Limiting the maximum execution time of any task to 10 seconds is
>> straightforward:
>>
>>     incomplete = <tasks, futures or coroutines>
>>     complete, incomplete = yield from tulip.wait(incomplete, timeout=10)
>>     for f in incomplete:
>>         f.cancel() # Took too long, kill it
>>     for f in complete:
>>         # Handle the completed operations
>>
>> The low level equivalent to the wait_one() example would become:
>>
>>     incomplete = <tasks, futures or coroutines>
>>     while incomplete:
>>         complete, incomplete = yield from tulip.wait(
>>             incomplete, return_when=FIRST_COMPLETED)
>>         for f in complete:
>>             # Handle the completed operations
>>
>> par() becomes easy to define as a coroutine:
>>
>>     @coroutine
>>     def par(fs):
>>         complete, incomplete = yield from tulip.wait(
>>             fs, return_when=FIRST_EXCEPTION)
>>         for f in incomplete:
>>             f.cancel() # Something must have failed, so cancel the rest
>>         # If something failed, calling f.result() will raise that exception
>>         return [f.result() for f in complete]
>>
>> Defining wait_one() is also straightforward (although it isn't clearly
>> superior to just using the underlying API directly):
>>
>>     @coroutine
>>     def wait_one(fs):
>>         complete, incomplete = yield from tulip.wait(
>>             fs, return_when=FIRST_COMPLETED)
>>         return complete.pop()
>>
>> The async equivalent to "as_completed" under this scheme is far more
>> interesting, as it would be an iterator that produces coroutines:
>>
>>     def as_completed(fs):
>>         complete = set()  # Bound here so the nonlocal declaration below is valid
>>         incomplete = fs
>>         while incomplete:
>>             # Phase 1 of the loop, we yield a coroutine that actually
>>             # starts operations running
>>             @coroutine
>>             def _wait_for_some():
>>                 nonlocal complete, incomplete
>>                 # Wait only on what is still pending, not on all of fs
>>                 complete, incomplete = yield from tulip.wait(
>>                     incomplete, return_when=FIRST_COMPLETED)
>>                 return complete.pop().result()
>>             yield _wait_for_some()
>>             # Phase 2 of the loop, we pass back the already complete operations
>>             while complete:
>>                 # Note this use case for @coroutine *forcing* objects
>>                 # to behave like a generator, as well as exploiting the
>>                 # ability to avoid trips around the event loop
>>                 @coroutine
>>                 def _next_result():
>>                     return complete.pop().result()
>>                 yield _next_result()
>>
>>     # This is almost as easy to use as the synchronous equivalent, the
>>     # only difference is the use of "yield from f" instead of the
>>     # synchronous "f.result()"
>>     for f in as_completed(fs):
>>         next = yield from f
>>
>> Cheers,
>> Nick.
>>
>> --
>> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
>
>
>
> --
> --Guido van Rossum (python.org/~guido)



-- 
--Guido van Rossum (python.org/~guido)