Support sized pages in Page Streaming
Early feedback suggests that, in addition to simple iteration, page-streamed calls should also support a batch mode in which the user iterates over batches of results.
E.g., currently the following works for page streaming:
>>> from google.pubsub.v1.publisher_api import PublisherApi
>>> api = PublisherApi()
>>> topic = api.topic_path('google.com:pubsub-demo', 'my-topic')
>>> all_subs = list(api.list_topic_subscriptions(topic))
>>> all_subs
['sub_1', 'sub_2', 'sub_3', 'sub_2']
There should be a way to support batching:
>>> from google.pubsub.v1.publisher_api import PublisherApi
>>> api = PublisherApi()
>>> topic = api.topic_path('google.com:pubsub-demo', 'my-topic')
>>> all_subs = list(??? with batchsize=2)
>>> all_subs
[('sub_1', 'sub_2'), ('sub_3', 'sub_2')]
Proposals
- Add a helper to google.gax to support batch iteration. Users will import it and use it when they need it.
- Add a helper to google.gax as in the first proposal. Also add a batch_size field to CallOptions, which users set to receive results in batches.
Add a helper to google.gax to support batch iteration
>>> from google.pubsub.v1.publisher_api import PublisherApi
>>> from google.gax import batch_iter
>>> api = PublisherApi()
>>> topic, batch_size = api.topic_path('google.com:pubsub-demo', 'my-topic'), 2
>>> all_subs = list(batch_iter(api.list_topic_subscriptions(topic), batch_size))
>>> all_subs
[('sub_1', 'sub_2'), ('sub_3', 'sub_2')]
Pros
- the generated code stays simple; no changes to the generator are required
- batch_iter is simple and resembles the helper functions in the itertools library
Cons
- users have more of the gax-python surface to learn; batch_iter is a new function they need to know about.
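For reference, a minimal sketch of what such a helper could look like. This is not existing gax-python code; the name batch_iter and its signature are taken from the example above, and the choice to yield a shorter final tuple when the results do not divide evenly is an assumption.

```python
import itertools


def batch_iter(iterable, batch_size):
    """Yield tuples of up to batch_size items from iterable.

    Hypothetical helper: the last tuple may be shorter than batch_size.
    """
    if batch_size < 1:
        raise ValueError('batch_size must be at least 1')
    iterator = iter(iterable)
    while True:
        # islice pulls at most batch_size items without reading ahead.
        batch = tuple(itertools.islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

Because it only consumes batch_size items at a time, the helper stays lazy and would not force every page to be fetched up front.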
Add a helper to google.gax, also add batch_size to CallOptions
>>> from google.pubsub.v1.publisher_api import PublisherApi
>>> from google.gax import CallOptions
>>> api = PublisherApi()
>>> topic, batch_size = api.topic_path('google.com:pubsub-demo', 'my-topic'), 2
>>> all_subs = list(api.list_topic_subscriptions(topic,
...     options=CallOptions(batch_size=batch_size)))
>>> all_subs
[('sub_1', 'sub_2'), ('sub_3', 'sub_2')]
Pros
- keeps the additions to the surface that the user needs to know about to a minimum; i.e., to use this feature, the user only needs to know about one additional field, which we will document in CallOptions
Cons
- the generated code for methods that support page-streaming is slightly more complex
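To give a feel for the extra complexity, here is a rough sketch of how a generated method might branch on the option. Everything in it is a stand-in for illustration: the CallOptions namedtuple models only the proposed batch_size field, and _stream_subscriptions fakes the existing page-streamed call.

```python
import collections
import itertools

# Hypothetical stand-in for CallOptions; only the proposed field is modeled.
CallOptions = collections.namedtuple('CallOptions', ['batch_size'])


def _stream_subscriptions(topic):
    """Stand-in for the existing page-streamed call (one resource at a time)."""
    for name in ('sub_1', 'sub_2', 'sub_3', 'sub_2'):
        yield name


def _in_batches(resources, batch_size):
    """Regroup a resource iterator into tuples of up to batch_size items."""
    while True:
        batch = tuple(itertools.islice(resources, batch_size))
        if not batch:
            return
        yield batch


def list_topic_subscriptions(topic, options=None):
    """Hypothetical generated method that honors options.batch_size."""
    resources = _stream_subscriptions(topic)
    if options is None or options.batch_size is None:
        return resources  # unchanged behavior: plain iteration
    return _in_batches(resources, options.batch_size)
```

The branch on options is the part that would have to be emitted for every page-streaming method.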
@jmuk, @geigerj, @bjwatson, @anthmgoogle PTAL and discuss
Decision
- the surface will look like this
>>> from google.pubsub.v1.publisher_api import PublisherApi
>>> from google.gax import CallOptions
>>> api = PublisherApi()
>>> topic = api.topic_path('google.com:pubsub-demo', 'my-topic')
>>> all_subs = list(api.list_topic_subscriptions(topic,
...     options=CallOptions(is_page_streaming=False)))
>>> all_subs  # whatever the server page size is
[('sub_1', 'sub_2'), ('sub_3',), ('sub_4',)]
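The difference between the two modes can be sketched with a fake paged backend. This is an illustration only, not the gax-python implementation: _FAKE_PAGES, _fetch_page and list_subscriptions are hypothetical names, and the page contents and tokens are made up to match the example output above.

```python
# Made-up server responses keyed by page token; '' means first page / no more pages.
_FAKE_PAGES = {
    '': (['sub_1', 'sub_2'], 'token_a'),
    'token_a': (['sub_3'], 'token_b'),
    'token_b': (['sub_4'], ''),
}


def _fetch_page(page_token):
    """Hypothetical single RPC returning (resources, next_page_token)."""
    return _FAKE_PAGES[page_token]


def list_subscriptions(is_page_streaming=True):
    """Yield resources one by one, or whole pages when streaming is off."""
    page_token = ''
    while True:
        resources, page_token = _fetch_page(page_token)
        if is_page_streaming:
            # Flatten: the caller never sees page boundaries.
            for resource in resources:
                yield resource
        else:
            # Expose each server page as-is; sizes are up to the server.
            yield tuple(resources)
        if not page_token:
            return
```

With is_page_streaming=False the batch sizes follow the server's pages, so the caller cannot rely on a fixed batch size.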
There's an open question that is OK to resolve after implementation begins