[BEAM-14166] Performance improvements for RowWithGetter by mosche · Pull Request #17172 · apache/beam
Conversation
Push field type based logic in RowWithGetters down into FieldValueGetters to remove any branching and allow for better inlining and switch to TreeMap for caching to minimize memory footprint. See benchmark results below.
To not modify the behavior of getValues(), related calls are delegated to the original getter as is.
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
- Choose reviewer(s) and mention them in a comment (
R: @username). - Format the pull request title like
[BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replaceBEAM-XXXwith the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - Update
CHANGES.mdwith noteworthy changes. - If this contribution is large, please file an Apache Individual Contributor License Agreement.
See the Contributor Guide for more tips on how to make review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI.
By default, getValue results in generated bytecode to fetch the value. Is this cache really more efficient than the generated bytecode?
By default, getValue results in generated bytecode to fetch the value. Is this cache really more efficient than the generated bytecode?
Maybe some microbenchmarks could help here
@reuvenlax certainly a valid question and I'm happy to discuss what's worth caching.
Though, the key point here is, that there's already a cache in place for array, iterable and map types. Processing for these is currently not ideal. For instance, looking at an array field the following steps happen on getValue:
- Invoke getter, the generated byte code wraps the array in a list
- Then lazily transform the list elements (
Lists.transform) and cache the result
The problem is really on the 2nd access. Step 1) is repeated, but step 2) uses the cache discarding all the work of step 1).
This PR mostly addresses some of the inconsistencies using the existing cache and reduces the overhead of the cache by choosing better fitted data structures to store cached values.
Considering that elements of ARRAY types and ITERABLE types are lazily transformed, the value of caching is rather small (questionable?) as it happens on demand.
For MAP types it's a different story as the entire map gets rewritten on transform.
And if a cache is used, then any composite ROW type should be cached. Otherwise, if re-created every time, the cache is obviously pointless.
Benchmarks of this code are tricky to do, since the cost of codegen tends to dominate microbenchmarks.
I've pushed my benchmark code for reference, let me know if you have any suggestions. To prevent the issue you've mentioned @reuvenlax, I'm setting up a relatively large number of rows as JMH state before the actual benchmark invocation.
To establish a baseline, I'm looking at master first. Here are some initial results with some minimal changes to RowWithGetters to make these benchmarks meaningful:
- Disable caching for the first run.
- Change initialisation of the cache data structure to lazy init so associated costs are considered in the benchmark.
These numbers are certainly not very much in favour of the status quo.
I'm still iterating on improvements, but from what I've seen so far far there's lots that can be done.
…reeMap to reduce memory footprint of field value cache.
@TheNeuralBit @reuvenlax Please let me know how / if you wanna proceed here, so I can wrap up my initial PR #16947 accordingly. In any case, it was a very interesting dive into Beam Schemas. Thanks for all the pointers.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mosche for the extensive benchmarking, and apologies for the atrocious review latency!
I'm ok with this if it passes current tests, given the extensive benchmarking. A couple of things:
- I'd still like to get some final feedback on the approach from @reuvenlax, but I acn go ahead and merge next Monday if that doesn't happen this week.
- It could be interesting to run your benchmarks continuously to 1) monitor for regressions, and 2) give us a convenient way to test other potential improvements. Maybe file a jira for that if you don't have time now?
| @Nullable | ||
| ValueT get(ObjectT object); | ||
|
|
||
| default @Nullable Object getRaw(ObjectT object) { |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe getUntyped is a more appropriate name here?
Also could you clarify why we need this?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for having a look @TheNeuralBit 🙏
getRaw() was based on a conversation with @reuvenlax.
getValues() is maybe poorly named - might be better called getRawValues. What you're looking for is probably the getBaseValues() method.
getValues is mostly used in code that knows exactly what it's doing for optimization purposes. It goes along with the attachValues method, which is similarly tricky to use. It's there to enable 0-copy code, but not necessarily intended for general consumption.
RowWithGetters.getValues() returns the "raw" unmodified result of the getters:
public List<Object> getValues() { return getters.stream().map(g -> g.get(getterTarget)).collect(Collectors.toList()); }
As I am pushing down the transformation of the getter result into the getter itself, I needed a way to bypass that in order to maintain the current semantics of getValues(). Let me know if the name makes sense given that context.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that makes sense, thanks. Could you add some of this context in a comment there?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@TheNeuralBit I've opened a new PR to add the missing comment, sorry for the delay.
#21982
Let me take another look. IIRC I had some concerns, but it's possible that the extensive benchmarking addressed those.
…On Wed, May 11, 2022 at 2:58 PM Brian Hulette ***@***.***> wrote: ***@***.**** approved this pull request. Thanks @mosche <https://github.com/mosche> for the extensive benchmarking, and apologies for the atrocious review latency! I'm ok with this if it passes current tests, given the extensive benchmarking. A couple of things: - I'd still like to get some final feedback on the approach from @reuvenlax <https://github.com/reuvenlax>, but I acn go ahead and merge next Monday if that doesn't happen this week. - It could be interesting to run your benchmarks continuously to 1) monitor for regressions, and 2) give us a convenient way to test other potential improvements. Maybe file a jira for that if you don't have time now? ------------------------------ In sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/FieldValueGetter.java <#17172 (comment)>: > @@ -33,5 +33,9 @@ @nullable ValueT get(ObjectT object); + default @nullable Object getRaw(ObjectT object) { nit: maybe getUntyped is a more appropriate name here? Also could you clarify why we need this? — Reply to this email directly, view it on GitHub <#17172 (review)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AFAYJVOBUJMEVNFJNON7Y7DVJQUOVANCNFSM5RRHFBKQ> . You are receiving this because you were mentioned.Message ID: ***@***.***>
@reuvenlax Kind ping, did you already get the chance to have another look at this? Just to make it clear, this is not about caching ... that was just a minor almost irrelevant part of this
mosche
changed the title
[BEAM-14166] Performance / Cache improvements to RowWithGetter
[BEAM-14166] Performance improvements for RowWithGetter
@reuvenlax - Would you be able to take another look?
I think this is fine to go ahead and merge. From what I can tell Reuven's original concern was that this would start caching more types than were already being cached in RowWithGetters, but this is just moving around the caching logic.
@TheNeuralBit Thanks a lot 🙏 I still owe you the comment on FieldValueGetter.getRaw, i'll follow up shortly!
mosche
deleted the
BEAM-14166-RowWithGetter
branch
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters