Add GROUP BY and ORDER BY optimization by jrgemignani · Pull Request #2287 · apache/age
NOTE: This PR was created with AI tools and a human.

Add GROUP BY and ORDER BY optimization for vertex/edge field access. Transform expressions like age_id(_agtype_build_vertex(id, label, props)) into graphid_to_agtype(id) in GROUP BY and ORDER BY clauses, avoiding unnecessary vertex/edge reconstruction when only the ID is needed.

Implementation:
- Add optimize_sortgroupby_vertex_access() in cypher_clause.c
- Walk target entries with non-zero ressortgroupref (GROUP BY/ORDER BY refs)
- Detect outer accessor functions: age_id, age_start_id, age_end_id, age_properties
- Match inner build functions: _agtype_build_vertex, _agtype_build_edge
- Extract the relevant field directly and wrap with graphid_to_agtype()
- Add resjunk target entries to subquery for direct field access

Supported patterns:
- GROUP BY id(v) -> Group Key: graphid_to_agtype(v.id)
- GROUP BY start_id(e) -> Group Key: graphid_to_agtype(e.start_id)
- GROUP BY end_id(e) -> Group Key: graphid_to_agtype(e.end_id)
- ORDER BY id(v) -> Sort Key: graphid_to_agtype(v.id)
- ORDER BY start_id(e) -> Sort Key: graphid_to_agtype(e.start_id)
- ORDER BY end_id(e) -> Sort Key: graphid_to_agtype(e.end_id)
- Combined ORDER BY + GROUP BY

This complements existing optimizations:
- cypher_expr.c: optimize_vertex_field_access() for direct FuncExpr patterns
- cypher_clause.c: optimize_qual_expr_mutator() for WHERE/join conditions

Existing regression tests were not affected. Added additional regression tests.

modified: regress/expected/unified_vertex_table.out
modified: regress/sql/unified_vertex_table.sql
modified: src/backend/parser/cypher_clause.c
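
For readers unfamiliar with the rewrite listed under Implementation, here is a minimal standalone sketch of the pattern match. It is not the code in cypher_clause.c and does not use PostgreSQL headers: the Expr and ToyTarget structs, field_index(), rewrite_id_access(), and optimize_sort_group_targets() are hypothetical stand-ins. The _agtype_build_edge argument order (id, start_id, end_id, label, props) is an assumption. The age_properties accessor and the resjunk subquery entries are left out of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy expression node; a stand-in for a parse-tree node, NOT PostgreSQL's FuncExpr. */
typedef struct Expr {
    const char *name;      /* function name, or column name for a leaf */
    struct Expr *args[5];  /* up to five arguments */
    int nargs;             /* 0 for a leaf (column reference) */
} Expr;

/* Toy target entry: only the two fields this sketch needs. */
typedef struct ToyTarget {
    Expr *expr;
    unsigned ressortgroupref;  /* non-zero when referenced by GROUP BY/ORDER BY */
} ToyTarget;

/* Return the builder argument index that holds the field the accessor reads,
 * or -1 when the accessor/builder pair is not one of the ID patterns. */
static int field_index(const char *accessor, const char *builder)
{
    if (strcmp(builder, "_agtype_build_vertex") == 0) {
        /* argument order from the PR text: (id, label, props) */
        if (strcmp(accessor, "age_id") == 0) return 0;
    } else if (strcmp(builder, "_agtype_build_edge") == 0) {
        /* ASSUMED argument order: (id, start_id, end_id, label, props) */
        if (strcmp(accessor, "age_id") == 0) return 0;
        if (strcmp(accessor, "age_start_id") == 0) return 1;
        if (strcmp(accessor, "age_end_id") == 0) return 2;
    }
    return -1;
}

/* Rewrite accessor(builder(...)) into graphid_to_agtype(field) in place.
 * Returns 1 if the node was rewritten, 0 if it was left untouched. */
static int rewrite_id_access(Expr *outer)
{
    Expr *inner;
    int idx;

    if (outer == NULL || outer->nargs != 1 || outer->args[0] == NULL)
        return 0;
    inner = outer->args[0];
    idx = field_index(outer->name, inner->name);
    if (idx < 0 || idx >= inner->nargs)
        return 0;

    outer->name = "graphid_to_agtype";  /* wrap the raw graphid */
    outer->args[0] = inner->args[idx];  /* skip the vertex/edge build call */
    return 1;
}

/* Visit only entries referenced by GROUP BY/ORDER BY; return the rewrite count. */
static int optimize_sort_group_targets(ToyTarget *tlist, int n)
{
    int i, rewritten = 0;

    assert(tlist != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (tlist[i].ressortgroupref != 0)
            rewritten += rewrite_id_access(tlist[i].expr);
    }
    return rewritten;
}
```

In this sketch, an entry whose ressortgroupref is zero is never touched, which mirrors the PR's restriction to GROUP BY and ORDER BY references. Any accessor/builder pair outside the ID patterns is left as-is.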