Add MERGE ON CREATE SET / ON MATCH SET support (#1619) by gregfelice · Pull Request #2347 · apache/age
Implements the openCypher-standard ON CREATE SET and ON MATCH SET
clauses for the MERGE statement. This allows conditional property
updates depending on whether MERGE created a new path or matched
an existing one:
    MERGE (n:Person {name: 'Alice'})
    ON CREATE SET n.created = timestamp()
    ON MATCH SET n.updated = timestamp()

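
The create-versus-match branching in the query above can be modeled with a small, self-contained C sketch. This is only an illustration of the semantics, not code from the patch: ToyNode, ToyGraph, and toy_merge() are hypothetical names, and a caller-supplied number stands in for timestamp().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical toy store; these are not AGE's data structures. */
#define TOY_MAX_NODES 8

typedef struct {
    char name[32];
    long created;   /* 0 means "never set" */
    long updated;   /* 0 means "never set" */
} ToyNode;

typedef struct {
    ToyNode nodes[TOY_MAX_NODES];
    size_t  count;
} ToyGraph;

/*
 * MERGE a node by name. If a node with that name exists, only the
 * ON MATCH SET action runs; otherwise the node is created and only the
 * ON CREATE SET action runs. Returns NULL if the toy store is full.
 */
ToyNode *toy_merge(ToyGraph *g, const char *name, long now)
{
    for (size_t i = 0; i < g->count; i++) {
        if (strcmp(g->nodes[i].name, name) == 0) {
            g->nodes[i].updated = now;      /* ON MATCH SET n.updated = ... */
            return &g->nodes[i];
        }
    }

    if (g->count == TOY_MAX_NODES)
        return NULL;

    ToyNode *n = &g->nodes[g->count++];
    strncpy(n->name, name, sizeof n->name - 1);
    n->name[sizeof n->name - 1] = '\0';
    n->created = now;                       /* ON CREATE SET n.created = ... */
    n->updated = 0;
    return n;
}
```

Calling toy_merge() twice with the same name leaves created at its first value and sets updated only on the second call, which mirrors what a first and second run of the MERGE above would be expected to do.
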
Implementation spans parser, planner, and executor:
- Grammar: new merge_actions_opt/merge_actions/merge_action rules
  in cypher_gram.y, with ON keyword added to cypher_kwlist.h
- Nodes: on_match/on_create lists on cypher_merge, corresponding
  on_match_set_info/on_create_set_info on cypher_merge_information,
  and prop_expr on cypher_update_item (all serialized through
  copy/out/read funcs)
- Transform: cypher_clause.c transforms ON SET items and stores
  prop_expr for direct expression evaluation
- Executor: cypher_set.c extracts apply_update_list() from
  process_update_list(); cypher_merge.c calls it at all merge
  decision points (simple merge, terminal, non-terminal with
  eager buffering, and first-clause-with-followers paths)

Key design choice: prop_expr stores the Expr* directly in
cypher_update_item rather than using prop_position into the scan
tuple. The planner strips target list entries for SET expressions
that CustomScan doesn't need, making prop_position references
dangling. By storing the expression directly (only for MERGE ON
SET items), we evaluate it with ExecInitExpr/ExecEvalExpr
independent of the scan tuple layout.
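
The difference between the two lookup strategies can be sketched in plain C. This is a simplified model under stated assumptions, not the patch's code: ToyTuple, ToyUpdateItem, toy_eval_update_item(), and toy_const_expr() are hypothetical, a function pointer stands in for an Expr*, and a long stands in for a Datum.

```c
#include <stddef.h>

/* Hypothetical stand-in for a scan tuple: only the first nvalid slots exist. */
typedef struct {
    long   values[4];
    size_t nvalid;
} ToyTuple;

/* Hypothetical stand-in for an expression tree that can be evaluated directly. */
typedef long (*ToyExprFn)(const void *ctx);

typedef struct {
    int         prop_position;  /* 1-based slot in the scan tuple */
    ToyExprFn   prop_expr;      /* set only for MERGE ON ... SET items */
    const void *expr_ctx;
} ToyUpdateItem;

/* Trivial "expression": returns the long that ctx points to. */
long toy_const_expr(const void *ctx)
{
    return *(const long *) ctx;
}

/*
 * Compute the new property value for one update item.
 * Returns 1 and writes *out on success. Returns 0 when the item relies
 * on prop_position but that slot is not present in the scan tuple,
 * which models the dangling reference left after target list entries
 * are stripped.
 */
int toy_eval_update_item(const ToyUpdateItem *item, const ToyTuple *scan,
                         long *out)
{
    if (item->prop_expr != NULL) {
        /* Direct evaluation: independent of the tuple layout. */
        *out = item->prop_expr(item->expr_ctx);
        return 1;
    }

    if (item->prop_position < 1 ||
        (size_t) item->prop_position > scan->nvalid)
        return 0;

    *out = scan->values[item->prop_position - 1];
    return 1;
}
```

With a one-column scan tuple, an item that only carries prop_position = 2 cannot be evaluated, while the same item with prop_expr set still yields its value.
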
Includes regression tests covering: basic ON CREATE SET, basic
ON MATCH SET, combined ON CREATE + ON MATCH, multiple SET items,
expression evaluation, interaction with WITH clause, and edge
property updates.
All 31 regression tests pass.
…sor test

- Move ExecInitExpr for ON CREATE/MATCH SET items from per-row
  execution in apply_update_list() to plan initialization in
  begin_cypher_merge(). Follows the established pattern used by
  cypher_target_node (id_expr_state, prop_expr_state).
- Add prop_expr_state field to cypher_update_item with serialization
  support in outfuncs/readfuncs/copyfuncs.
- apply_update_list() uses the pre-initialized state when available
  and falls back to per-row init for plain SET callers.
- Fix misleading comment: "ON MATCH SET" → "ON CREATE SET" for the
  Case 1 first-run test.
- Add Case 1 second-run test that triggers ON MATCH SET with a
  predecessor clause (MATCH ... MERGE ... ON MATCH SET).

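
The initialize-once pattern with a per-row fallback can be sketched as follows. This is a toy model, not the executor code: SketchExpr, SketchExprState, SketchItem, SketchStats, sketch_begin(), and sketch_apply() are hypothetical names, and a counter is used only to make the number of initializations observable.

```c
#include <stddef.h>

/* Hypothetical stand-ins for Expr and ExprState. */
typedef struct { long constant; } SketchExpr;
typedef struct { const SketchExpr *expr; } SketchExprState;

typedef struct {
    const SketchExpr *prop_expr;    /* expression attached to the item */
    SketchExprState   state;        /* pre-initialized state, if any */
    int               state_ready;  /* nonzero once state is usable */
} SketchItem;

/* Counts how often expression state is initialized. */
typedef struct { int init_calls; } SketchStats;

static void sketch_init_state(SketchExprState *st, const SketchExpr *e,
                              SketchStats *stats)
{
    st->expr = e;
    stats->init_calls++;
}

/* Plan initialization: set up state once for every MERGE ON ... SET item. */
void sketch_begin(SketchItem *items, size_t n, SketchStats *stats)
{
    for (size_t i = 0; i < n; i++) {
        sketch_init_state(&items[i].state, items[i].prop_expr, stats);
        items[i].state_ready = 1;
    }
}

/* Per-row evaluation: reuse the prepared state, else initialize on the spot. */
long sketch_apply(SketchItem *item, SketchStats *stats)
{
    if (item->state_ready)
        return item->state.expr->constant;

    /* Fallback for callers that never ran sketch_begin(). */
    SketchExprState local;
    sketch_init_state(&local, item->prop_expr, stats);
    return local.expr->constant;
}
```

Applying a prepared item to three rows costs one initialization in total, while an unprepared item pays one initialization per row. That per-row cost is what moving the setup into plan initialization avoids.
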
1. Add ON to safe_keywords in cypher_gram.y so that property keys and
   labels named 'on' still work (e.g., n.on, MATCH (n:on)). All other
   keywords added as tokens are also in safe_keywords.
2. Add chained (non-terminal) MERGE regression tests exercising the
   eager-buffering code path with ON CREATE SET and ON MATCH SET.
   First run creates both nodes (ON CREATE SET fires), second run
   matches both (ON MATCH SET fires).

All regression tests pass (cypher_merge: ok).

…N keyword test

- Move ExecStoreVirtualTuple before apply_update_list unconditionally
  in Case 1 non-terminal and terminal MERGE paths, matching the
  pattern at Case 3 (line 994). Ensures tts_nvalid is set for
  downstream ExecProject even when ON CREATE SET is absent.
- Add resolve_merge_set_exprs() helper to deduplicate the prop_expr
  resolution loops for ON MATCH SET and ON CREATE SET. Includes
  ereport when target entry is missing (internal error, should never
  happen).
- Add regression test for ON keyword as label name, confirming
  backward compatibility via safe_keywords grammar path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
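
The idea of one shared resolution helper for both lists can be sketched in C. This is an assumption about the helper's shape, not its actual body: TleSketch, MergeSetItemSketch, and resolve_set_exprs_sketch() are hypothetical, a string stands in for an Expr*, and a -1 return stands in for the ereport call.

```c
#include <stddef.h>

/* Hypothetical stand-in for a target list entry. */
typedef struct {
    int         resno;  /* 1-based position in the target list */
    const char *expr;   /* stand-in for an Expr* */
} TleSketch;

/* Hypothetical stand-in for one ON CREATE SET / ON MATCH SET item. */
typedef struct {
    int         prop_position;  /* target entry holding the SET expression */
    const char *prop_expr;      /* filled in by the resolver */
} MergeSetItemSketch;

static const TleSketch *find_tle(const TleSketch *tlist, size_t ntargets,
                                 int resno)
{
    for (size_t i = 0; i < ntargets; i++)
        if (tlist[i].resno == resno)
            return &tlist[i];
    return NULL;
}

/*
 * Resolve prop_expr for every item in one list. The same function is
 * called for the ON CREATE SET list and the ON MATCH SET list, so the
 * loop exists only once. Returns 0 on success and -1 if any item's
 * target entry is missing (the "should never happen" case).
 */
int resolve_set_exprs_sketch(MergeSetItemSketch *items, size_t nitems,
                             const TleSketch *tlist, size_t ntargets)
{
    for (size_t i = 0; i < nitems; i++) {
        const TleSketch *tle = find_tle(tlist, ntargets,
                                        items[i].prop_position);
        if (tle == NULL)
            return -1;
        items[i].prop_expr = tle->expr;
    }
    return 0;
}
```

Calling the helper once per list keeps the missing-entry check in a single place, so both lists report the error the same way.
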