Implement multiple label by rafsun42 · Pull Request #2082 · apache/age

added 11 commits

August 28, 2024 10:53
It represents label expressions of different types: empty, single, or multiple.
Previously, the label field was of type char*.

The change affects the types cypher_node, cypher_relationship,
and cypher_target_node, as well as any places where these
types are used.
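As an illustration of the three kinds of label expression described above, here is a minimal Python sketch. It is a model only, not the C node added by this PR: the names `LabelExprKind`, `LabelExpr`, and `from_names` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class LabelExprKind(Enum):
    """The three kinds named in the commit message (names are hypothetical)."""
    EMPTY = "empty"        # no label given, e.g. ()
    SINGLE = "single"      # exactly one label, e.g. (:A)
    MULTIPLE = "multiple"  # more than one label, e.g. (:A:B) or (:A|B)


@dataclass(frozen=True)
class LabelExpr:
    """Replaces a bare string label with a kind plus a tuple of names."""
    kind: LabelExprKind
    names: Tuple[str, ...] = ()


def from_names(*names: str) -> LabelExpr:
    """Classify a sequence of label names into one of the three kinds."""
    if not names:
        return LabelExpr(LabelExprKind.EMPTY)
    if len(names) == 1:
        return LabelExpr(LabelExprKind.SINGLE, names)
    return LabelExpr(LabelExprKind.MULTIPLE, names)
```

A single string cannot distinguish "no label" from "several labels" without extra conventions, which is presumably why a dedicated type is used instead.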
Supports queries like:
    MATCH (v:A|B|C) RETURN v
    MATCH ()-[e:A|B|C]->() RETURN e
Some examples of supported multiple label queries:
	CREATE (:a:b)
	MERGE  (:a:b)
	MATCH  (:a:b)
	MATCH  (:a|b)

See regress/sql/multiple_label.sql for more details on what kind
of queries are supported.
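The two forms in the examples above differ in meaning. Assuming the usual Cypher reading, `:a:b` requires both labels and `:a|b` accepts either. The sketch below models that with plain sets; the function names are hypothetical and it says nothing about how AGE evaluates the match internally.

```python
def matches_all(vertex_labels, required):
    """`:a:b` style: the vertex must carry every listed label."""
    return set(required) <= set(vertex_labels)


def matches_any(vertex_labels, alternatives):
    """`:a|b` style: the vertex must carry at least one listed label."""
    return bool(set(alternatives) & set(vertex_labels))


# A vertex created with CREATE (:a:b) carries both labels.
both = ["a", "b"]
# A vertex created with CREATE (:a) carries only one.
only_a = ["a"]
```

Under this reading, `MATCH (:a:b)` would return only the first vertex, while `MATCH (:a|b)` would return both.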

Change summary:
---------------
* A new column `allrelations` is added to ag_label catalog
* Change in creating AGE relations logic
* Change in MATCH's transformation logic (related to building parse
  namespace item)
The logic for building vertex objects is updated. Agtype vertex objects can be
built from either a single label (as a cstring) or multiple labels (as an
agtype array). The following functions are updated to reflect this:
agtype_typecast_vertex, agtype_in, and _agtype_build_vertex. If
_agtype_build_vertex is called from SQL, its label argument must be explicitly
cast to avoid ambiguity between the function overloads.

The `_label_names` function is added to extract label names from a vertex ID
as a list of strings. It is used as a helper function to build vertex objects.
A new cache called `allrelations` is also added. This is used by _label_names
to search for all labels that are related to a given relation.
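The lookup described above can be sketched in Python as follows. The sketch assumes AGE's usual graphid layout (a 16-bit label ID stored above a 48-bit entry ID). The `ALLRELATIONS` dictionary is a hypothetical stand-in for the new cache, and its contents are made up for illustration; the real cache structure is not described here.

```python
ENTRY_ID_BITS = 48                      # assumed: low 48 bits hold the entry ID
ENTRY_ID_MASK = (1 << ENTRY_ID_BITS) - 1


def make_graphid(label_id, entry_id):
    """Pack a label ID and an entry ID into one 64-bit graphid."""
    return (label_id << ENTRY_ID_BITS) | (entry_id & ENTRY_ID_MASK)


def label_id_of(graphid):
    """Recover the ID of the relation that stores the entity."""
    return graphid >> ENTRY_ID_BITS


# Hypothetical cache: relation (label) ID -> all label names related to it.
ALLRELATIONS = {
    3: ["a"],        # a relation holding vertices labeled only `a`
    5: ["a", "b"],   # a relation holding vertices labeled `a` and `b`
}


def label_names(graphid):
    """Model of _label_names: vertex ID in, list of label names out."""
    return list(ALLRELATIONS.get(label_id_of(graphid), []))
```

The point of the sketch is that the vertex ID alone identifies the storing relation, so a relation-to-labels mapping is enough to rebuild the label list of a vertex object.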

Multiple helper functions are added to extract label information from an entity
ID. For example, entity's relation ID, relation name, label names. These are
used by CREATE, DELETE, MERGE, VLE and SET executors for building a vertex's
object or updating its relation.

All test files are updated to show the label field as an array in the output.
In all test SQLs, _agtype_build_vertex's label argument is explicitly cast.
It updates the function filter_vertices_on_label_id().

Additional changes:
-------------------
 - Add internal function _label_ids
Cache issues fixed:
-------------------
  - Use of wrong data type for cache entry in label relation cache (pre-existing)
  - Use of wrong update function for catalog table (related to multiple label)

Other changes:
--------------
  - The function _label_name() is unsupported for vertices
Changes:
--------
 - Update create_label_expr_relations() to return RangeVar. It removes
   redundant call to label_expr_relname() in the code that also calls
   this function.

 - Use deconstruct_array() to convert ArrayType* to List*

 - Update test files after rebase
This fixes some compile-time errors that occur if
PostgreSQL is configured with the --with-llvm option.
Changes:
--------
  * Include missing header files
  * Update newly added tests
  * Other minor changes
The following PRs are reapplied: 1465, 1509, 1514, and 1518.

@rafsun42 rafsun42 changed the base branch from multiple_label_16 to master

August 29, 2024 19:52

@rafsun42 rafsun42 changed the base branch from master to multiple_label_16

August 29, 2024 19:53

@rafsun42 rafsun42 changed the base branch from multiple_label_16 to master

August 29, 2024 19:56

@rafsun42

* Regression tests
* Comments
* Minor oversight during rebasing

@rafsun42 @MuhammadTahaNaveed

This distinguishes the intersection relation names of the following
combinations: (MN:P) and (M:N:P). Credit for finding this edge case goes
to Muhammad Taha Naveed.

Additionally, a parameter is added to the create_label function to skip label
name validation. This is useful when creating intersection relations (the name
can be invalid due to containing the separator symbol).
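The edge case can be reproduced with a small Python sketch. The separator character, the validity rule, and all function names below are assumptions made for illustration; the actual separator symbol and naming scheme are not stated in this message.

```python
import re

SEPARATOR = "+"  # hypothetical separator, chosen only for this sketch
# Assumed rule for ordinary label names: identifier-like strings only.
VALID_LABEL = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def naive_name(labels):
    """Concatenate label names directly: different label sets can collide."""
    return "".join(labels)


def separated_name(labels):
    """Join label names with a separator so the boundaries stay visible."""
    return SEPARATOR.join(labels)


def is_valid_label_name(name):
    """Check a name against the assumed rule for user-facing labels."""
    return bool(VALID_LABEL.match(name))


two_labels = ["MN", "P"]         # the (MN:P) combination
three_labels = ["M", "N", "P"]   # the (M:N:P) combination
```

With direct concatenation both combinations yield the same name, whereas the separated form keeps them apart. The separated name then fails the ordinary validity check, which is why a way to skip that check is needed when creating intersection relations.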

Co-authored-by: Muhammad Taha Naveed <mtaha@apache.org>

MuhammadTahaNaveed