Define APIs and models for new episodic memory by edwinyyyu · Pull Request #1199 · MemMachine/MemMachine

Motivation

current DeclarativeMemory does not handle multimodal content
current DeclarativeMemory does not handle chunking
current DeclarativeMemory produces high fan out in number of Neo4j queries
current DeclarativeMemory operations are not fully atomic
VectorGraphStore is difficult to implement using other databases
a previously defined VectorStore interface is a step toward a solution
tangential: current DeclarativeMemory may face problems with top-k redundancy from vector search

Designed to scale better than the existing Neo4j implementation; a concrete implementation is needed to verify this.

Changes:

Approach to ingestion:

derivatives may or may not be deduplicated/consolidated
- vector search hits a representative, which has multiple segments linked to it
- some filters may be applied before, some filters may be applied after -- depending on whether the filter changes semantic meaning
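
As a rough illustration of a representative derivative linked to multiple segments, here is a minimal in-memory sketch. The class `DerivativeIndex`, its methods, and the consolidation rule (exact match after case and whitespace normalization) are all assumptions made for illustration; the PR does not specify how, or whether, derivatives are consolidated.

```python
from collections import defaultdict


class DerivativeIndex:
    """Hypothetical index mapping one representative derivative to many segments."""

    def __init__(self) -> None:
        # representative key -> IDs of the segments it was derived from
        self._segments_by_derivative: dict[str, list[str]] = defaultdict(list)

    def ingest(self, segment_id: str, derivative_text: str) -> str:
        # Assumed consolidation rule: derivatives that are equal after
        # lowercasing and whitespace normalization share one representative.
        key = " ".join(derivative_text.lower().split())
        segments = self._segments_by_derivative[key]
        if segment_id not in segments:
            segments.append(segment_id)
        return key

    def expand(self, representative: str) -> list[str]:
        # A vector search hit on a representative expands to all linked segments.
        return list(self._segments_by_derivative.get(representative, []))


index = DerivativeIndex()
rep = index.ingest("s1", "The cat sat.")
index.ingest("s2", "the  cat sat.")
print(index.expand(rep))
```

Only the representative would need a vector in the vector DB under this scheme, which is one way to reduce redundant top-k hits.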

Approach to search:

derivatives are no longer filterable
for low-cardinality filters, we can fetch all matching entries from the DB, then do a brute-force vector search over them
for high-cardinality filters, we can do a vector DB search, then post-filter the results
for no filters, we can do a plain vector DB search
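
The three search cases above can be sketched as a single dispatch function. This is an in-memory stand-in, not the PR's implementation: `Entry`, `search`, `brute_force_limit`, and `overfetch` are hypothetical names, and the sketch assumes "low cardinality" means the filter matches few entries. A real implementation would presumably use a count query or an estimate rather than materializing the matches just to pick a strategy.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Entry:
    id: str
    vector: list[float]
    properties: dict[str, str] = field(default_factory=dict)


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(entries: list[Entry], query: list[float], k: int) -> list[Entry]:
    # Stands in for both the vector DB search and the brute-force search.
    return sorted(entries, key=lambda e: cosine(e.vector, query), reverse=True)[:k]


def search(
    entries: list[Entry],
    query: list[float],
    k: int,
    predicate: Optional[Callable[[Entry], bool]] = None,
    brute_force_limit: int = 1000,  # assumed threshold, not from the PR
    overfetch: int = 4,             # assumed over-fetch factor, not from the PR
) -> list[Entry]:
    if predicate is None:
        # No filters: plain vector search.
        return top_k(entries, query, k)
    matching = [e for e in entries if predicate(e)]
    if len(matching) <= brute_force_limit:
        # Few matches: fetch them all, then brute-force the vector search.
        return top_k(matching, query, k)
    # Many matches: vector search first, then post-filter the candidates.
    candidates = top_k(entries, query, k * overfetch)
    return [e for e in candidates if predicate(e)][:k]
```

Post-filtering can return fewer than `k` results, which is why the sketch over-fetches candidates before filtering.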

Approach to deletion:

API designed with SQL in mind
use reference counting, active and purging states
- purging state necessary as a lock to allow deleting from external DB (vector DB)

Alternatives considered:

return UUIDs of vectors to delete when vectors become orphaned: rejected because synchronous deletion does not scale well. We will instead implement a garbage collection job; deletes will be soft until the garbage collection job runs.

Decisions to make:

TODO:

implement new memory (essentially the same logic as DeclarativeMemory, but should be better)
change content to discriminated union or similar
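
One possible shape for the discriminated-union TODO, sketched with standard-library dataclasses. `TextContent`, `ImageContent`, their fields, and `parse_content` are hypothetical names chosen for illustration; the PR does not define the content variants or how they would be serialized.

```python
from dataclasses import dataclass
from typing import Literal, Union


@dataclass(frozen=True)
class TextContent:
    text: str
    type: Literal["text"] = "text"  # discriminator tag


@dataclass(frozen=True)
class ImageContent:
    uri: str
    mime_type: str
    type: Literal["image"] = "image"  # discriminator tag


# Hypothetical union of content variants; more can be added as needed.
Content = Union[TextContent, ImageContent]


def parse_content(raw: dict) -> Content:
    """Dispatch on the 'type' tag to build the matching variant."""
    kind = raw.get("type")
    if kind == "text":
        return TextContent(text=raw["text"])
    if kind == "image":
        return ImageContent(uri=raw["uri"], mime_type=raw["mime_type"])
    raise ValueError(f"unknown content type: {kind!r}")


print(parse_content({"type": "text", "text": "hello"}))
```

Tagging each variant explicitly lets stored content be decoded without guessing from which fields are present, which matters once multimodal content is supported.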

Will eventually be breaking or require new API.

APIs and data models not tested.