IdentityDB Memory Expansion Implementation Plan

For Hermes: Use the subagent-driven-development skill to execute this plan task-by-task. Enforce strict TDD for every production behavior.

Goal: Extend IdentityDB with explicit topic hierarchy, topic alias/canonicalization controls, and portable semantic fact search with embedding-backed similarity APIs.

Architecture: Keep the relational core portable across SQLite, PostgreSQL, MySQL, and MariaDB by introducing dedicated extension tables: topic_relations for abstract/concrete hierarchy, topic_aliases for canonical topic resolution, and fact_embeddings for semantic indexing. Expose high-level APIs from IdentityDB while preserving DB-agnostic behavior by doing semantic scoring in the application layer first.
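
Scoring in the application layer can be as small as a cosine-similarity function over embeddings loaded from fact_embeddings. The sketch below is illustrative only: cosineSimilarity, rankBySimilarity, and the StoredEmbedding and ScoredFact shapes are hypothetical names rather than the project's actual API, and it assumes embeddings have already been decoded into number arrays.

```typescript
// Hypothetical helpers; names and shapes are assumptions, not IdentityDB's API.
interface StoredEmbedding {
  factId: string;
  embedding: number[];
}

interface ScoredFact {
  factId: string;
  score: number;
}

// Cosine similarity in plain TypeScript, so no database vector extension is needed.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // A zero vector has no direction; treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Score every stored row, sort by descending score, and keep the top `limit`.
// Ties are broken by factId so test output stays deterministic.
function rankBySimilarity(
  query: readonly number[],
  rows: readonly StoredEmbedding[],
  limit: number,
): ScoredFact[] {
  return rows
    .map((row) => ({ factId: row.factId, score: cosineSimilarity(query, row.embedding) }))
    .sort((x, y) => y.score - x.score || (x.factId < y.factId ? -1 : x.factId > y.factId ? 1 : 0))
    .slice(0, limit);
}
```

A linear scan like this costs one pass over every stored embedding per query, which is the trade the plan accepts in exchange for identical behavior on all four databases.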

Tech Stack: TypeScript, Bun, Node.js, Kysely, better-sqlite3, pg, mysql2, Vitest, tsup.


Scope and interpretation

  • Topic hierarchy must be explicit rather than inferred only from shared facts.
  • Canonical topics must remain first-class records in topics; aliases should resolve into those topics without duplicating canonical rows.
  • Semantic search must stay provider-agnostic through a pluggable EmbeddingProvider interface.
  • The first semantic-search release should favor portability and deterministic testing over ANN/vector-extension optimization.
  • Ingestion should be able to detect likely duplicate facts by semantic similarity without forcing automatic merges.
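
To make the provider-agnostic and deterministic-testing goals concrete, here is one possible shape for the interface together with a test double. The plan names EmbeddingProvider but does not specify its members, so model, dimensions, and embed are assumptions, as are hashEmbed and HashingEmbeddingProvider. The token-hashing scheme is a stand-in for tests only and carries no real semantic meaning.

```typescript
// Assumed shape: the plan names EmbeddingProvider but does not define its members.
interface EmbeddingProvider {
  readonly model: string;
  readonly dimensions: number;
  embed(texts: string[]): Promise<number[][]>;
}

// Deterministic bag-of-words hashing: each token increments one bucket.
// Hypothetical test helper, not a real embedding model.
function hashEmbed(text: string, dimensions: number): number[] {
  const vec = new Array<number>(dimensions).fill(0);
  const tokens = text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  for (const token of tokens) {
    let h = 0;
    for (let i = 0; i < token.length; i++) {
      // Keep the running hash inside the unsigned 32-bit range.
      h = (h * 31 + token.charCodeAt(i)) >>> 0;
    }
    const idx = h % dimensions;
    vec[idx] = (vec[idx] ?? 0) + 1;
  }
  return vec;
}

// Test double that satisfies the assumed interface without network calls.
class HashingEmbeddingProvider implements EmbeddingProvider {
  readonly model = 'test-hashing-v1';
  readonly dimensions: number;

  constructor(dimensions = 16) {
    this.dimensions = dimensions;
  }

  async embed(texts: string[]): Promise<number[][]> {
    return texts.map((text) => hashEmbed(text, this.dimensions));
  }
}
```

Because the same input always produces the same vector, tests can assert exact rankings without mocking a remote service.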

Data model additions

topic_relations

  • parent_topic_id
  • child_topic_id
  • relation — initially parent_of
  • created_at
  • composite primary key on (parent_topic_id, child_topic_id, relation)

topic_aliases

  • id
  • topic_id
  • alias
  • normalized_alias
  • is_primary
  • created_at
  • updated_at
  • unique key on normalized_alias
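
The unique key on normalized_alias only helps if every alias is normalized the same way before it is written or looked up. A minimal sketch, assuming normalization means Unicode NFKC folding, trimming, whitespace collapsing, and lowercasing (the plan does not define the exact rules); normalizeAlias and AliasIndex are hypothetical names, and the in-memory map stands in for the table.

```typescript
// Assumed normalization rules; the plan does not specify them.
function normalizeAlias(alias: string): string {
  return alias.normalize('NFKC').trim().replace(/\s+/g, ' ').toLowerCase();
}

// In-memory stand-in for topic_aliases, keyed the way the unique index would be.
class AliasIndex {
  // normalized_alias -> topic_id
  private readonly byNormalized = new Map<string, string>();

  add(topicId: string, alias: string): void {
    const key = normalizeAlias(alias);
    const existing = this.byNormalized.get(key);
    // Mirror the unique key: one normalized alias may point at only one topic.
    if (existing !== undefined && existing !== topicId) {
      throw new Error(`alias "${alias}" already resolves to topic ${existing}`);
    }
    this.byNormalized.set(key, topicId);
  }

  resolve(name: string): string | undefined {
    return this.byNormalized.get(normalizeAlias(name));
  }
}
```

With this rule, registering "TS" for a topic lets a later lookup of "ts" or " Ts " reach the same canonical row without creating a second topic.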

fact_embeddings

  • fact_id
  • model
  • dimensions
  • embedding
  • content_hash
  • created_at
  • updated_at
  • composite primary key on (fact_id, model)
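
One way to read these columns: content_hash records which version of the fact text was embedded, so stale rows can be detected, and embedding is stored in a form every target database can hold. The sketch below assumes a SHA-256 hex digest and a JSON text encoding; both are illustrative choices that the plan does not confirm, and contentHash, encodeEmbedding, decodeEmbedding, and needsReindex are hypothetical names.

```typescript
import { createHash } from 'node:crypto';

// Assumed: content_hash is a SHA-256 hex digest of the fact statement.
function contentHash(statement: string): string {
  return createHash('sha256').update(statement, 'utf8').digest('hex');
}

// Assumed: embeddings are stored as JSON text, which all four databases accept.
function encodeEmbedding(vector: number[]): string {
  return JSON.stringify(vector);
}

// Validate shape on the way out so corrupt rows fail loudly.
function decodeEmbedding(raw: string, dimensions: number): number[] {
  const parsed: unknown = JSON.parse(raw);
  if (
    !Array.isArray(parsed) ||
    parsed.length !== dimensions ||
    !parsed.every((x) => typeof x === 'number')
  ) {
    throw new Error(`stored embedding is not a numeric array of length ${dimensions}`);
  }
  return parsed as number[];
}

// A fact needs (re)indexing when no row exists or its text changed since embedding.
function needsReindex(
  row: { content_hash: string } | undefined,
  statement: string,
): boolean {
  return row === undefined || row.content_hash !== contentHash(statement);
}
```

A binary encoding would be more compact, but JSON text keeps the first release free of per-database column-type differences.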

Public API additions

Topic hierarchy

await db.linkTopics({
  parentName: 'programming language',
  childName: 'TypeScript',
});

await db.getTopicChildren('programming language');
await db.getTopicParents('TypeScript');
await db.getTopicLineage('TypeScript');
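
The lineage lookup can be sketched as a breadth-first walk over parent_of edges. This is an in-memory illustration under stated assumptions: it treats lineage as all ancestors ordered nearest first, which the plan does not define, and the Edge type, lineage, and wouldCreateCycle are hypothetical names. The cycle check is an extra safeguard the plan does not mention.

```typescript
// Hypothetical in-memory view of topic_relations rows with relation = parent_of.
type Edge = { parent: string; child: string };

// Breadth-first walk upward from `start`; returns ancestors nearest first.
function lineage(edges: readonly Edge[], start: string): string[] {
  const parentsOf = new Map<string, string[]>();
  for (const edge of edges) {
    const list = parentsOf.get(edge.child) ?? [];
    list.push(edge.parent);
    parentsOf.set(edge.child, list);
  }

  const seen = new Set<string>([start]);
  const ancestors: string[] = [];
  let frontier: string[] = [start];
  while (frontier.length > 0) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const parent of parentsOf.get(node) ?? []) {
        // The seen set keeps the walk finite even if the data contains a cycle.
        if (seen.has(parent)) continue;
        seen.add(parent);
        ancestors.push(parent);
        next.push(parent);
      }
    }
    frontier = next;
  }
  return ancestors;
}

// Linking parent -> child closes a loop when child is already an ancestor of parent.
function wouldCreateCycle(edges: readonly Edge[], parent: string, child: string): boolean {
  return parent === child || lineage(edges, parent).includes(child);
}
```

Against a real database the same walk could be issued one level at a time with ordinary joins, which avoids relying on recursive query support that differs between engines.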

Topic aliases

await db.addTopicAlias('TypeScript', 'TS');
await db.resolveTopic('ts');
await db.getTopicAliases('TypeScript');

Semantic search

await db.indexFactEmbeddings({ provider });
await db.searchFacts({ query: 'When did I start using TS?', provider, limit: 5 });
await db.findSimilarFacts({ statement: 'I started using TypeScript in 2025.', provider, threshold: 0.9 });

Dedup-aware ingestion

await db.ingestStatement(statement, {
  extractor,
  dedup: {
    provider,
    threshold: 0.9,
  },
});
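
Detecting likely duplicates without forcing a merge amounts to returning hints alongside the ingestion result. A minimal sketch, assuming the threshold is inclusive and that hints are sorted best match first; EmbeddedFact, DuplicateHint, Similarity, and findDuplicateHints are hypothetical names, and the plan does not define the actual result shape. The similarity function is passed in so the helper does not depend on any particular metric.

```typescript
// Hypothetical shapes; the plan does not define the ingestion result type.
interface EmbeddedFact {
  factId: string;
  statement: string;
  embedding: number[];
}

interface DuplicateHint {
  factId: string;
  statement: string;
  score: number;
}

type Similarity = (a: number[], b: number[]) => number;

// Return existing facts whose similarity to the candidate meets the threshold.
// Nothing is merged or rejected here; the caller decides what to do with the hints.
function findDuplicateHints(
  candidate: number[],
  existing: readonly EmbeddedFact[],
  similarity: Similarity,
  threshold: number,
): DuplicateHint[] {
  return existing
    .map((fact) => ({
      factId: fact.factId,
      statement: fact.statement,
      score: similarity(candidate, fact.embedding),
    }))
    .filter((hint) => hint.score >= threshold) // inclusive threshold (assumption)
    .sort((x, y) => y.score - x.score);
}
```

An empty result means the statement can be stored as a new fact with no warning, while a non-empty result lets the caller surface the candidates for review.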

Execution plan

Task 1: Lock the extension schema and APIs with failing tests

Objective: Define tests for hierarchy, aliases, and semantic search before production code changes.

Files:

  • Modify: tests/migrations.test.ts
  • Modify: tests/identity-db.test.ts
  • Modify: tests/queries.test.ts
  • Create: tests/semantic-search.test.ts
  • Modify: src/types/api.ts
  • Modify: src/types/domain.ts
  • Modify: src/types/database.ts
  • Modify: src/core/schema.ts

Verification:

  • Run the focused test files and confirm each new test fails because the behavior is missing, not because of a setup or import error.

Task 2: Implement topic hierarchy storage and query APIs

Objective: Add topic_relations schema support plus parent/child/lineage APIs.

Files:

  • Modify: src/core/migrations.ts
  • Modify: src/core/identity-db.ts
  • Modify: src/core/utils.ts
  • Modify: src/queries/topics.ts
  • Modify: src/types/api.ts
  • Modify: src/types/domain.ts
  • Modify: src/types/database.ts

Verification:

  • Run hierarchy-focused tests until green.

Task 3: Implement canonical topic aliases

Objective: Add alias storage, alias-aware resolution, and canonical topic lookup semantics.

Files:

  • Modify: src/core/migrations.ts
  • Modify: src/core/identity-db.ts
  • Modify: src/queries/topics.ts
  • Modify: src/core/utils.ts
  • Modify: src/types/api.ts
  • Modify: src/types/domain.ts
  • Modify: src/types/database.ts

Verification:

  • Run alias-focused tests until green.

Task 4: Implement semantic fact search

Objective: Add EmbeddingProvider, embedding storage, search APIs, and similarity ranking.

Files:

  • Create: src/embeddings/provider.ts
  • Create: src/queries/embeddings.ts
  • Modify: src/core/migrations.ts
  • Modify: src/core/identity-db.ts
  • Modify: src/core/utils.ts
  • Modify: src/types/api.ts
  • Modify: src/types/domain.ts
  • Modify: src/types/database.ts
  • Modify: src/index.ts
  • Modify: tests/semantic-search.test.ts

Verification:

  • Run semantic-search tests until green.

Task 5: Add dedup-aware ingestion, docs, and full verification

Objective: Surface semantic dedup hints during ingestion, document the new APIs, and run the full suite.

Files:

  • Modify: src/ingestion/types.ts
  • Modify: src/core/identity-db.ts
  • Modify: README.md
  • Modify: src/index.ts

Verification:

  • Run bun run test && bun run check && bun run build
  • Confirm the docs reflect the new public surface.