# IdentityDB Memory Expansion Implementation Plan

> **For Hermes:** Use the `subagent-driven-development` skill to execute this plan task-by-task. Enforce strict TDD for every production behavior.

**Goal:** Extend IdentityDB with explicit topic hierarchy, topic alias/canonicalization controls, and portable semantic fact search with embedding-backed similarity APIs.

**Architecture:** Keep the relational core portable across SQLite, PostgreSQL, MySQL, and MariaDB by introducing dedicated extension tables: `topic_relations` for abstract/concrete hierarchy, `topic_aliases` for canonical topic resolution, and `fact_embeddings` for semantic indexing. Expose high-level APIs from `IdentityDB` while preserving DB-agnostic behavior by doing semantic scoring in the application layer first.

**Tech Stack:** TypeScript, Bun, Node.js, Kysely, better-sqlite3, pg, mysql2, Vitest, tsup.

---

## Scope and interpretation

- Topic hierarchy must be explicit rather than inferred only from shared facts.
- Canonical topics must remain first-class records in `topics`; aliases should resolve into those topics without duplicating canonical rows.
- Semantic search must stay provider-agnostic through a pluggable `EmbeddingProvider` interface.
- The first semantic-search release should favor portability and deterministic testing over ANN/vector-extension optimization.
- Ingestion should be able to detect likely duplicate facts by semantic similarity without forcing automatic merges.

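
The scope items above name a pluggable `EmbeddingProvider` and ask for deterministic testing, but the plan does not define the interface. The following is a minimal sketch of one possible shape, not the project's actual contract: the `model`, `dimensions`, and `embed` members, the `hashEmbed` helper, and the `HashingEmbeddingProvider` class are all hypothetical names introduced for illustration. The hashing provider is a bag-of-words stand-in that makes tests repeatable without a network call; it does not capture real semantic similarity.

```typescript
// Hypothetical interface shape; the plan names `EmbeddingProvider` but does not define it.
interface EmbeddingProvider {
  /** Identifier that could be stored in `fact_embeddings.model`. */
  readonly model: string;
  /** Length of every vector this provider returns. */
  readonly dimensions: number;
  /** Embeds a batch of texts, returning one vector per input, in order. */
  embed(texts: string[]): Promise<number[][]>;
}

// Deterministic bag-of-words hashing: each lowercase token increments one bucket.
// Assumption: good enough for tests, not a substitute for a real embedding model.
function hashEmbed(text: string, dimensions: number): number[] {
  const vector = new Array<number>(dimensions).fill(0);
  const tokens = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  for (const token of tokens) {
    let hash = 0;
    for (let i = 0; i < token.length; i++) {
      // Keep the running hash inside the unsigned 32-bit range.
      hash = (hash * 31 + token.charCodeAt(i)) >>> 0;
    }
    const bucket = hash % dimensions;
    vector[bucket] = (vector[bucket] ?? 0) + 1;
  }
  return vector;
}

// Test double that satisfies the sketched interface without any external service.
class HashingEmbeddingProvider implements EmbeddingProvider {
  readonly model = 'test-hashing-v1';
  readonly dimensions: number;

  constructor(dimensions = 64) {
    this.dimensions = dimensions;
  }

  async embed(texts: string[]): Promise<number[][]> {
    return texts.map((text) => hashEmbed(text, this.dimensions));
  }
}
```
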

---

## Data model additions

### `topic_relations`

- `parent_topic_id`
- `child_topic_id`
- `relation` (initially `parent_of`)
- `created_at`
- composite primary key on (`parent_topic_id`, `child_topic_id`, `relation`)

### `topic_aliases`

- `id`
- `topic_id`
- `alias`
- `normalized_alias`
- `is_primary`
- `created_at`
- `updated_at`
- unique key on `normalized_alias`

### `fact_embeddings`

- `fact_id`
- `model`
- `dimensions`
- `embedding`
- `content_hash`
- `created_at`
- `updated_at`
- composite primary key on (`fact_id`, `model`)

---

## Public API additions

### Topic hierarchy

```ts
await db.linkTopics({
  parentName: 'programming language',
  childName: 'TypeScript',
});

await db.getTopicChildren('programming language');
await db.getTopicParents('TypeScript');
await db.getTopicLineage('TypeScript');
```

### Topic aliases

```ts
await db.addTopicAlias('TypeScript', 'TS');
await db.resolveTopic('ts');
await db.getTopicAliases('TypeScript');
```

### Semantic indexing and search

```ts
await db.indexFactEmbeddings({ provider });
await db.searchFacts({ query: 'When did I start using TS?', provider, limit: 5 });
await db.findSimilarFacts({
  statement: 'I started using TypeScript in 2025.',
  provider,
  threshold: 0.9,
});
```

### Dedup-aware ingestion

```ts
await db.ingestStatement(statement, {
  extractor,
  dedup: {
    provider,
    threshold: 0.9,
  },
});
```

---

## Execution plan

### Task 1: Lock the extension schema and APIs with failing tests

**Objective:** Define tests for hierarchy, aliases, and semantic search before production code changes.

**Files:**

- Modify: `tests/migrations.test.ts`
- Modify: `tests/identity-db.test.ts`
- Modify: `tests/queries.test.ts`
- Create: `tests/semantic-search.test.ts`
- Modify: `src/types/api.ts`
- Modify: `src/types/domain.ts`
- Modify: `src/types/database.ts`
- Modify: `src/core/schema.ts`

**Verification:**

- Run focused test commands and confirm they fail for missing behavior.

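
As a rough illustration of what Task 1 locks in, the three extension tables could be typed along these lines. The column names come from the data model section; everything else is an assumption made for this sketch: the column types (string IDs, ISO timestamp strings, a JSON-encoded `embedding`), and the `normalizeAlias`, `encodeEmbedding`, and `decodeEmbedding` helpers, which are hypothetical and not part of the plan.

```typescript
// Row shapes for the extension tables. Column names follow the plan;
// the TypeScript types are assumptions chosen for cross-database portability.
interface TopicRelationRow {
  parent_topic_id: string;
  child_topic_id: string;
  relation: 'parent_of'; // the plan starts with this single relation kind
  created_at: string;
}

interface TopicAliasRow {
  id: string;
  topic_id: string;
  alias: string;
  normalized_alias: string; // unique key, see normalizeAlias below
  is_primary: boolean; // assumption: some drivers return 0/1 and would need mapping
  created_at: string;
  updated_at: string;
}

interface FactEmbeddingRow {
  fact_id: string;
  model: string;
  dimensions: number;
  embedding: string; // assumption: JSON text, so no vector extension is required
  content_hash: string;
  created_at: string;
  updated_at: string;
}

// Hypothetical normalization so that 'TS', ' ts ', and 'Ts' share one unique key.
function normalizeAlias(alias: string): string {
  return alias.normalize('NFKC').trim().toLowerCase().replace(/\s+/g, ' ');
}

// Hypothetical codec for the JSON-text `embedding` column.
function encodeEmbedding(vector: number[]): string {
  return JSON.stringify(vector);
}

function decodeEmbedding(encoded: string, dimensions: number): number[] {
  const parsed: unknown = JSON.parse(encoded);
  if (
    !Array.isArray(parsed) ||
    parsed.length !== dimensions ||
    !parsed.every((value) => typeof value === 'number')
  ) {
    throw new Error('Stored embedding does not match the expected dimensions');
  }
  return parsed as number[];
}
```

Storing the vector as JSON text is one way to keep the first release independent of database-specific vector types; a binary encoding would be more compact if storage size becomes a concern.
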

### Task 2: Implement topic hierarchy storage and query APIs

**Objective:** Add `topic_relations` schema support plus parent/child/lineage APIs.

**Files:**

- Modify: `src/core/migrations.ts`
- Modify: `src/core/identity-db.ts`
- Modify: `src/core/utils.ts`
- Modify: `src/queries/topics.ts`
- Modify: `src/types/api.ts`
- Modify: `src/types/domain.ts`
- Modify: `src/types/database.ts`

**Verification:**

- Run hierarchy-focused tests until green.

### Task 3: Implement canonical topic aliases

**Objective:** Add alias storage, alias-aware resolution, and canonical topic lookup semantics.

**Files:**

- Modify: `src/core/migrations.ts`
- Modify: `src/core/identity-db.ts`
- Modify: `src/queries/topics.ts`
- Modify: `src/core/utils.ts`
- Modify: `src/types/api.ts`
- Modify: `src/types/domain.ts`
- Modify: `src/types/database.ts`

**Verification:**

- Run alias-focused tests until green.

### Task 4: Implement embedding-backed indexing and semantic search

**Objective:** Add `EmbeddingProvider`, embedding storage, search APIs, and similarity ranking.

**Files:**

- Create: `src/embeddings/provider.ts`
- Create: `src/queries/embeddings.ts`
- Modify: `src/core/migrations.ts`
- Modify: `src/core/identity-db.ts`
- Modify: `src/core/utils.ts`
- Modify: `src/types/api.ts`
- Modify: `src/types/domain.ts`
- Modify: `src/types/database.ts`
- Modify: `src/index.ts`
- Modify: `tests/semantic-search.test.ts` (created in Task 1)

**Verification:**

- Run semantic-search tests until green.

### Task 5: Add dedup-aware ingestion, docs, and full verification

**Objective:** Surface semantic dedup hints during ingestion, document the new APIs, and run the full suite.

**Files:**

- Modify: `src/ingestion/types.ts`
- Modify: `src/core/identity-db.ts`
- Modify: `README.md`
- Modify: `src/index.ts`

**Verification:**

- Run `bun run test && bun run check && bun run build`.
- Confirm that `README.md` documents the new public surface.
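
The similarity ranking in Task 4 and the dedup hints in Task 5 both reduce to scoring vectors in the application layer. The sketch below shows one way that could look using cosine similarity; the plan does not specify the metric, so that choice is an assumption, as are the `cosineSimilarity` and `findDuplicateHints` functions and the `IndexedFact` and `DuplicateHint` types, which are hypothetical names for illustration. Hints are returned rather than acted on, matching the requirement that ingestion should not force automatic merges.

```typescript
// Cosine similarity between two equal-length vectors.
// Returns 0 when either vector has zero magnitude, to avoid dividing by zero.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same dimensions');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Hypothetical in-memory view of a row loaded from `fact_embeddings`.
interface IndexedFact {
  factId: string;
  embedding: number[];
}

// Hypothetical hint shape surfaced to the caller during ingestion.
interface DuplicateHint {
  factId: string;
  similarity: number;
}

// Scores a candidate embedding against already indexed facts and returns
// those at or above the threshold, most similar first. Nothing is merged.
function findDuplicateHints(
  candidate: number[],
  indexed: IndexedFact[],
  threshold: number,
): DuplicateHint[] {
  return indexed
    .map((fact) => ({
      factId: fact.factId,
      similarity: cosineSimilarity(candidate, fact.embedding),
    }))
    .filter((hint) => hint.similarity >= threshold)
    .sort((left, right) => right.similarity - left.similarity);
}
```

A linear scan like this is consistent with the plan's preference for portability over ANN optimization; it costs one pass over all stored embeddings per query, which is likely acceptable for small fact stores and would need revisiting at larger scale.
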