Home
Shinwoo PARK edited this page 2026-05-11 15:02:55 +09:00

IdentityDB

IdentityDB exists to make AI memory explicit, queryable, portable, and evolvable.

Most AI applications start by stuffing raw text into prompts, vector stores, or ad-hoc JSON blobs. That works for demos, but it becomes fragile when you need to answer questions like:

  • What facts do we know about a person, product, or project?
  • Which topics are connected by the same statement?
  • Can we distinguish canonical concepts from aliases such as TypeScript and TS?
  • Can we preserve memory across SQLite locally and PostgreSQL or MySQL in production?
  • Can we mix deterministic extraction, LLM-backed extraction, and semantic search without locking into one vendor?

IdentityDB is designed to answer those questions.

Why IdentityDB exists

IdentityDB turns memory into a relational graph with a stable application API:

  • Spaces isolate independent memory graphs, so facts and topics in space A are never linked to those in space B
  • Topics are named nodes such as TypeScript, Bun, 2025, or programming language
  • Facts are statements such as I have worked with TypeScript since 2025.
  • Fact-topic links connect one fact to many topics, which lets a single statement become a graph edge between concepts
  • Topic relations model explicit hierarchy such as programming language -> TypeScript
  • Topic aliases model canonicalization such as TS -> TypeScript
  • Fact embeddings enable provider-agnostic semantic search and duplicate detection

This gives you a memory system that is easier to inspect than a black-box vector index and easier to evolve than hard-coded prompt state.
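
To make the model above concrete, here is a minimal in-memory sketch in TypeScript. It does not use the IdentityDB package: the names MemorySketch, resolve, upsertTopic, addAlias, addFact, and factsAbout are hypothetical and exist only for this illustration, and the real API may look quite different (see the API Reference page). The sketch covers spaces, topics, facts, fact-topic links, and aliases; topic relations and embeddings are left out.

```typescript
// Hypothetical in-memory model; not the IdentityDB API.
type Topic = { id: number; spaceId: string; name: string };
type Fact = { id: number; spaceId: string; statement: string; topicIds: number[] };

class MemorySketch {
  private topics: Topic[] = [];
  private facts: Fact[] = [];
  // Alias lookup keyed by "<spaceId>\u0000<alias>", so aliases never cross spaces.
  private aliases = new Map<string, number>();
  private nextId = 1;

  private key(spaceId: string, label: string): string {
    return `${spaceId}\u0000${label.toLowerCase()}`;
  }

  // Resolve a name or alias to its canonical topic inside one space.
  resolve(spaceId: string, label: string): Topic | undefined {
    const aliasTarget = this.aliases.get(this.key(spaceId, label));
    return this.topics.find(
      (t) =>
        t.spaceId === spaceId &&
        (t.id === aliasTarget || t.name.toLowerCase() === label.toLowerCase()),
    );
  }

  // Return the canonical topic, creating it when it does not exist yet.
  upsertTopic(spaceId: string, label: string): Topic {
    const existing = this.resolve(spaceId, label);
    if (existing) return existing;
    const topic: Topic = { id: this.nextId++, spaceId, name: label };
    this.topics.push(topic);
    return topic;
  }

  addAlias(spaceId: string, alias: string, canonical: string): void {
    this.aliases.set(this.key(spaceId, alias), this.upsertTopic(spaceId, canonical).id);
  }

  // One fact links to many topics; every link stays inside the fact's space.
  addFact(spaceId: string, statement: string, labels: string[]): Fact {
    const topicIds = labels.map((label) => this.upsertTopic(spaceId, label).id);
    const fact: Fact = { id: this.nextId++, spaceId, statement, topicIds };
    this.facts.push(fact);
    return fact;
  }

  factsAbout(spaceId: string, label: string): Fact[] {
    const topic = this.resolve(spaceId, label);
    if (!topic) return [];
    return this.facts.filter(
      (f) => f.spaceId === spaceId && f.topicIds.some((id) => id === topic.id),
    );
  }
}

const memory = new MemorySketch();
memory.addAlias("A", "TS", "TypeScript");
memory.addFact("A", "I have worked with TypeScript since 2025.", ["I", "TypeScript", "2025"]);

const viaAlias = memory.factsAbout("A", "TS");   // one fact, found through the alias
const otherSpace = memory.factsAbout("B", "TS"); // empty: space B knows nothing about space A
```

Scoping every lookup by spaceId is what keeps the two graphs separate in this sketch: the alias TS resolves in space A but not in space B.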

What the package can do today

  • Connect to SQLite, PostgreSQL, MySQL, and MariaDB
  • Initialize the required schema automatically
  • Add facts and topics directly through a typed API
  • Split memory into hard-isolated spaces so one tenant, person, or project cannot accidentally connect to another
  • Ingest free-form text through pluggable extractors
  • Resolve aliases to canonical topics
  • Traverse parent/child topic relationships
  • Index facts with embeddings for semantic retrieval
  • Reuse an existing fact when semantic duplicate detection says a new statement is effectively the same memory
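
The last capability, reusing a fact when a new statement is a semantic duplicate, can be sketched with cosine similarity over embeddings. This is an illustration under stated assumptions, not the package's implementation: addOrReuse, the Embedder type, the 0.9 threshold, and the hand-written toy vectors are all hypothetical. A real embedding provider would normally be asynchronous and return much longer vectors.

```typescript
// Hypothetical sketch of duplicate detection; not the IdentityDB API.
type StoredFact = { statement: string; embedding: number[] };
type Embedder = (text: string) => number[]; // stand-in for any embedding provider

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
  });
  b.forEach((y) => {
    normB += y * y;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Reuse the most similar stored fact when it clears the threshold; otherwise store a new one.
function addOrReuse(
  store: StoredFact[],
  statement: string,
  embed: Embedder,
  threshold = 0.9, // assumed value, tune per embedding model
): { fact: StoredFact; reused: boolean } {
  const embedding = embed(statement);
  let best: StoredFact | undefined;
  let bestScore = -1;
  for (const candidate of store) {
    const score = cosineSimilarity(embedding, candidate.embedding);
    if (score > bestScore) {
      best = candidate;
      bestScore = score;
    }
  }
  if (best && bestScore >= threshold) return { fact: best, reused: true };
  const fact: StoredFact = { statement, embedding };
  store.push(fact);
  return { fact, reused: false };
}

// Toy vectors written by hand so the example is deterministic.
const toyVectors: Record<string, number[]> = {
  "I have worked with TypeScript since 2025.": [1, 0, 0.1],
  "I've been using TypeScript since 2025.": [0.98, 0.05, 0.12],
  "Bun is a JavaScript runtime.": [0, 1, 0],
};
const toyEmbed: Embedder = (text) => toyVectors[text] ?? [0, 0, 0];

const store: StoredFact[] = [];
const first = addOrReuse(store, "I have worked with TypeScript since 2025.", toyEmbed);
const second = addOrReuse(store, "I've been using TypeScript since 2025.", toyEmbed);
const third = addOrReuse(store, "Bun is a JavaScript runtime.", toyEmbed);
// The reworded statement reuses the first fact; the unrelated one is stored separately.
```

Because the embedder is just a function argument, the same logic works with any provider, which is the sense in which the duplicate check is provider-agnostic here.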

Core idea in one example

The fact:

I have worked with TypeScript since 2025.

can connect all of these topics at once:

  • I
  • TypeScript
  • 2025

That means IdentityDB can answer more than plain keyword lookup. It can tell you:

  • which facts connect TypeScript and 2025
  • which topics are related to TypeScript
  • which alias should resolve to the same canonical topic
  • which facts are semantically similar even if the wording changes
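
The first two questions above reduce to simple queries over fact-topic links. The sketch below models the links as plain rows, much like a join table. The row shapes, the function names factsConnecting and relatedTopics, and the second sample fact are assumptions made for illustration; they are not taken from the IdentityDB schema or API.

```typescript
// Hypothetical rows; the real schema may differ.
type FactRow = { id: number; statement: string };
type LinkRow = { factId: number; topic: string };

const factRows: FactRow[] = [
  { id: 1, statement: "I have worked with TypeScript since 2025." },
  { id: 2, statement: "Bun runs TypeScript files directly." }, // illustrative extra fact
];

const linkRows: LinkRow[] = [
  { factId: 1, topic: "I" },
  { factId: 1, topic: "TypeScript" },
  { factId: 1, topic: "2025" },
  { factId: 2, topic: "Bun" },
  { factId: 2, topic: "TypeScript" },
];

// Facts that are linked to every one of the given topics.
function factsConnecting(topics: string[]): FactRow[] {
  return factRows.filter((fact) =>
    topics.every((topic) =>
      linkRows.some((link) => link.factId === fact.id && link.topic === topic),
    ),
  );
}

// Topics that share at least one fact with the given topic.
function relatedTopics(topic: string): string[] {
  const factIds = new Set(
    linkRows.filter((link) => link.topic === topic).map((link) => link.factId),
  );
  const related = new Set<string>();
  for (const link of linkRows) {
    if (factIds.has(link.factId) && link.topic !== topic) related.add(link.topic);
  }
  return Array.from(related);
}

const connecting = factsConnecting(["TypeScript", "2025"]); // only fact 1
const related = relatedTopics("TypeScript"); // ["I", "2025", "Bun"]
```

In a relational backend the same two questions would typically become joins on the link table, which is why a single statement can act as an edge between several concepts.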

Wiki pages

  • Getting Started: installation, initialization, and concrete examples
  • API Reference: the exported package surface, method signatures, and return types
  • Memory Spaces: how to keep separate memory graphs isolated
  • Extractors: when to use NaiveExtractor vs LlmFactExtractor

Repository

Current direction

IdentityDB is still an MVP under active development, but its current shape is already useful for:

  • structured long-term memory for agents
  • knowledge capture from conversations
  • portable memory graphs across databases
  • inspectable semantic memory systems