Getting Started
Shinwoo PARK edited this page 2026-05-11 15:02:55 +09:00

This page shows the concrete workflow for using IdentityDB as a structured memory layer.

For the full exported package surface, see API Reference.

1. Connect to a database

IdentityDB supports SQLite, PostgreSQL, MySQL, and MariaDB through Kysely-backed adapters.

In-memory SQLite example

import { IdentityDB } from 'identitydb';

const db = await IdentityDB.connect({
  client: 'sqlite',
  filename: ':memory:',
});
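The same connect() call targets the other supported databases by changing the config. Below is a sketch of a file-backed SQLite connection and a PostgreSQL connection; the PostgreSQL option names are an assumption modeled on Kysely's pg dialect config, so check the API Reference for the exact shape.

```typescript
import { IdentityDB } from 'identitydb';

// File-backed SQLite: persists between runs instead of living in memory.
const fileDb = await IdentityDB.connect({
  client: 'sqlite',
  filename: './memory.db',
});

// PostgreSQL: these option names are an assumption modeled on Kysely's
// pg dialect configuration; consult the API Reference for the real shape.
const pgDb = await IdentityDB.connect({
  client: 'postgres',
  host: 'localhost',
  port: 5432,
  database: 'identitydb',
  user: 'postgres',
  password: 'postgres',
});
```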

2. Initialize the schema

await db.initialize();

This creates the tables IdentityDB needs:

  • spaces
  • topics
  • facts
  • fact_topics
  • topic_relations
  • topic_aliases
  • fact_embeddings

3. Create or use isolated memory spaces

If you want independent memory graphs for different people, tenants, projects, or contexts, use spaces.

await db.upsertSpace({ name: 'A' });
await db.upsertSpace({ name: 'B' });

const spaces = await db.listSpaces();
const alpha = await db.getSpaceByName('A');

Any write or read can then be scoped to a specific space by passing spaceName.

See Memory Spaces for the full model.
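Scoping a write and a read to specific spaces might look like this; the exact placement of spaceName inside each options object is an assumption based on the description above.

```typescript
// Write a fact that only exists inside space 'A'.
await db.addFact({
  spaceName: 'A',
  statement: 'Alpha prefers TypeScript.',
  topics: [{ name: 'TypeScript', category: 'entity', granularity: 'concrete' }],
});

// A read scoped to space 'B' should not see facts written to 'A'.
const topicInB = await db.getTopicByName('TypeScript', {
  spaceName: 'B',
  includeFacts: true,
});
```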

4. Add structured facts directly

Use addFact() when your application already knows the topics it wants to attach.

await db.addFact({
  statement: 'TypeScript is a programming language.',
  topics: [
    {
      name: 'TypeScript',
      category: 'entity',
      granularity: 'concrete',
    },
    {
      name: 'programming language',
      category: 'concept',
      granularity: 'abstract',
    },
  ],
});
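Once stored, the fact is reachable through either of its topics via getTopicByName; whether the returned topic exposes its attached statements under a facts property is an assumption about the return shape.

```typescript
const topic = await db.getTopicByName('TypeScript', { includeFacts: true });

// Assumed shape: the topic row plus its attached fact statements.
console.log(topic?.facts);
```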

5. Model topic hierarchy explicitly

Use linkTopics() when you want hierarchy to be explicit rather than inferred.

await db.linkTopics({
  parentName: 'programming language',
  childName: 'TypeScript',
});

const children = await db.getTopicChildren('programming language');
const lineage = await db.getTopicLineage('TypeScript');

This is useful for reasoning such as:

  • TypeScript is a kind of programming language
  • Bun is a kind of runtime
  • PostgreSQL is a kind of database
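Each of those relationships is a single linkTopics() call. This sketch assumes the parent topics ('runtime', 'database') either already exist or are created on link, which the API Reference should confirm.

```typescript
await db.linkTopics({ parentName: 'runtime', childName: 'Bun' });
await db.linkTopics({ parentName: 'database', childName: 'PostgreSQL' });

// Lineage walks back up the hierarchy, e.g. Bun -> runtime.
const bunLineage = await db.getTopicLineage('Bun');
```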

6. Add aliases for canonical topic resolution

await db.addTopicAlias('TypeScript', 'TS');

const canonicalTopic = await db.getTopicByName('TS', { includeFacts: true });

This keeps one canonical topic row while still allowing alternate spellings or shorthand forms.

7. Ingest free-form text through an extractor

When your application starts from raw text, use ingestStatement().

Deterministic local example

import { NaiveExtractor } from 'identitydb';

await db.ingestStatement('I have worked with TypeScript since 2025.', {
  extractor: new NaiveExtractor(),
});

LLM-backed example

import { LlmFactExtractor } from 'identitydb';

const extractor = new LlmFactExtractor({
  model: {
    async generateText(prompt) {
      return callYourFavoriteLlm(prompt);
    },
  },
  instructions: 'Prefer technology, product, and time topics over generic nouns.',
});

await db.ingestStatement('I have worked with Bun and TypeScript since 2025.', {
  extractor,
});

See Extractors for a deeper explanation of the trade-offs.

8. Index facts and search semantically

IdentityDB keeps semantic search provider-agnostic through an EmbeddingProvider interface.

import type { EmbeddingProvider } from 'identitydb';

const provider: EmbeddingProvider = {
  model: 'example-embedding-v1',
  dimensions: 3,
  async embed(input) {
    if (input.toLowerCase().includes('typescript')) {
      return [1, 0, 0];
    }

    return [0, 1, 0];
  },
};

await db.indexFactEmbeddings({ provider });

const matches = await db.searchFacts({
  query: 'TypeScript experience',
  provider,
  limit: 5,
});

9. Enable duplicate-aware ingestion

If you also provide an embedding provider during ingestion, IdentityDB can check whether a semantically similar fact already exists.

await db.ingestStatement('Bun makes TypeScript tooling fast.', {
  extractor: new NaiveExtractor(),
  embeddingProvider: provider,
  duplicateThreshold: 0.95,
});

If a sufficiently similar fact already exists, IdentityDB can return the existing fact instead of writing a duplicate.
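The duplicateThreshold is a similarity cutoff over embedding vectors. Vector similarity is typically cosine similarity; the helper below is illustrative, not part of the IdentityDB API, and only shows what a score like 0.95 is measuring.

```typescript
// Illustrative helper, not part of the IdentityDB API: cosine similarity
// between two embedding vectors, in [-1, 1] (1 means identical direction).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// With duplicateThreshold: 0.95, two facts whose embeddings score at least
// 0.95 against each other would be treated as duplicates.
cosineSimilarity([1, 0, 0], [1, 0, 0]); // 1
cosineSimilarity([1, 0, 0], [0, 1, 0]); // 0
```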

10. Close the connection

await db.close();

Practical workflow recommendation

A good default integration pattern is:

  1. Start with SQLite in development
  2. Use NaiveExtractor for tests and deterministic local examples
  3. Introduce LlmFactExtractor when you need better topic extraction from messy natural language
  4. Add embeddings only when you actually need semantic retrieval or duplicate detection
  5. Move to PostgreSQL or MySQL/MariaDB later without changing the high-level API
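The first two recommendations condense into a minimal, deterministic sketch; whether NaiveExtractor actually extracts 'TypeScript' as a topic from this particular sentence is an assumption.

```typescript
import { IdentityDB, NaiveExtractor } from 'identitydb';

// Steps 1, 2, 7, and 10 combined: SQLite in memory, deterministic extraction.
const db = await IdentityDB.connect({ client: 'sqlite', filename: ':memory:' });
await db.initialize();

await db.ingestStatement('I have worked with TypeScript since 2025.', {
  extractor: new NaiveExtractor(),
});

// Assumes NaiveExtractor produced a 'TypeScript' topic from the sentence.
const topic = await db.getTopicByName('TypeScript', { includeFacts: true });

await db.close();
```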