docs: document topic alias and semantic search APIs
67
README.md
@@ -12,12 +12,15 @@ IdentityDB stores memory as a graph made of:
 
 A single fact like `I have worked with TypeScript since 2025.` can connect the topics `I`, `TypeScript`, and `2025` at the same time.
 
-## Current foundation capabilities
+## Current capabilities
 
 - SQLite, PostgreSQL, MySQL, and MariaDB connection adapters
-- Automatic schema initialization for `topics`, `facts`, and `fact_topics`
+- Automatic schema initialization for `topics`, `facts`, `fact_topics`, `topic_relations`, `topic_aliases`, and `fact_embeddings`
 - High-level APIs for adding topics and facts
-- Query APIs for listing topics, loading topic-scoped facts, and finding connected facts/topics
+- Topic hierarchy APIs for parent/child traversal and lineage lookup
+- Topic alias and canonical resolution APIs so facts and queries can resolve alternate names
+- Semantic fact indexing and search APIs built around provider-agnostic embeddings
+- Dedup-aware ingestion hooks that can reuse an existing fact when a semantic near-duplicate is detected
 - Pluggable fact extraction so callers can use a small LLM or a deterministic extractor
 
 ## Install
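Note on the alias bullet in the hunk above: the diff does not show how IdentityDB resolves an alias to its canonical topic. The sketch below is only an illustration of the general idea, a lookup table from alternate names to one canonical name. `AliasRegistry`, its methods, and the case-insensitive matching are all hypothetical and are not taken from the IdentityDB source.

```ts
// Hypothetical in-memory model of alias resolution (not IdentityDB's implementation).
class AliasRegistry {
  // Maps a normalized name (canonical or alias) to the canonical topic name.
  private readonly canonicalByKey = new Map<string, string>();

  // Assumption: names are matched case-insensitively after trimming whitespace.
  private static normalize(name: string): string {
    return name.trim().toLowerCase();
  }

  // Registers a canonical topic so that it resolves to itself.
  addTopic(name: string): void {
    this.canonicalByKey.set(AliasRegistry.normalize(name), name);
  }

  // Points an alternate name at an already registered topic.
  addAlias(canonicalName: string, alias: string): void {
    const canonical = this.resolve(canonicalName);
    if (canonical === undefined) {
      throw new Error(`Unknown topic: ${canonicalName}`);
    }
    this.canonicalByKey.set(AliasRegistry.normalize(alias), canonical);
  }

  // Returns the canonical name, or undefined when the name is not known.
  resolve(name: string): string | undefined {
    return this.canonicalByKey.get(AliasRegistry.normalize(name));
  }
}

const registry = new AliasRegistry();
registry.addTopic('TypeScript');
registry.addAlias('TypeScript', 'TS');
console.log(registry.resolve('ts')); // prints "TypeScript"
```

Under this model a lookup by `TS` and a lookup by `TypeScript` reach the same topic, which matches the behavior the README bullet describes.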
@@ -29,7 +32,7 @@ bun install
 ## Quick start
 
 ```ts
-import { IdentityDB, NaiveExtractor } from 'identitydb';
+import { IdentityDB, NaiveExtractor, type EmbeddingProvider } from 'identitydb';
 
 const db = await IdentityDB.connect({
   client: 'sqlite',
@@ -58,15 +61,58 @@ await db.addFact({
   ],
 });
 
-const topic = await db.getTopicByName('TypeScript', { includeFacts: true });
-const connected = await db.findConnectedTopics('TypeScript');
+await db.linkTopics({
+  parentName: 'programming language',
+  childName: 'TypeScript',
+});
 
-console.log(topic?.facts.map((fact) => fact.statement));
+await db.addTopicAlias('TypeScript', 'TS');
+
+const provider: EmbeddingProvider = {
+  model: 'example-embedding-v1',
+  dimensions: 3,
+  async embed(input) {
+    if (input.toLowerCase().includes('typescript')) {
+      return [1, 0, 0];
+    }
+
+    return [0, 1, 0];
+  },
+};
+
+await db.indexFactEmbeddings({ provider });
+
+const topic = await db.getTopicByName('TS', { includeFacts: true });
+const children = await db.getTopicChildren('programming language');
+const lineage = await db.getTopicLineage('TS');
+const connected = await db.findConnectedTopics('TypeScript');
+const matches = await db.searchFacts({
+  query: 'TypeScript experience',
+  provider,
+  limit: 5,
+});
+
+console.log(topic?.name);
+console.log(children.map((entry) => entry.name));
+console.log(lineage.map((entry) => entry.name));
 console.log(connected.map((entry) => [entry.name, entry.sharedFactCount]));
+console.log(matches.map((entry) => [entry.statement, entry.score]));
 
 await db.close();
 ```
 
+## Semantic ingestion and duplicate detection
+
+If you provide an embedding provider during ingestion, IdentityDB can index the new fact automatically and reuse an existing fact when a semantic near-duplicate is already present.
+
+```ts
+await db.ingestStatement('Bun makes TypeScript tooling fast.', {
+  extractor: new NaiveExtractor(),
+  embeddingProvider: provider,
+  duplicateThreshold: 0.95,
+});
+```
+
 ## Development
 
 ```bash
@@ -77,6 +123,9 @@ bun run build
 
 ## Current status
 
-This repository is in active foundation development.
+This repository is in active MVP expansion development.
 
-See `docs/plans/2026-05-11-identitydb-foundation.md` for the current implementation plan.
+See these implementation plans for the current roadmap:
+
+- `docs/plans/2026-05-11-identitydb-foundation.md`
+- `docs/plans/2026-05-11-identitydb-memory-expansion.md`
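Note on `duplicateThreshold: 0.95` in the ingestion example: the README text added by this commit does not say which similarity measure the threshold applies to. A common choice for embedding vectors is cosine similarity, so the sketch below assumes that measure and treats the threshold as a minimum score. `cosineSimilarity`, `findNearDuplicate`, and `StoredEmbedding` are hypothetical names used for illustration and are not part of the IdentityDB API shown in the diff.

```ts
// Assumed shape of an indexed fact embedding (hypothetical, for illustration only).
interface StoredEmbedding {
  factId: string;
  vector: number[];
}

// Cosine similarity of two equal-length vectors; returns 0 if either vector is all zeros.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new Error('Dimension mismatch');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}

// Returns the most similar stored entry whose score reaches the threshold, if any.
function findNearDuplicate(
  candidate: readonly number[],
  stored: readonly StoredEmbedding[],
  threshold: number,
): StoredEmbedding | undefined {
  let best: StoredEmbedding | undefined;
  let bestScore = threshold;
  for (const entry of stored) {
    const score = cosineSimilarity(candidate, entry.vector);
    if (score >= bestScore) {
      best = entry;
      bestScore = score;
    }
  }
  return best;
}

const stored: StoredEmbedding[] = [
  { factId: 'fact-1', vector: [1, 0, 0] },
  { factId: 'fact-2', vector: [0, 1, 0] },
];

console.log(findNearDuplicate([1, 0, 0], stored, 0.95)?.factId); // prints "fact-1"
console.log(findNearDuplicate([1, 1, 0], stored, 0.95)); // prints undefined
```

With the toy provider from the quick start, any statement containing "typescript" embeds to `[1, 0, 0]`, so under these assumptions a second TypeScript statement would score 1.0 against the first and be treated as a near-duplicate.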