Open Source · MIT Licensed

The persistence and retrieval layer your AI pipeline is missing.

Your AI pipeline deserves a purpose-built persistence layer. RecallDB is an opinionated schema on top of pgvector that stores embeddings alongside rich metadata, labels, tags, and raw content — all behind a single REST API.

terminal
$ docker compose up
# API at localhost:8600, Dashboard at localhost:8601

$ curl -X PUT http://localhost:8600/v1.0/tenants/ten_default/collections/col_default/documents \
  -H "Authorization: Bearer recalldbadmin" \
  -H "Content-Type: application/json" \
  -d '{
    "DocumentId": "readme-guide",
    "ContentType": "Text",
    "Content": "RecallDB stores embeddings alongside rich metadata.",
    "Embeddings": [0.1, 0.2, 0.3],
    "Labels": ["documentation", "guide"],
    "Tags": [{"Key": "source", "Value": "readme"}]
  }'

40+ REST Endpoints
5 Distance Metrics
3 SDKs
10 Filter Operators
9 Content Types
3 Search Modes

You've built this before.
You shouldn't have to again.

Bolt pgvector onto Postgres

Install extensions, write migration scripts, design a schema — all before you can store your first embedding.

Hand-roll a chunking schema

Documents, chunks, positions, metadata tables. Every team builds their own, and it's hard to carry forward to the next project.

Build an API layer with auth

REST endpoints, bearer tokens, multi-tenant isolation, RBAC. Important work, but it's undifferentiated infrastructure for your team.

Maintain it all yourself

The result works, but it's tightly coupled to one project — and the next project needs something slightly different.

Store the complete context your AI needs.

Most vector databases focus on embeddings alone. RecallDB builds on pgvector and stores everything your retrieval pipeline needs to find, rank, and act on information: embeddings, raw content, metadata, and chunk positions.

Vector Embeddings

Semantic similarity search via pgvector with dedicated HNSW indexing per collection. No noisy-neighbor problems.

Raw Content

Full text, code, tables, lists, hyperlinks, binary data, images. Your content lives right next to its embeddings.

9 Content Types

Text, HTML, JSON, XML, CSV, SQL, code, table, and binary. Your retrieval pipeline knows what it's looking at.

Chunk Positions

Ordered document segments with document_id + position grouping. Reconstruct full documents or navigate by chunk.
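As a rough illustration of the grouping described above, the sketch below reassembles documents from chunk records on the client side. The `Position` key and the shape of each chunk record are assumptions made for illustration (only `DocumentId` and `Content` appear in the request examples on this page), so adjust them to whatever the API actually returns.

```python
from collections import defaultdict


def reconstruct_documents(chunks):
    """Group chunk records by document and join their content in position order.

    Assumes each chunk is a dict with the hypothetical keys
    "DocumentId", "Position", and "Content".
    """
    grouped = defaultdict(list)
    for chunk in chunks:
        grouped[chunk["DocumentId"]].append(chunk)

    documents = {}
    for document_id, parts in grouped.items():
        # Sort by position so out-of-order results are stitched back correctly.
        parts.sort(key=lambda c: c["Position"])
        documents[document_id] = "\n".join(p["Content"] for p in parts)
    return documents


chunks = [
    {"DocumentId": "readme-guide", "Position": 1, "Content": "Second chunk."},
    {"DocumentId": "readme-guide", "Position": 0, "Content": "First chunk."},
    {"DocumentId": "faq", "Position": 0, "Content": "Only chunk."},
]
print(reconstruct_documents(chunks))
```

Navigating by chunk works the same way: keep the sorted list per document instead of joining it, and index into it by position.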

Multi-Modal Search

Query across content, metadata (labels and key-value tags with 10 operators), and vector embeddings — all in a single compound request.

SHA256 Hashes & ETags

Deduplication, cache invalidation, and change detection out of the box. No extra plumbing required.
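How RecallDB computes its hashes and ETags internally is not described here. The sketch below only shows the general idea on the client side: hash each piece of content with SHA-256 and skip anything whose digest has already been seen. The helper names are hypothetical.

```python
import hashlib


def content_hash(content: str) -> str:
    """Return the hex SHA-256 digest of the UTF-8 encoded content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def drop_duplicates(documents):
    """Keep the first document for each distinct content hash."""
    seen = set()
    unique = []
    for doc in documents:
        digest = content_hash(doc["Content"])
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


docs = [
    {"DocumentId": "doc1", "Content": "Hello world"},
    {"DocumentId": "doc2", "Content": "Hello world"},  # same content as doc1
    {"DocumentId": "doc3", "Content": "Something else"},
]
print([d["DocumentId"] for d in drop_duplicates(docs)])
```

The same digest can serve as a change detector: if the stored hash for a document differs from the hash of the content you are about to send, the content has changed.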

Built on Postgres. Powered by pgvector.

No exotic infrastructure. No proprietary lock-in. Just Postgres with the pgvector extension, wrapped in an opinionated schema and a clean REST API. Access your database directly at any time.

Your Application
C# / Python / JavaScript SDK
RecallDB API Server
40+ REST endpoints · Auth · Multi-tenant
React Dashboard
Visual management · Query builder
PostgreSQL + pgvector
Per-collection tables · HNSW indexes · Labels · Tags

Per-Collection Isolation

Each collection creates its own Postgres tables with a dedicated HNSW vector index. Labels and tags are stored in separate relational tables, keeping the vector index lean.

Multi-Tenant by Design

Tenants, users, credentials, and collections are fully scoped. One deployment serves many clients with complete data isolation between tenants.

Bring Your Own Embeddings

No vendor lock-in. Use OpenAI, Cohere, Ollama, Voyage, or anything that outputs a float array. RecallDB stores and indexes them all the same.
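Because the only contract is "a float array", a thin normalization step can sit between any embedding provider and RecallDB. The sketch below is one possible shape for that step. The helper names are hypothetical, and the document fields mirror the curl example at the top of this page.

```python
import math


def to_embedding(values):
    """Convert any iterable of numbers into a plain list of finite floats."""
    embedding = [float(v) for v in values]
    if not embedding:
        raise ValueError("embedding must not be empty")
    if not all(math.isfinite(v) for v in embedding):
        raise ValueError("embedding contains NaN or infinity")
    return embedding


def build_document(document_id, content, values, labels=()):
    """Build a request body using the field names from the curl example."""
    return {
        "DocumentId": document_id,
        "ContentType": "Text",
        "Content": content,
        "Embeddings": to_embedding(values),
        "Labels": list(labels),
    }


# A tuple, a list, or an array-like from any provider all end up the same.
print(build_document("doc1", "Hello world", (0.1, 0.2, 0.3), ["guide"]))
```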

The art of the possible with RecallDB.

Conversational AI & Chatbots

Store conversation history, knowledge base articles, and FAQ embeddings. Retrieve contextually relevant answers with compound filters on topic labels and recency.

Document Intelligence

Chunk PDFs, contracts, and reports with position tracking. Search across thousands of documents with label-scoped vector queries and tag-based metadata filters.

Conversational Memory

Persist chat history with per-session labels and user metadata tags. Retrieve relevant past exchanges with vector similarity, full-text relevance, or hybrid search and temporal filters for context-aware responses.

Multi-Modal Content

Store image embeddings, audio transcripts, and video descriptions alongside their raw content. Typed content categories let your pipeline handle each format correctly.

Multi-Tenant SaaS

Build AI-powered features for your SaaS product. Each customer gets fully isolated tenants with their own collections, users, and credentials. One deployment, many clients.

Enterprise Knowledge Bases

Centralize organizational knowledge with rich metadata. Filter by department, access level, document type, and date range while leveraging semantic, full-text, or hybrid search.

Everything you need. Nothing you don't.

Multi-tenant isolation

Tenants, users, credentials, and collections are fully scoped.

Per-collection HNSW indexes

Dedicated vector indexes with no noisy-neighbor problems.

5 distance metrics

Cosine, Euclidean, inner product — similarity and distance variants.

Compound search queries

Vector, full-text, and hybrid search + labels + tags + terms + dates in one request.

Bring your own embeddings

OpenAI, Cohere, Ollama, or any float array. No vendor lock-in.

Batch document operations

Bulk ingest documents in a single API call for high-throughput pipelines.

React dashboard

Manage tenants, collections, and documents visually. Search with a query builder.

Docker Compose deployment

Postgres + pgvector, API server, and dashboard in one command.
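For readers comparing the distance metrics named in the feature list above (cosine, Euclidean, inner product), the sketch below spells out the underlying formulas in plain Python. It is illustrative only: which five variants RecallDB exposes, and what it calls them beyond the `CosineSimilarity` value shown in the examples, is not specified on this page.

```python
import math


def inner_product(a, b):
    """Sum of element-wise products; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Inner product of the two vectors after scaling each to unit length."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return inner_product(a, b) / (norm_a * norm_b)


def cosine_distance(a, b):
    """One minus cosine similarity, the convention pgvector uses."""
    return 1.0 - cosine_similarity(a, b)


query = [0.1, 0.2, 0.3]
candidate = [0.2, 0.1, 0.3]
print(cosine_similarity(query, candidate))
print(euclidean_distance(query, candidate))
print(inner_product(query, candidate))
```

Similarity scores rank higher-is-better while distances rank lower-is-better, which matters when choosing a threshold such as `MinimumScore`.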

Drop into your stack in minutes.

Typed clients for C#, Python, and JavaScript. Full CRUD and search operations with zero boilerplate.

terminal
$ curl -X POST http://localhost:8600/v1.0/tenants/ten_default/collections/col_default/search \
  -H "Authorization: Bearer recalldbadmin" \
  -H "Content-Type: application/json" \
  -d '{
    "Vector": {
      "SearchType": "CosineSimilarity",
      "Embeddings": [0.1, 0.2, 0.3],
      "MinimumScore": 0.7
    },
    "LabelFilter": {
      "Required": ["documentation"]
    },
    "Terms": {
      "Required": ["metadata"]
    },
    "MaxResults": 10
  }'

Program.cs
using RecallDb.Sdk;
using RecallDb.Sdk.Models;

var client = new RecallDbClient("http://localhost:8600", "recalldbadmin");

// Store a document with embeddings and metadata
await client.CreateDocumentAsync("ten_default", "col_default", new DocumentRecord
{
    DocumentId = "readme-guide",
    ContentType = "Text",
    Content = "RecallDB stores embeddings alongside rich metadata.",
    Embeddings = new List<float> { 0.1f, 0.2f, 0.3f },
    Labels = new List<string> { "documentation", "guide" }
});

// Search with compound filters
SearchResult results = await client.SearchAsync("ten_default", "col_default", new SearchQuery
{
    Vector = new VectorQuery
    {
        SearchType = "CosineSimilarity",
        Embeddings = new List<float> { 0.1f, 0.2f, 0.3f },
        MinimumScore = 0.7
    },
    MaxResults = 10
});

main.py
from recalldb_sdk import RecallDbClient

client = RecallDbClient("http://localhost:8600", "recalldbadmin")

# Store a document with embeddings and metadata
client.create_document("ten_default", "col_default", {
    "DocumentId": "readme-guide",
    "ContentType": "Text",
    "Content": "RecallDB stores embeddings alongside rich metadata.",
    "Embeddings": [0.1, 0.2, 0.3],
    "Labels": ["documentation", "guide"]
})

# Search with compound filters
results = client.search("ten_default", "col_default", {
    "Vector": {
        "SearchType": "CosineSimilarity",
        "Embeddings": [0.1, 0.2, 0.3],
        "MinimumScore": 0.7
    },
    "MaxResults": 10
})

index.js
const { RecallDbClient } = require('recalldb-sdk');

const client = new RecallDbClient('http://localhost:8600', 'recalldbadmin');

// Top-level await is not available in CommonJS, so run inside an async function
async function main() {
  // Store a document with embeddings and metadata
  await client.createDocument('ten_default', 'col_default', {
    DocumentId: 'readme-guide',
    ContentType: 'Text',
    Content: 'RecallDB stores embeddings alongside rich metadata.',
    Embeddings: [0.1, 0.2, 0.3],
    Labels: ['documentation', 'guide']
  });

  // Search with compound filters
  const results = await client.search('ten_default', 'col_default', {
    Vector: {
      SearchType: 'CosineSimilarity',
      Embeddings: [0.1, 0.2, 0.3],
      MinimumScore: 0.7,
    },
    MaxResults: 10,
  });
  console.log(results);
}

main().catch(console.error);

40+ endpoints. Full CRUD. OpenAPI & Swagger.

A complete REST API covering every resource in the system. Bearer token authentication, admin and user scopes, and full OpenAPI/Swagger documentation out of the box.

Tenants

GET /v1.0/tenants
PUT /v1.0/tenants
GET /v1.0/tenants/{id}
DELETE /v1.0/tenants/{id}

Collections

GET /v1.0/.../collections
PUT /v1.0/.../collections
GET /v1.0/.../collections/{id}
DELETE /v1.0/.../collections/{id}

Documents

GET /v1.0/.../documents
PUT /v1.0/.../documents
POST /v1.0/.../documents/batch
DELETE /v1.0/.../documents/{key}

Search

POST /v1.0/.../search
POST /v1.0/.../enumerate

Users & Credentials

GET /v1.0/.../users
PUT /v1.0/.../credentials
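For the batch endpoint listed above, a client usually needs to split a large ingest into fixed-size groups first. The sketch below shows only that splitting step. The request body the batch endpoint expects and any server-side batch size limit are not documented on this page, so the batch size and the helper name `in_batches` are assumptions.

```python
def in_batches(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


documents = [
    {
        "DocumentId": f"doc{i}",
        "ContentType": "Text",
        "Content": f"Chunk {i}",
        "Embeddings": [0.1, 0.2, 0.3],
    }
    for i in range(5)
]

for batch in in_batches(documents, 2):
    # Each batch would become the payload of one POST to .../documents/batch.
    print([doc["DocumentId"] for doc in batch])
```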

Up and running in 60 seconds.

1. Clone & Launch

$ git clone https://github.com/jchristn/RecallDB.git
$ cd RecallDB/docker
$ docker compose up

Postgres + pgvector, the API server, and the React dashboard all come up together.

2. Store Documents

$ curl -X PUT \
  localhost:8600/v1.0/.../documents \
  -H "Authorization: Bearer recalldbadmin" \
  -H "Content-Type: application/json" \
  -d '{
    "DocumentId": "doc1",
    "ContentType": "Text",
    "Content": "Hello world",
    "Embeddings": [0.1, 0.2, 0.3]
  }'

A default tenant and collection are created on first boot. Start storing documents immediately.

3. Search

$ curl -X POST \
  localhost:8600/v1.0/.../search \
  -H "Authorization: Bearer recalldbadmin" \
  -H "Content-Type: application/json" \
  -d '{
    "Vector": {
      "SearchType": "CosineSimilarity",
      "Embeddings": [0.1, 0.2, 0.3],
      "MinimumScore": 0.5
    },
    "MaxResults": 10
  }'

Combine vector similarity, full-text relevance, or hybrid search with labels, tags, terms, and date filters in a single query.
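If you would rather not install an SDK, the same calls can be made with Python's standard library. The sketch below builds the search request from the curl example without sending it. The full path uses the default tenant and collection IDs shown earlier on this page, and the helper name `build_request` is hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8600"
API_KEY = "recalldbadmin"  # default admin key from the quick start


def build_request(method, path, body):
    """Build (but do not send) an authenticated JSON request."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(body).encode("utf-8"),
        method=method,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )


search = build_request(
    "POST",
    "/v1.0/tenants/ten_default/collections/col_default/search",
    {
        "Vector": {
            "SearchType": "CosineSimilarity",
            "Embeddings": [0.1, 0.2, 0.3],
            "MinimumScore": 0.5,
        },
        "MaxResults": 10,
    },
)
print(search.get_method(), search.full_url)

# To send it against a running server (assuming the response body is JSON):
# with urllib.request.urlopen(search) as response:
#     print(json.load(response))
```

Swapping the method to `PUT` and the path suffix to `/documents` gives the equivalent of the store step.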

Default Credentials

Admin API Key: recalldbadmin
User Login: admin@recall / password
Bearer Token: default
Server: http://localhost:8600
Dashboard: http://localhost:8601

Spend your time on what makes your product unique.

RecallDB gives you a complete persistence and retrieval architecture out of the box — so you can focus on building the features that matter.