Why Spaceduck
Persistent memory
Hybrid recall (vector + keyword) finds what you said even when you don’t use the same words. Facts are extracted eagerly after every response.
Provider freedom
Swap between local models (llama.cpp, LM Studio) and cloud providers (Bedrock, Gemini, OpenRouter) from the Settings UI — no restart required.
Agentic tools
Web search, browser automation, document scanning, and HTTP fetch. The agent loop chains tool calls automatically.
Multi-channel
Web UI, Desktop app (Tauri), WhatsApp, and CLI. Same memory, same tools, any surface.
How it works
Spaceduck runs a local gateway server that connects your chosen AI model to persistent memory, tools, and channels. Every conversation flows through the gateway. The gateway manages context budgets, extracts facts into long-term memory, embeds them for semantic recall, and orchestrates tool calls.
Key features
- Hybrid recall — Reciprocal Rank Fusion combining vector cosine similarity and FTS5 BM25, with recency decay
- Slot-based identity — name, age, location slots with transactional upsert and automatic deactivation of stale facts
- Contamination guard — assistant-generated text can never overwrite your identity
- Hot-swap providers — change your chat model or embedding model at runtime from the Settings UI or CLI
- Two-server pattern — run chat and embeddings on separate endpoints (local, cloud, or mixed)
- Eager extraction — facts are persisted after every turn, not only at compaction
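To make the hybrid recall bullet concrete, here is a minimal sketch of Reciprocal Rank Fusion with recency decay. It is an illustration under stated assumptions, not Spaceduck's actual code: the function name `rrf_fuse`, the constant `k=60`, and the 30-day half-life are all hypothetical choices, and the two input lists stand in for the results of a vector cosine search and an FTS5 BM25 search.

```python
def rrf_fuse(vector_ranked, keyword_ranked, age_days, k=60, half_life_days=30.0):
    """Fuse two ranked lists of fact IDs, then down-weight older facts.

    vector_ranked / keyword_ranked: fact IDs, best match first.
    age_days: mapping of fact ID to its age in days (missing IDs count as 0).
    k and half_life_days are illustrative defaults, not Spaceduck's settings.
    """
    scores = {}
    for ranking in (vector_ranked, keyword_ranked):
        for rank, fact_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); facts found by both
            # retrievers accumulate two contributions.
            scores[fact_id] = scores.get(fact_id, 0.0) + 1.0 / (k + rank)

    # Exponential recency decay: a fact's score halves every half_life_days.
    decayed = {
        fact_id: score * 0.5 ** (age_days.get(fact_id, 0.0) / half_life_days)
        for fact_id, score in scores.items()
    }
    return sorted(decayed, key=decayed.get, reverse=True)


vector_hits = ["f1", "f2", "f3"]   # hypothetical vector-search order
keyword_hits = ["f2", "f4", "f1"]  # hypothetical keyword-search order

fresh = rrf_fuse(vector_hits, keyword_hits, age_days={})
stale_f2 = rrf_fuse(vector_hits, keyword_hits, age_days={"f2": 60})
```

Because RRF works on ranks rather than raw scores, cosine similarities and BM25 values never need to be normalized against each other. In this sketch, `f2` ranks first when all facts are equally fresh and drops to last once it is 60 days old.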

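The slot-based identity and contamination guard bullets can be sketched with Python's built-in `sqlite3` module. The schema, the `source` column, and the helper names (`open_store`, `upsert_slot`, `current_identity`) are assumptions made for illustration; Spaceduck's real storage layout may differ.

```python
import sqlite3

SLOTS = {"name", "age", "location"}  # slot names taken from the feature list


def open_store():
    """Create an in-memory fact store (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE facts (
               id     INTEGER PRIMARY KEY,
               slot   TEXT NOT NULL,
               value  TEXT NOT NULL,
               source TEXT NOT NULL,
               active INTEGER NOT NULL DEFAULT 1
           )"""
    )
    return db


def upsert_slot(db, slot, value, source):
    """Write a slot value; return False if the guard rejects it."""
    if slot not in SLOTS:
        raise ValueError(f"unknown slot: {slot}")
    # Contamination guard: only text that came from the user may set identity.
    if source != "user":
        return False
    # `with db` commits both statements together or rolls both back.
    with db:
        # Deactivate the stale fact instead of deleting it.
        db.execute(
            "UPDATE facts SET active = 0 WHERE slot = ? AND active = 1", (slot,)
        )
        db.execute(
            "INSERT INTO facts (slot, value, source) VALUES (?, ?, ?)",
            (slot, value, source),
        )
    return True


def current_identity(db):
    """Return the active value for each slot."""
    rows = db.execute("SELECT slot, value FROM facts WHERE active = 1")
    return dict(rows.fetchall())


db = open_store()
upsert_slot(db, "location", "Berlin", "user")
upsert_slot(db, "location", "Lisbon", "user")      # deactivates "Berlin"
upsert_slot(db, "location", "Mars", "assistant")   # rejected by the guard
```

Keeping the deactivated row rather than deleting it preserves history, and wrapping the update and insert in one transaction means a slot is never left with zero or two active values.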