What it’s good for
The easiest local setup. LM Studio has a built-in model browser, one-click downloads, and runs an OpenAI-compatible server out of the box. Great if you want local models without managing GGUF files and command-line flags.
Requirements
- LM Studio installed
- A model downloaded through LM Studio’s model browser
- The local server started in LM Studio (Developer tab)
Configure in Spaceduck
Chat
Start the LM Studio server
- Open LM Studio
- Load a model (e.g., Qwen3, Llama 3.1, DeepSeek)
- Go to the Developer tab
- Click Start Server — it runs at http://localhost:1234 by default
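Once the server is started, you can confirm it is reachable before touching Spaceduck's settings. A minimal sketch in Python against LM Studio's OpenAI-compatible /v1/models endpoint (the default port is assumed; adjust if you changed it):

```python
import json
import urllib.request
import urllib.error

BASE_URL = "http://localhost:1234/v1"  # LM Studio's default server address

def list_models(base_url: str = BASE_URL):
    """Return the model identifiers the server reports, or None if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/models", timeout=5) as resp:
            data = json.load(resp)
        return [m["id"] for m in data.get("data", [])]
    except (urllib.error.URLError, OSError):
        # ECONNREFUSED here usually means Start Server wasn't clicked
        return None

print(list_models())
```

If this prints None, start the server from the Developer tab and run it again; the identifiers it prints are exactly what you'll enter as the model name below.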
Configure Spaceduck
In Settings > Chat:
- Provider: LM Studio
- Base URL: http://localhost:1234/v1
- Model: the model identifier shown in LM Studio (e.g., qwen/qwen3-4b-thinking-2507)
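With those settings, requests go to the standard OpenAI-style chat endpoint. A hedged sketch of the same request in Python (the model identifier is an example; substitute whatever /v1/models reports):

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234/v1"  # LM Studio's default server address

def build_chat_request(prompt: str, model: str, base_url: str = BASE_URL):
    """Build an OpenAI-style /chat/completions request for LM Studio."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            # LM Studio ignores the key; the header just needs to be present.
            "Authorization": "Bearer lm-studio",
        },
    )

def chat(prompt: str, model: str) -> str:
    """Send one chat turn and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, model), timeout=120) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

The dummy Authorization header mirrors what Spaceduck sends, as described below.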
LM Studio doesn’t require an API key by default. Spaceduck sends a dummy key (lm-studio) for servers that require the Authorization header to be present.
Embeddings
LM Studio can serve embeddings from the same server, or you can run a second instance on a different port.
Load an embedding model
In LM Studio, load an embedding model alongside your chat model, or start a second server instance on a different port.
Configure Spaceduck
In Settings > Memory:
- Toggle Semantic recall on
- Provider: LM Studio
- Server URL: http://localhost:1234/v1 (same server, or a different port if running separately)
- Model: your embedding model identifier
- Dimensions: match the model (e.g., 768 for nomic, 1024 for large models)
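To sanity-check the embedding setup, you can query the /v1/embeddings endpoint directly and compare the vector length against the Dimensions setting. A sketch (the model name and 768 dimensions in the usage comment are examples; match them to your model):

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234/v1"  # LM Studio's default server address

def embed(texts, model, base_url=BASE_URL):
    """Return one embedding vector per input text from /v1/embeddings."""
    payload = {"model": model, "input": texts}
    req = urllib.request.Request(
        f"{base_url}/embeddings",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        body = json.load(resp)
    return [item["embedding"] for item in body["data"]]

# Example (hypothetical model name):
# vectors = embed(["hello"], "nomic-embed-text")
# len(vectors[0]) should equal the Dimensions value you configured (e.g., 768)
```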
Test and troubleshoot
| Problem | Cause | Fix |
|---|---|---|
| ECONNREFUSED on port 1234 | LM Studio server not started | Go to Developer tab and click Start Server |
| Model not found | Model identifier doesn’t match | Check curl localhost:1234/v1/models for the exact name |
| Slow responses | Model too large for available RAM | Try a smaller model or lower quantization |
| `<think>` tags in output | Thinking model (Qwen3, DeepSeek) | Spaceduck strips these automatically — this is expected |
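For reference, stripping reasoning tags is a simple text transform. A minimal sketch of the kind of cleanup involved (illustrative only, not Spaceduck's actual implementation):

```python
import re

# Matches a <think>...</think> block, including an unclosed trailing one.
THINK_RE = re.compile(r"<think>.*?(?:</think>|\Z)", re.DOTALL)

def strip_think(text: str) -> str:
    """Remove <think> reasoning blocks emitted by models like Qwen3 or DeepSeek."""
    return THINK_RE.sub("", text).strip()

print(strip_think("<think>chain of thought</think>The answer is 4."))  # The answer is 4.
```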
