API Reference
Complete endpoint documentation for the MemoryRouter API with native support for OpenAI, Anthropic, and Google.
Base URL: https://api.memoryrouter.ai
MemoryRouter provides native endpoints for each major AI provider. Use your provider's SDK with zero code changes—just swap the base URL.
Provider Endpoints
| Provider | Endpoint | SDK Compatibility |
|---|---|---|
| OpenAI, xAI, DeepSeek, Mistral, Cerebras, OpenRouter | POST /v1/chat/completions | OpenAI SDK |
| Anthropic | POST /v1/messages | Anthropic SDK |
| Google Gemini | POST /v1/models/:model:generateContent | Google AI SDK |
OpenAI-Compatible Endpoint
POST /v1/chat/completions
Works with OpenAI SDK and any OpenAI-compatible provider (xAI, DeepSeek, Mistral, Cerebras, OpenRouter).
curl:
curl -X POST https://api.memoryrouter.ai/v1/chat/completions \
-H "Authorization: Bearer mk_xxxxxxxxxxxxxxxx" \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-5.1",
"messages": [{"role": "user", "content": "My name is Alice"}]
}'
Python (OpenAI SDK):
from openai import OpenAI
client = OpenAI(
base_url="https://api.memoryrouter.ai/v1",
api_key="mk_xxxxxxxxxxxxxxxx"
)
response = client.chat.completions.create(
model="openai/gpt-5.1",
messages=[{"role": "user", "content": "My name is Alice"}]
)
print(response.choices[0].message.content)
TypeScript:
import OpenAI from 'openai';
const client = new OpenAI({
baseURL: 'https://api.memoryrouter.ai/v1',
apiKey: 'mk_xxxxxxxxxxxxxxxx'
});
const response = await client.chat.completions.create({
model: 'openai/gpt-5.1',
messages: [{ role: 'user', content: 'My name is Alice' }]
});
Provider-Native Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model identifier (e.g., openai/gpt-5.1, anthropic/claude-sonnet-5) |
| messages | array | Yes | Array of message objects |
| stream | boolean | No | Enable streaming |
| temperature | number | No | Sampling temperature (0-2) |
| max_tokens | number | No | Maximum tokens to generate |
| top_p | number | No | Nucleus sampling |
| frequency_penalty | number | No | Frequency penalty (-2 to 2) |
| presence_penalty | number | No | Presence penalty (-2 to 2) |
| stop | string/array | No | Stop sequences |
Response:
{
"id": "chatcmpl-abc123",
"choices": [{
"message": {"role": "assistant", "content": "Nice to meet you, Alice!"},
"finish_reason": "stop"
}],
"usage": {"prompt_tokens": 12, "completion_tokens": 15, "total_tokens": 27}
}
Anthropic Native Endpoint
POST /v1/messages
Native Anthropic format. Use the Anthropic SDK directly—MemoryRouter accepts Anthropic's request format and returns Anthropic's response format unchanged.
curl:
curl -X POST https://api.memoryrouter.ai/v1/messages \
-H "x-api-key: mk_xxxxxxxxxxxxxxxx" \
-H "Content-Type: application/json" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "claude-sonnet-5",
"max_tokens": 1024,
"messages": [{"role": "user", "content": "My name is Alice"}]
}'
Python (Anthropic SDK):
from anthropic import Anthropic
client = Anthropic(
base_url="https://api.memoryrouter.ai",
api_key="mk_xxxxxxxxxxxxxxxx"
)
response = client.messages.create(
model="claude-sonnet-5",
max_tokens=1024,
messages=[{"role": "user", "content": "My name is Alice"}]
)
print(response.content[0].text)
TypeScript:
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic({
baseURL: 'https://api.memoryrouter.ai',
apiKey: 'mk_xxxxxxxxxxxxxxxx'
});
const response = await client.messages.create({
model: 'claude-sonnet-5',
max_tokens: 1024,
messages: [{ role: 'user', content: 'My name is Alice' }]
});
Provider-Native Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Anthropic model (e.g., claude-sonnet-5) |
| messages | array | Yes | Array of message objects |
| max_tokens | number | Yes | Maximum tokens to generate |
| system | string | No | System prompt |
| temperature | number | No | Sampling temperature (0-1) |
| top_p | number | No | Nucleus sampling |
| top_k | number | No | Top-k sampling |
| stop_sequences | array | No | Stop sequences |
| stream | boolean | No | Enable streaming |
Supported Models:
- claude-opus-4.5
- claude-sonnet-5
- claude-haiku-4.5
Response (Native Anthropic format):
{
"id": "msg_abc123",
"type": "message",
"role": "assistant",
"content": [{"type": "text", "text": "Nice to meet you, Alice!"}],
"model": "claude-sonnet-5",
"stop_reason": "end_turn",
"usage": {"input_tokens": 12, "output_tokens": 15}
}
Google Gemini Native Endpoint
POST /v1/models/:model:generateContent
Native Google format. Use Google's AI SDK directly.
curl:
curl -X POST "https://api.memoryrouter.ai/v1/models/gemini-2.5-pro:generateContent" \
-H "Authorization: Bearer mk_xxxxxxxxxxxxxxxx" \
-H "Content-Type: application/json" \
-d '{
"contents": [{"role": "user", "parts": [{"text": "My name is Alice"}]}],
"generationConfig": {"maxOutputTokens": 1024}
}'
Python (Google AI SDK):
import google.generativeai as genai
# Configure with MemoryRouter endpoint
genai.configure(
api_key="mk_xxxxxxxxxxxxxxxx",
transport="rest",
client_options={"api_endpoint": "api.memoryrouter.ai"}
)
model = genai.GenerativeModel("gemini-2.5-pro")
response = model.generate_content("My name is Alice")
print(response.text)
Streaming:
curl -X POST "https://api.memoryrouter.ai/v1/models/gemini-2.5-pro:streamGenerateContent" \
-H "Authorization: Bearer mk_xxxxxxxxxxxxxxxx" \
-H "Content-Type: application/json" \
-d '{
"contents": [{"role": "user", "parts": [{"text": "Tell me a story"}]}]
}'
Provider-Native Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| contents | array | Yes | Array of content objects |
| systemInstruction | object | No | System instruction |
| generationConfig.temperature | number | No | Sampling temperature |
| generationConfig.topP | number | No | Nucleus sampling |
| generationConfig.topK | number | No | Top-k sampling |
| generationConfig.maxOutputTokens | number | No | Max tokens |
| generationConfig.stopSequences | array | No | Stop sequences |
| safetySettings | array | No | Safety settings |
Response (Native Google format):
{
"candidates": [{
"content": {
"parts": [{"text": "Nice to meet you, Alice!"}],
"role": "model"
},
"finishReason": "STOP"
}],
"usageMetadata": {"promptTokenCount": 12, "candidatesTokenCount": 15}
}
Memory Control
Control memory behavior per-request using headers, query parameters, or body parameters. All are stripped before forwarding to the provider.
Headers
| Header | Values | Default | Description |
|---|---|---|---|
| X-Memory-Mode | on, off, read, write | on | Memory operation mode |
| X-Memory-Store | true, false | true | Store user input |
| X-Memory-Store-Response | true, false | true | Store assistant response |
| X-Session-ID | string | — | Group conversations into sessions |
| X-Memory-Key | mk_xxx | — | Alternative auth (use with Authorization: Bearer <provider-key>) |
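These headers can also be set per request from an SDK. A minimal sketch using the OpenAI Python SDK's extra_headers option (header names are from the table above; the key and session ID are placeholders):
from openai import OpenAI

client = OpenAI(
    base_url="https://api.memoryrouter.ai/v1",
    api_key="mk_xxxxxxxxxxxxxxxx"
)

# Read-only request: recall memories, but don't store this exchange
response = client.chat.completions.create(
    model="openai/gpt-5.1",
    messages=[{"role": "user", "content": "What do you know about me?"}],
    extra_headers={
        "X-Memory-Mode": "read",
        "X-Session-ID": "user-123-chat-456"
    }
)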
Query Parameters (shortcuts)
| Parameter | Example | Effect |
|---|---|---|
| ?memory=off | /v1/chat/completions?memory=off | Disable memory entirely |
| ?mode=read | /v1/chat/completions?mode=read | Read-only (don't store this exchange) |
| ?store=false | /v1/chat/completions?store=false | Don't store user input |
Body Parameters
Include in request body—they're stripped before forwarding:
{
"model": "openai/gpt-5.1",
"messages": [...],
"memory": false,
"memory_mode": "read",
"memory_store": false,
"memory_store_response": false,
"session_id": "user-123-chat-456"
}
| Parameter | Type | Description |
|---|---|---|
| memory | boolean | false disables memory |
| memory_mode | string | on, off, read, write |
| memory_store | boolean | Store user input |
| memory_store_response | boolean | Store assistant response |
| session_id | string | Session identifier |
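From the OpenAI Python SDK, nonstandard body fields like these can be sent with the extra_body option; a minimal sketch (key and session ID are placeholders):
from openai import OpenAI

client = OpenAI(
    base_url="https://api.memoryrouter.ai/v1",
    api_key="mk_xxxxxxxxxxxxxxxx"
)

# memory_mode and session_id are stripped before forwarding to the provider
response = client.chat.completions.create(
    model="openai/gpt-5.1",
    messages=[{"role": "user", "content": "One-off question"}],
    extra_body={
        "memory_mode": "read",
        "session_id": "user-123-chat-456"
    }
)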
Memory Modes Explained
| Mode | Retrieve | Store | Use Case |
|---|---|---|---|
| on | ✅ | ✅ | Normal operation (default) |
| read | ✅ | ❌ | Use memory but don't add to it (testing, one-off queries) |
| write | ❌ | ✅ | Store without retrieval (bulk import, backfill) |
| off | ❌ | ❌ | Stateless request (no memory at all) |
Per-Message Memory Control
Exclude specific messages from storage:
{
"messages": [
{"role": "user", "content": "Remember this", "memory": true},
{"role": "user", "content": "Don't remember this", "memory": false}
]
}
Session Management
Sessions group related conversations. Memory is scoped to sessions when X-Session-ID is provided.
With header:
curl -X POST https://api.memoryrouter.ai/v1/chat/completions \
-H "Authorization: Bearer mk_xxx" \
-H "X-Session-ID: user-123-project-456" \
-d '{"model": "openai/gpt-4o", "messages": [...]}'With body parameter:
{
"model": "openai/gpt-5.1",
"messages": [...],
"session_id": "user-123-project-456"
}
How it works:
- Each session gets its own memory space
- Core memory (no session) stores persistent, cross-session context
- Session memory is recalled alongside core memory
- Clear a session without affecting core: DELETE /v1/memory with X-Session-ID (see the sketch below)
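A minimal sketch of that session-scoped clear, using the Python requests library (key and session ID are placeholders):
import requests

# Deletes only this session's memory; core memory is untouched
resp = requests.delete(
    "https://api.memoryrouter.ai/v1/memory",
    headers={
        "Authorization": "Bearer mk_xxxxxxxxxxxxxxxx",
        "X-Session-ID": "user-123-project-456"
    }
)
resp.raise_for_status()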
Response Headers
MemoryRouter adds timing headers to every response:
| Header | Description |
|---|---|
| X-MR-Processing-Ms | Total MemoryRouter processing time |
| X-Provider-Response-Ms | Time waiting for AI provider |
| X-Total-Ms | End-to-end request time |
| X-Session-ID | Echo of session ID (if provided) |
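To read these headers from the OpenAI Python SDK (v1.x), the with_raw_response wrapper exposes the HTTP response alongside the parsed body; a sketch:
from openai import OpenAI

client = OpenAI(
    base_url="https://api.memoryrouter.ai/v1",
    api_key="mk_xxxxxxxxxxxxxxxx"
)

raw = client.chat.completions.with_raw_response.create(
    model="openai/gpt-5.1",
    messages=[{"role": "user", "content": "Hello"}]
)
print(raw.headers.get("X-MR-Processing-Ms"))     # MemoryRouter overhead
print(raw.headers.get("X-Provider-Response-Ms")) # provider latency
completion = raw.parse()  # the usual ChatCompletion object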
Memory Management
GET /v1/memory/stats
Get memory statistics for your key.
curl https://api.memoryrouter.ai/v1/memory/stats \
-H "Authorization: Bearer mk_xxxxxxxxxxxxxxxx"Response:
{
"key": "mk_xxxxxxxxxxxxxxxx",
"memories": 1247,
"total_tokens": 89432,
"oldest": "2024-01-15T10:30:00Z",
"newest": "2024-02-04T15:02:00Z"
}
DELETE /v1/memory
Clear all memory for your key.
# Clear all memory
curl -X DELETE https://api.memoryrouter.ai/v1/memory \
-H "Authorization: Bearer mk_xxxxxxxxxxxxxxxx"
# Clear only a specific session
curl -X DELETE https://api.memoryrouter.ai/v1/memory \
-H "Authorization: Bearer mk_xxxxxxxxxxxxxxxx" \
-H "X-Session-ID: session-123"
# Full reset (complete memory wipe)
curl -X DELETE "https://api.memoryrouter.ai/v1/memory?reset=true" \
-H "Authorization: Bearer mk_xxxxxxxxxxxxxxxx"POST /v1/memory/warmup
Pre-load memory for a faster first request. Useful after cold starts.
curl -X POST https://api.memoryrouter.ai/v1/memory/warmup \
-H "Authorization: Bearer mk_xxxxxxxxxxxxxxxx"With session:
curl -X POST https://api.memoryrouter.ai/v1/memory/warmup \
-H "Authorization: Bearer mk_xxxxxxxxxxxxxxxx" \
-H "X-Session-ID: session-123"Response:
{
"status": "warm",
"key": "mk_xxxxxxxxxxxxxxxx",
"memories_loaded": 1336,
"warmup_ms": 45
}
POST /v1/memory/upload
Bulk import memories from JSONL format.
Requirements: Payment method on file
curl -X POST https://api.memoryrouter.ai/v1/memory/upload \
-H "Authorization: Bearer mk_xxxxxxxxxxxxxxxx" \
-H "Content-Type: text/plain" \
--data-binary @memories.jsonl
JSONL Format (one JSON object per line):
{"content": "User prefers dark mode", "role": "user", "timestamp": 1706900000000}
{"content": "The meeting is scheduled for Friday at 3pm"}
{"content": "Customer is interested in enterprise plan", "role": "assistant"}| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| content | string | Yes | — | The memory text to store |
| role | string | No | user | user, assistant, or system |
| timestamp | number | No | now | Unix timestamp in milliseconds |
Response:
{
"status": "complete",
"memoryKey": "mk_xxxxxxxxxxxxxxxx",
"vault": "core",
"stats": {
"total": 150,
"processed": 150,
"failed": 0
},
"message": "Successfully stored 150 memories"
}
Limits:
- Maximum 10,000 lines per upload
- Split larger files into batches (see the sketch below)
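A sketch of that batching with the Python requests library, splitting a large JSONL file into uploads under the 10,000-line limit (the file name is a placeholder):
import requests

BATCH_SIZE = 10_000  # maximum lines per upload
HEADERS = {
    "Authorization": "Bearer mk_xxxxxxxxxxxxxxxx",
    "Content-Type": "text/plain"
}

with open("memories.jsonl") as f:
    lines = f.read().splitlines()

for i in range(0, len(lines), BATCH_SIZE):
    batch = "\n".join(lines[i:i + BATCH_SIZE])
    resp = requests.post(
        "https://api.memoryrouter.ai/v1/memory/upload",
        headers=HEADERS,
        data=batch.encode("utf-8")
    )
    resp.raise_for_status()
    print(resp.json()["stats"])  # {"total": ..., "processed": ..., "failed": ...}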
Pass-Through Endpoints
These forward directly to providers without memory processing:
| Endpoint | Provider | Description |
|---|---|---|
| POST /v1/audio/transcriptions | OpenAI | Whisper transcription |
| POST /v1/audio/speech | OpenAI | Text-to-speech |
| POST /v1/images/generations | OpenAI | DALL-E image generation |
| POST /v1/embeddings | OpenAI | Text embeddings |
curl -X POST https://api.memoryrouter.ai/v1/audio/transcriptions \
-H "Authorization: Bearer mk_xxxxxxxxxxxxxxxx" \
-F "file=@audio.mp3" \
-F "model=whisper-1"Note: The
/v1/embeddingsendpoint passes through to OpenAI. MemoryRouter handles memory processing internally.
GET /v1/models
List available models based on your configured provider API keys. Returns 90+ models across six providers.
curl https://api.memoryrouter.ai/v1/models \
-H "Authorization: Bearer mk_xxxxxxxxxxxxxxxx"Response:
{
"providers": [
{
"provider": "OpenAI",
"models": ["openai/gpt-5.2", "openai/gpt-5.1", "openai/gpt-4.5", "openai/o3", "openai/o1-pro", "openai/o1", "..."]
},
{
"provider": "Anthropic",
"models": ["anthropic/claude-opus-4.5", "anthropic/claude-sonnet-5", "anthropic/claude-haiku-4.5", "..."]
},
{
"provider": "Google",
"models": ["google/gemini-3-pro", "google/gemini-3-flash", "google/gemini-2.5-pro", "google/gemini-2.5-flash", "..."]
},
{
"provider": "Meta",
"models": ["meta/llama-4-maverick", "meta/llama-4-scout", "meta/llama-3.3-70b", "..."]
},
{
"provider": "Mistral",
"models": ["mistral/mistral-large-3", "mistral/ministral-3-14b", "mistral/mistral-small-3.2", "..."]
},
{
"provider": "xAI",
"models": ["x-ai/grok-4", "x-ai/grok-4-fast", "x-ai/grok-3", "x-ai/grok-3-mini", "..."]
}
],
"models": ["openai/gpt-5.1", "anthropic/claude-sonnet-5", "google/gemini-3-flash", "meta/llama-4-maverick", "..."],
"default": "openai/gpt-5.1",
"catalog_updated": "2026-02-04T19:28:00Z"
}
Supported Providers & Popular Models:
| Provider | Popular Models |
|---|---|
| OpenAI | gpt-5.2, gpt-5.1, gpt-4.5, o3, o1-pro, o1 |
| Anthropic | claude-opus-4.5, claude-sonnet-5, claude-haiku-4.5 |
| Google | gemini-3-pro, gemini-3-flash, gemini-2.5-pro, gemini-2.5-flash |
| Meta | llama-4-maverick, llama-4-scout, llama-3.3-70b |
| Mistral | mistral-large-3, ministral-3-14b, mistral-small-3.2 |
| xAI | grok-4, grok-4-fast, grok-3, grok-3-mini |
Use the full model name with provider prefix (e.g., openai/gpt-5.1, anthropic/claude-sonnet-5).
Account Usage
GET /v1/account/usage
Get token usage for your memory key.
curl "https://api.memoryrouter.ai/v1/account/usage?start=2024-01-01&end=2024-02-01" \
-H "Authorization: Bearer mk_xxxxxxxxxxxxxxxx"Response:
{
"key": "mk_xxxxxxxxxxxxxxxx",
"period": {"start": "2024-01-01", "end": "2024-02-01"},
"total_requests": 1547,
"total_input_tokens": 234567,
"total_output_tokens": 189234,
"total_memory_tokens": 89432,
"by_model": {
"openai/gpt-5.1": {"requests": 1200, "input_tokens": 180000},
"anthropic/claude-sonnet-5": {"requests": 347, "input_tokens": 54567}
}
}
Semantic-Temporal Memory
MemoryRouter uses a semantic-temporal architecture — memories are indexed by meaning and time. Recent context is weighted higher, but important facts persist.
The system automatically balances:
- Immediate context — Current conversation
- Recent history — Last few days of interactions
- Long-term memory — Persistent facts and preferences
No configuration required. The right memories surface at the right time.
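As an illustration only (not MemoryRouter's actual scoring code), retrieval of this kind is commonly modeled as semantic similarity multiplied by an exponential time decay; a toy sketch:
import time

def relevance(similarity: float, stored_at_ms: int, half_life_days: float = 7.0) -> float:
    """Toy semantic-temporal score: a memory's semantic similarity
    (e.g., cosine similarity in [0, 1]) decays with age, so recent
    context ranks higher while strong matches still surface."""
    age_days = (time.time() * 1000 - stored_at_ms) / 86_400_000
    return similarity * 0.5 ** (age_days / half_life_days)

# A week-old strong match still outranks a fresh weak match:
print(relevance(0.95, int(time.time() * 1000) - 7 * 86_400_000))  # ~0.475
print(relevance(0.30, int(time.time() * 1000)))                   # 0.30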
Health Check
GET /health
No authentication required.
curl https://api.memoryrouter.ai/health
Response:
{
"status": "healthy",
"timestamp": "2024-02-03T14:22:00Z"
}
Error Codes
| Code | Meaning | Common Causes |
|---|---|---|
| 400 | Bad Request | Missing required fields, invalid JSON |
| 401 | Unauthorized | Invalid memory key, missing provider API key |
| 402 | Payment Required | No card on file (for upload endpoint) |
| 413 | Payload Too Large | Upload exceeds 10,000 lines |
| 429 | Rate Limited | Too many requests |
| 500 | Internal Error | Server-side issue |
| 502 | Provider Error | Upstream AI provider failed |
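A sketch of handling these statuses with the OpenAI Python SDK's exception types (openai v1.x; the retry strategy is up to you):
import openai
from openai import OpenAI

client = OpenAI(
    base_url="https://api.memoryrouter.ai/v1",
    api_key="mk_xxxxxxxxxxxxxxxx"
)

try:
    response = client.chat.completions.create(
        model="openai/gpt-5.1",
        messages=[{"role": "user", "content": "Hello"}]
    )
except openai.RateLimitError:
    pass  # 429: back off and retry
except openai.APIStatusError as e:
    # Any other non-2xx status; the body follows the error format below
    print(e.status_code, e.response.json().get("error"))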
Error Response Format:
{
"error": "No API key configured for provider: anthropic",
"hint": "Add your anthropic API key in your account settings, or pass X-Provider-Key header"
}
Pass-Through Authentication
For advanced integrations, you can pass your provider API key directly instead of storing it:
curl -X POST https://api.memoryrouter.ai/v1/chat/completions \
-H "X-Memory-Key: mk_xxxxxxxxxxxxxxxx" \
-H "Authorization: Bearer sk-your-openai-key" \
-H "Content-Type: application/json" \
-d '{"model": "openai/gpt-5.1", "messages": [...]}'Or use X-Provider-Key:
curl -X POST https://api.memoryrouter.ai/v1/chat/completions \
-H "Authorization: Bearer mk_xxxxxxxxxxxxxxxx" \
-H "X-Provider-Key: sk-your-openai-key" \
-d '...'
This enables zero-configuration integrations where your existing code already has provider keys.
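The same pattern from the OpenAI Python SDK, as a sketch: the provider key goes where the SDK expects an API key, and the memory key rides along as a default header.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.memoryrouter.ai/v1",
    api_key="sk-your-openai-key",  # your provider key, passed through
    default_headers={"X-Memory-Key": "mk_xxxxxxxxxxxxxxxx"}
)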