reflex-server-0.2.1 has been yanked.
reflex-server
reflex-server owns the HTTP gateway (Axum) and builds the reflex binary.
It depends on the core library crate (reflex-cache) for the cache tiers, storage, embeddings, and vector DB integration.
Run (Dev)
# Requires Qdrant running (gRPC on 6334)
RUST_LOG=debug
Run (Release)
Docker Compose
From repo root:
Endpoints
GET /healthzGET /readyPOST /v1/chat/completions(OpenAI-compatible)
Response Status
Responses include X-Reflex-Status:
hit-l1-exact: exact request matchhit-l3-verified: semantic hit verified by L3miss: forwarded to provider and stored
The response choices[].message.content is Tauq-encoded.
Configuration
Most commonly used env vars:
| Variable | Default | Notes |
|---|---|---|
REFLEX_PORT |
8080 |
HTTP port |
REFLEX_BIND_ADDR |
127.0.0.1 |
Bind address |
REFLEX_QDRANT_URL |
http://localhost:6334 |
Qdrant gRPC |
REFLEX_STORAGE_PATH |
./.data |
Storage base path |
REFLEX_L1_CAPACITY |
10000 |
L1 capacity |
REFLEX_MODEL_PATH |
(unset) | Unset = stub embedder |
REFLEX_RERANKER_PATH |
(unset) | Optional reranker |
REFLEX_RERANKER_THRESHOLD |
0.70 |
L3 threshold |
REFLEX_MOCK_PROVIDER |
(unset) | Set to bypass real provider calls |
Point Your Agent
Python (openai SDK)
=
=
GPU Features
Build/run with one of:
--features metal(Apple Silicon)--features cuda(NVIDIA)--features cpu(docs.rs-style CPU flag; usually you can omit)
Example:
Tests
Real integration tests (needs Qdrant):
REFLEX_QDRANT_URL=http://localhost:6334