Self-Hosting Memanto Server
memanto serve runs the same FastAPI server that powers the REST API. For development, running it from the CLI is enough — but for shared environments you’ll want it under a process manager that survives logouts and restarts.
This page covers Docker, Docker Compose, systemd, and a manual long-running process. All four options work with both backends: set MEMANTO_BACKEND=on-prem to talk to your local Moorcheh server, or leave it at cloud (the default) to use Moorcheh Cloud.
Image: What Memanto Ships
The Memanto repository includes a production-ready Dockerfile:
- Base: python:3.12-slim
- Runs as a non-root user (memanto, UID 1001)
- Exposes port 8000
- Builds dependencies via uv for fast, deterministic installs
- Built-in HEALTHCHECK polling /ready (a lightweight endpoint that does not call Moorcheh)
- Entry point: uvicorn memanto.app.main:app --host 0.0.0.0 --port 8000
Option 1: Docker
Cloud backend
On-prem backend
Memanto’s container needs to reach the Moorcheh on-prem container running on the same host. With Docker Desktop, use host.docker.internal as the hostname; on Linux without Docker Desktop, use --network host instead.
Option 2: Docker Compose
The Memanto repo ships a docker-compose.yml for the cloud backend. Drop in an .env file and you’re done.
Cloud backend
On-prem backend
Extend the compose file to add the Moorcheh server and (optionally) Ollama.
Option 3: systemd (Linux)
For a single-host install without Docker, run Memanto under systemd. Save the unit file as /etc/systemd/system/memanto.service.
For the on-prem backend, the Moorcheh server (started with moorcheh up) should be managed by a separate systemd unit or by Docker’s own --restart unless-stopped policy so it comes back automatically.
Option 4: Manual / Background
For one-off testing on a remote host, start memanto serve in the background (for example with nohup) and stop it with pkill -f "memanto serve". This is not recommended for production; use systemd or Docker.
Endpoints to Probe
All deployment modes expose the same operational endpoints:

| Endpoint | Purpose | Notes |
|---|---|---|
| GET /health | Full health, including Moorcheh connectivity. | Returns 200 only when Moorcheh is reachable; use for readiness gating before sending traffic. |
| GET /ready | Lightweight check; always 200 once the process is up. | Use for liveness probes; does not depend on Moorcheh. |
| GET /live | Same as /ready. | Kept for Kubernetes idiom. |
| GET /docs | Swagger UI for the REST API. | |
| GET /redoc | ReDoc rendering of the OpenAPI spec. | |
| GET /ui | Web dashboard. | Available when running memanto ui or with MEMANTO_UI_MODE=true. |
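
To make the difference between /ready and /health concrete, here is a minimal sketch of a deploy-time probe written with only the Python standard library. The helper names probe and wait_until_healthy, the retry counts, and the timeouts are illustrative assumptions, not part of Memanto; only the two endpoint paths come from the table above.

```python
import time
import urllib.error
import urllib.request


def probe(base_url: str, path: str, timeout: float = 2.0) -> int:
    """Return the HTTP status of GET base_url + path, or 0 if unreachable."""
    url = base_url.rstrip("/") + path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with a non-2xx status (e.g. 503).
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def wait_until_healthy(base_url: str, attempts: int = 30, delay: float = 1.0) -> bool:
    """Poll until /ready (process up) and /health (Moorcheh reachable) both return 200."""
    for _ in range(attempts):
        if probe(base_url, "/ready") == 200 and probe(base_url, "/health") == 200:
            return True
        time.sleep(delay)
    return False
```

Calling wait_until_healthy("http://127.0.0.1:8000") from a deploy script would block until both endpoints answer 200 or the attempts run out. A /ready of 200 combined with a failing /health suggests the process is up but cannot reach Moorcheh.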
Performance & Concurrency
For more than a handful of concurrent agents, run Memanto with multiple uvicorn workers behind a reverse proxy:
- 2 workers per CPU core (Memanto is I/O-bound).
- Reverse proxy (Nginx, Caddy, or Traefik) terminating TLS and forwarding to 127.0.0.1:8000.
- Rate limits at the proxy if exposing publicly.
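
The sizing rule above can be sketched as a small helper that derives the worker count from the host's CPU count and builds a matching uvicorn command line. The function names are hypothetical; the application path and port come from the Dockerfile entry point, and --host, --port, and --workers are standard uvicorn options. Binding to 127.0.0.1 is a choice made here so that only the local reverse proxy can reach the server.

```python
import os
from typing import List, Optional


def worker_count(cpu_cores: Optional[int] = None, per_core: int = 2) -> int:
    """Workers = per_core x CPU cores, never fewer than one."""
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    return max(1, cores * per_core)


def uvicorn_command(cpu_cores: Optional[int] = None) -> List[str]:
    """Build the uvicorn argv, bound to loopback for use behind a proxy."""
    return [
        "uvicorn", "memanto.app.main:app",
        "--host", "127.0.0.1",
        "--port", "8000",
        "--workers", str(worker_count(cpu_cores)),
    ]
```

On a 4-core host, uvicorn_command(4) ends with --workers 8. The resulting list can be passed to subprocess.run or joined into the ExecStart line of a service definition.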
CORS
By default, ALLOWED_ORIGINS=*. Restrict it to the origins you trust in production.
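
A pre-deploy check along these lines could catch a wildcard that slipped into a production environment file. This is a sketch under assumptions: it treats ALLOWED_ORIGINS as a comma-separated list (the exact format Memanto accepts is not stated here), the function name is hypothetical, and the HTTPS-only rule is a policy chosen for the example rather than something Memanto enforces.

```python
from typing import List
from urllib.parse import urlsplit


def check_production_origins(raw: str) -> List[str]:
    """Parse an ALLOWED_ORIGINS value and reject unsafe entries.

    Assumes a comma-separated list of origins such as
    "https://app.example.com,https://admin.example.com".
    """
    origins = [item.strip() for item in raw.split(",") if item.strip()]
    if not origins:
        raise ValueError("ALLOWED_ORIGINS is empty")
    for origin in origins:
        if origin == "*":
            raise ValueError("wildcard origin is not allowed in production")
        parts = urlsplit(origin)
        # An origin is scheme + host (+ port), with no path, query, or fragment.
        if parts.scheme != "https" or not parts.netloc:
            raise ValueError(f"not an HTTPS origin: {origin}")
        if parts.path or parts.query or parts.fragment:
            raise ValueError(f"origin must not include a path: {origin}")
    return origins
```

Running the check against os.environ["ALLOWED_ORIGINS"] before starting the server would fail fast on the default value of *.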
Logs
Structured JSON logging is enabled by default. Memory operations are logged with content redaction so payloads never end up in your log aggregator. Set LOG_LEVEL=DEBUG for detailed request/response traces during troubleshooting.
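
Because the logs are JSON, they can be filtered line by line without regular expressions. The sketch below keeps records at or above a given severity. The field name level is an assumption about the log schema (pass a different level_key if your records use another name), and filter_logs is a hypothetical helper, not part of Memanto.

```python
import json
from typing import Dict, Iterable, Iterator

# Standard severity ordering, matching Python's logging levels.
_ORDER = {"DEBUG": 10, "INFO": 20, "WARNING": 30, "ERROR": 40, "CRITICAL": 50}


def filter_logs(
    lines: Iterable[str],
    min_level: str = "WARNING",
    level_key: str = "level",  # assumed field name; adjust to the real schema
) -> Iterator[Dict]:
    """Yield parsed JSON log records whose level is at least min_level."""
    threshold = _ORDER[min_level.upper()]
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines such as startup banners
        if not isinstance(record, dict):
            continue
        level = str(record.get(level_key, "")).upper()
        if _ORDER.get(level, 0) >= threshold:
            yield record
```

Feeding it sys.stdin lets it sit at the end of a pipe, for example after docker logs or journalctl output.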
Next Steps
- Kubernetes Deployment — manifests for a clustered install.
- Security & Operations — TLS, secrets management, hardening.
- Troubleshooting — common deployment failure modes.