On-Prem Requirements

The on-prem stack runs Memanto plus the Moorcheh server (and optionally Ollama) in Docker on a single host. The `memanto` CLI automates the setup; you only need to make sure the prerequisites below are in place before you start.

Operating Systems

The on-prem stack is supported on:

Windows 10/11 with Docker Desktop (WSL2 backend)
macOS 12+ (Apple Silicon and Intel) with Docker Desktop
Linux (Ubuntu 20.04+, Debian 11+, RHEL 8+, Amazon Linux 2) with Docker Engine 20.10+

Hardware

Component	Minimum	Recommended	Notes
CPU	4 cores	8+ cores	More cores noticeably speed up Ollama inference.
RAM	8 GB	16 GB+	Ollama with `qwen2.5` needs ~6 GB resident; embeddings add ~1 GB. With OpenAI/Cohere as providers, 4–6 GB total is enough.
Disk	10 GB free	30 GB+ free	Ollama model images are 1–7 GB each. Moorcheh storage grows with your memory volume.
GPU	Not required	NVIDIA GPU with 8 GB+ VRAM	Optional — speeds up Ollama. CPU-only inference works on any modern machine.

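For deployments that use OpenAI or Cohere as the embedding/LLM provider, the hardware footprint is much smaller (no Ollama container needed). These providers require outbound internet access, so they are not an option for air-gapped hosts; a true air-gap setup uses Ollama.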
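As a rough pre-flight check against the table above, the following Python sketch compares the host's logical CPU count and free disk space with the minimum and recommended values. It is not part of Memanto; the function names are made up for illustration, RAM is left out because the standard library has no portable way to read it, and checking the home directory's volume is an assumption (Docker may keep its data on a different volume).

```python
import os
import shutil

# Thresholds copied from the hardware table: (minimum, recommended).
CPU_CORES = (4, 8)
DISK_FREE_GB = (10, 30)


def classify(value, minimum, recommended):
    """Return 'below minimum', 'minimum', or 'recommended' for a measured value."""
    if value < minimum:
        return "below minimum"
    if value < recommended:
        return "minimum"
    return "recommended"


def host_report(path=os.path.expanduser("~")):
    """Measure logical CPUs and free disk space on the volume that holds `path`."""
    cores = os.cpu_count() or 0  # os.cpu_count() may return None
    free_gb = shutil.disk_usage(path).free / 1024**3
    return {
        "cpu": (cores, classify(cores, *CPU_CORES)),
        "disk": (round(free_gb, 1), classify(free_gb, *DISK_FREE_GB)),
    }


if __name__ == "__main__":
    for name, (value, verdict) in host_report().items():
        print(f"{name}: {value} -> {verdict}")
```

Note that `os.cpu_count()` reports logical CPUs, which can be higher than the physical core count, and that with Docker Desktop the resources that matter are the ones allotted to the Docker VM rather than the host totals.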

Software

Required

Docker Engine 20.10+ or Docker Desktop 4.0+ with the daemon running.
- The Memanto onboarding wizard fails fast with a clear error if `docker info` does not succeed.
- Verify with:
  docker --version
  docker info
Python 3.10+ for the Memanto CLI itself.
- Verify with:
  python --version
The `memanto` Python package.
- Install with:
  pip install memanto
`moorcheh-client>=0.1.3`, the Python package that ships the `moorcheh up` command and exposes the on-prem SDK shape Memanto talks to. The onboarding wizard installs it automatically the first time you choose On-Prem at the prompt; you can also install it explicitly:
  pip install "moorcheh-client>=0.1.3"
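The Python-side requirements above can also be checked from a script. The sketch below is illustrative only and is not how the onboarding wizard does it: it reads installed versions through `importlib.metadata`, assumes the distribution names match the `pip install` names shown above, and uses a deliberately simple version parser (hypothetical helper `parse_version`) that only understands dotted numeric versions.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 10)
MIN_MOORCHEH_CLIENT = (0, 1, 3)


def parse_version(text):
    """Turn '0.1.3' into (0, 1, 3); stops at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def software_problems():
    """Collect human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python 3.10+ is required")
    if installed_version("memanto") is None:
        problems.append("memanto is not installed")
    client = installed_version("moorcheh-client")
    if client is None:
        problems.append("moorcheh-client is not installed")
    elif parse_version(client) < MIN_MOORCHEH_CLIENT:
        problems.append(f"moorcheh-client {client} is older than 0.1.3")
    return problems


if __name__ == "__main__":
    for line in software_problems() or ["Python-side requirements look fine."]:
        print(line)
```

Because the parser stops at the first non-numeric component, a pre-release such as `0.1.3rc1` is treated as older than `0.1.3`; a full PEP 440 comparison would need a dedicated version library.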

Optional

`uvicorn[standard]` if you plan to run `memanto serve` directly. It is installed automatically as a dependency of `memanto` in most cases.
NVIDIA Container Toolkit if you want Ollama to use a GPU inside Docker.

Network & Ports

Port	Bound to	Used by
8000	`localhost` (default)	Memanto’s REST API (`memanto serve`, `memanto ui`).
8080	`localhost` (default)	Moorcheh on-prem server, started by `moorcheh up`.
11434	inside the Docker network	Ollama, when used as the embedding/LLM provider. Started as a sibling container by `moorcheh up`.
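Since both host-side ports default to `localhost`, a quick way to spot a conflict before starting the stack is to test whether anything is already listening on them. The following is a minimal sketch using only the standard `socket` module; the helper name `port_in_use` is hypothetical, and port 11434 is skipped because, per the table, Ollama is reachable only inside the Docker network.

```python
import socket

# Host-side ports from the table above; 11434 stays inside the Docker network.
HOST_PORTS = {8000: "Memanto REST API", 8080: "Moorcheh on-prem server"}


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port, owner in HOST_PORTS.items():
        state = "in use" if port_in_use(port) else "free"
        print(f"{port} ({owner}): {state}")
```

Before onboarding, "in use" points to a conflict with another program; after the stack is up, the same result simply means the service is listening.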

The runtime path needs no inbound internet access. Outbound access is required only:

Once, to `pip install` `memanto` and `moorcheh-client`.
Once per Ollama model, to pull the image from the Ollama registry.
For every `answer.generate` call if your LLM provider is OpenAI or Cohere. Embedding requests likewise go out to the provider if your embedding provider is OpenAI or Cohere.

Provider Choices

You will be prompted to choose providers during onboarding. The choices and what they imply:

Provider	Embedding	LLM (Answer)	API key	Cost	Best for
Ollama	`nomic-embed-text`	`qwen2.5`	None	$0	True air-gap; local development; demos; cost-sensitive workloads.
OpenAI	`text-embedding-3-small`	`gpt-4o-mini`	Required	Per-token	High-quality embeddings; existing OpenAI relationships.
Cohere	`embed-english-v3.0`	`command-r-plus-08-2024`	Required	Per-token	High-quality long-context answers; multilingual embeddings.

You can mix providers — e.g., Ollama embeddings with an OpenAI LLM for answers. The onboarding wizard prompts for each independently.
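To make the implications of a mixed setup concrete, the sketch below encodes the provider table as a dictionary and derives, for a given embedding/LLM pair, which models are used, which providers need an API key, and whether the Ollama container is involved. The lowercase provider identifiers and the `plan` function are illustrative assumptions, not Memanto configuration values.

```python
# Default models and key requirements copied from the provider table.
PROVIDERS = {
    "ollama": {"embedding": "nomic-embed-text", "llm": "qwen2.5", "needs_key": False},
    "openai": {"embedding": "text-embedding-3-small", "llm": "gpt-4o-mini", "needs_key": True},
    "cohere": {"embedding": "embed-english-v3.0", "llm": "command-r-plus-08-2024", "needs_key": True},
}


def plan(embedding_provider, llm_provider):
    """Summarize what a provider pair implies for keys, containers, and air-gap use."""
    chosen = {embedding_provider, llm_provider}
    return {
        "embedding_model": PROVIDERS[embedding_provider]["embedding"],
        "llm_model": PROVIDERS[llm_provider]["llm"],
        "api_keys_needed": sorted(p for p in chosen if PROVIDERS[p]["needs_key"]),
        "runs_ollama_container": "ollama" in chosen,
        # Only an all-Ollama setup can run without internet at runtime.
        "air_gap_capable": chosen == {"ollama"},
    }


if __name__ == "__main__":
    # The mixed example from the text: Ollama embeddings with an OpenAI LLM.
    for key, value in plan("ollama", "openai").items():
        print(f"{key}: {value}")
```

For the mixed example, the plan needs one API key (OpenAI), still runs the Ollama container for embeddings, and is not air-gap capable because answers are generated by a hosted provider.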

Disk Layout

Once onboarding finishes, the on-prem stack uses these locations on the host:

Path	Owner	Purpose
`~/.memanto/`	Memanto CLI	Top-level config dir. The cloud backend stores everything here.
`~/.memanto/on-prem/`	Memanto CLI (on-prem only)	Isolated data dir: agents, sessions, registry, and `state.json` for the on-prem backend.
`~/.memanto/on-prem/state.json`	Memanto CLI	Source of truth for `url`, `embedding_provider`, `embedding_model`, `llm_provider`, `llm_model`.
`~/.moorcheh/config.json`	`moorcheh-client`	Embedding and LLM provider config consumed by the on-prem server.
`~/.moorcheh/uploads/`	`moorcheh-client`	Staging area for files uploaded via `memanto upload` — paths inside this dir are mapped into the container.

Cloud and on-prem state are deliberately kept in separate directories so you can switch backends without one polluting the other.
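If you want to inspect the on-prem state from a script, something like the following may work. It is a sketch under the assumption that `state.json` is a JSON object whose top-level keys are the five fields listed in the table; the actual file layout is not documented here and may differ, so treat the helper `read_state` as hypothetical.

```python
import json
from pathlib import Path

STATE_PATH = Path.home() / ".memanto" / "on-prem" / "state.json"
# Field names taken from the Disk Layout table; top-level placement is an assumption.
EXPECTED_KEYS = ("url", "embedding_provider", "embedding_model", "llm_provider", "llm_model")


def read_state(path=STATE_PATH):
    """Load state.json and return (state, missing_keys)."""
    state = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in EXPECTED_KEYS if key not in state]
    return state, missing
```

Calling `read_state()` with no arguments reads the default path and raises `FileNotFoundError` if on-prem onboarding has not been run yet. The sketch only reads the file; since `state.json` is described as the source of truth, it is safer to change settings through the CLI than to edit the file by hand.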

Verifying Prerequisites

Before running the on-prem setup, the wizard performs these checks for you:

`docker` is on `PATH`.
`docker info` returns successfully (daemon is up).
`moorcheh-client>=0.1.3` is importable (the wizard installs it if not).
Provider API keys (if you chose OpenAI or Cohere) are non-empty.

If any check fails, the wizard prints a one-line error with a hint and exits with a non-zero status — no partial state is left behind.
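The first two checks are easy to reproduce by hand when debugging a failed onboarding run. The sketch below is not the wizard's code; it only mirrors the described behavior (a one-line error with a hint, and a non-zero status) using `shutil.which` and `subprocess.run`, with error wording invented for illustration.

```python
import shutil
import subprocess
import sys


def docker_problem(timeout=30):
    """Return a one-line error string, or None if Docker looks usable."""
    if shutil.which("docker") is None:
        return "docker not found on PATH. Hint: install Docker Engine or Docker Desktop."
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return "docker info did not complete. Hint: check that the daemon is running."
    if result.returncode != 0:
        return "docker info failed. Hint: start the Docker daemon and retry."
    return None


def main():
    """Print the first problem to stderr and return a shell-style exit status."""
    problem = docker_problem()
    if problem:
        print(problem, file=sys.stderr)
        return 1
    print("Docker prerequisites look fine.")
    return 0
```

Calling `sys.exit(main())` from a script's entry point propagates the non-zero status to the shell, which is convenient in provisioning scripts that should stop before onboarding when Docker is not ready.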

Next Step

→ On-Prem Quickstart — install the stack end-to-end in 5–10 minutes.
