Generate AI Answer - MEMANTO Docs

curl -X POST "http://localhost:8000/api/v2/agents/my-agent/answer" \
  -H "X-Session-Token: your_session_token" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "How should we contact the user?",
    "limit": 5,
    "temperature": 0.7,
    "ai_model": "anthropic.claude-sonnet-4-6",
    "kiosk_mode": false
  }'

{
  "agent_id": "my-agent",
  "session_id": "sess_123abc",
  "question": "How should we contact the user?",
  "answer": "Based on stored preferences, the user prefers email communication. Send email during business hours.",
  "sources": [
    {
      "id": "3e681f12-a28c-4d1d-9632-b8dadf1f9d0c",
      "score": 0.86
    }
  ],
  "namespace": "memanto_agent_my-agent"
}
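
The response shown above can be read into typed objects on the client side. The following is a minimal sketch, not part of MEMANTO itself: the `Source` and `AnswerResponse` classes and the `parse_answer_response` helper are hypothetical names, and the field list is assumed from the sample response only.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Source:
    """One retrieved memory that supported the answer (fields from the sample)."""
    id: str
    score: float


@dataclass
class AnswerResponse:
    """Client-side view of the sample response; hypothetical, not an official type."""
    agent_id: str
    session_id: str
    question: str
    answer: str
    sources: List[Source]
    namespace: str


def parse_answer_response(raw: str) -> AnswerResponse:
    """Parse the JSON text of an answer response into an AnswerResponse."""
    data = json.loads(raw)
    # Treat a missing "sources" key as an empty list (an assumption of this sketch).
    sources = [
        Source(id=item["id"], score=float(item["score"]))
        for item in data.get("sources", [])
    ]
    return AnswerResponse(
        agent_id=data["agent_id"],
        session_id=data["session_id"],
        question=data["question"],
        answer=data["answer"],
        sources=sources,
        namespace=data["namespace"],
    )
```

Keeping the parsing in one function means a missing required key fails loudly with a `KeyError` rather than surfacing later as a `None` in application code.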

POST /api/v2/agents/{agent_id}/answer

Overview

Answer questions using stored agent memories. This operation retrieves relevant context from the agent’s Moorcheh namespace (scoped by the active session) and calls Moorcheh answer generation to produce a grounded reply.
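
As an illustration of the operation described above, here is a minimal Python sketch that builds the same request as the curl example using only the standard library. The helper names `build_answer_request` and `ask` are hypothetical, and `BASE_URL` is copied from the curl example and will differ per deployment.

```python
import json
import urllib.parse
import urllib.request

# Taken from the curl example; an assumption for any real deployment.
BASE_URL = "http://localhost:8000"


def build_answer_request(agent_id: str, session_token: str, question: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the answer endpoint."""
    # Percent-encode the agent ID so it is safe to place in the URL path.
    path = "/api/v2/agents/{}/answer".format(urllib.parse.quote(agent_id, safe=""))
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        method="POST",
        headers={
            "X-Session-Token": session_token,
            "Content-Type": "application/json",
        },
    )


def ask(agent_id: str, session_token: str, question: str, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON response body."""
    request = build_answer_request(agent_id, session_token, question)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending makes the URL, headers, and body easy to inspect or unit-test without a running server.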

Authentication

API clients do not send an API key or Authorization header. Each request carries the following headers:

X-Session-Token (string, required)
Session token returned by Activate Agent. It must match the agent_id in the path.

Content-Type (string, required)
Must be application/json.

Path Parameters

agent_id (string, required)
The unique identifier of the agent.

Body

question (string, required)
The question to answer using retrieved memories as context.

limit (integer, optional)
Maximum number of memories to use as context (top_k). Range 1–100. If omitted, the server default applies (see deployment configuration).

temperature (number, optional)
LLM temperature, 0.0–2.0. If omitted, the server default applies.

ai_model (string, optional)
Model identifier for answer generation. The field name is snake_case: ai_model. If omitted, the server default applies.

kiosk_mode (boolean, optional)
When true, relevance filtering uses a confidence threshold. When false (default), threshold is ignored and not sent to Moorcheh.

threshold (number, optional)
Confidence threshold (0.0–1.0). Only used when kiosk_mode is true. If kiosk_mode is true and threshold is omitted, the server uses 0.15.
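
The body rules above can be checked on the client before a request is sent. This is a minimal sketch under stated assumptions: `build_answer_body` is a hypothetical helper, the ranges are the ones listed above, and dropping `threshold` when `kiosk_mode` is false is a choice of this sketch (the server is documented to ignore it in that case anyway).

```python
from typing import Any, Dict, Optional


def build_answer_body(
    question: str,
    limit: Optional[int] = None,
    temperature: Optional[float] = None,
    ai_model: Optional[str] = None,
    kiosk_mode: bool = False,
    threshold: Optional[float] = None,
) -> Dict[str, Any]:
    """Validate the documented ranges and return a JSON-ready request body."""
    if not question:
        raise ValueError("question is required")
    if limit is not None and not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    if threshold is not None and not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")

    body: Dict[str, Any] = {"question": question}
    # Omitted optional fields are left out so the server defaults apply.
    if limit is not None:
        body["limit"] = limit
    if temperature is not None:
        body["temperature"] = temperature
    if ai_model is not None:
        body["ai_model"] = ai_model
    if kiosk_mode:
        body["kiosk_mode"] = True
        # threshold only has an effect in kiosk mode; if it is omitted here,
        # the server is documented to fall back to 0.15.
        if threshold is not None:
            body["threshold"] = threshold
    return body
```

For example, `build_answer_body("How should we contact the user?", limit=5, temperature=0.7)` yields a body with only those three fields, leaving the model and kiosk settings to the server defaults.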

Temperature Guide

0.0–0.5: Conservative, factual responses; best for technical documentation.
0.5–1.0: Balanced creativity; good for general Q&A.
1.0–2.0: More creative and varied responses; use carefully for factual content.
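
The guide above can be expressed as a small lookup, which may be useful for logging or for warning when a factual workload is run at a high temperature. This is only a sketch: `temperature_band` is a hypothetical helper, and because the listed ranges share their endpoints, assigning 0.5 and 1.0 to the lower band is an assumption made here.

```python
def temperature_band(temperature: float) -> str:
    """Return the guide band ("conservative", "balanced", "creative") for a value."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    # Shared endpoints (0.5 and 1.0) are assigned to the lower band by assumption.
    if temperature <= 0.5:
        return "conservative"
    if temperature <= 1.0:
        return "balanced"
    return "creative"
```

With this mapping, the 0.7 used in the example request falls in the "balanced" band.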

Available Models

Model ID	Name	Provider	Description	Credits
anthropic.claude-sonnet-4-6	Claude Sonnet 4.6	Anthropic	Fast flagship: coding, tools, long docs and RAG (~1M context).	3
anthropic.claude-opus-4-6-v1	Claude Opus 4.6	Anthropic	Deepest reasoning and hardest tasks; pick when quality matters most (~1M context).	3
meta.llama4-maverick-17b-instruct-v1:0	Llama 4 Maverick 17B	Meta	Long context, summarization, function calling, fine-tuning friendly.	3
amazon.nova-pro-v1:0	Amazon Nova Pro	Amazon	Chat, math, and structured answers for AWS-style workloads.	2
deepseek.r1-v1:0	DeepSeek R1	DeepSeek	Step-by-step reasoning; math, logic, and technical explanations.	1
deepseek.v3.2	DeepSeek V3.2	DeepSeek	Efficient general Q&A, multilingual, everyday RAG (~164K context).	2
openai.gpt-oss-120b-1:0	OpenAI GPT OSS 120B	OpenAI	Large generalist: research-style answers and long-form writing.	3
qwen.qwen3-32b-v1:0	Qwen 3 32B	Qwen	Code and bilingual (EN/ZH) tasks in a smaller footprint.	2
qwen.qwen3-next-80b-a3b	Qwen3 Next 80B A3B	Qwen	MoE model for long chats, docs, and code at scale (~256K context).	1
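
A client can mirror the table above to validate `ai_model` values or to choose a model by its listed Credits value. The sketch below copies the Model ID and Credits columns as they appear in the table; `MODEL_CREDITS` and `models_within_credits` are hypothetical names, and the server's actual model list may differ from this snapshot.

```python
from typing import Dict, List

# Model ID -> Credits, copied from the table above (a snapshot, not fetched from the server).
MODEL_CREDITS: Dict[str, int] = {
    "anthropic.claude-sonnet-4-6": 3,
    "anthropic.claude-opus-4-6-v1": 3,
    "meta.llama4-maverick-17b-instruct-v1:0": 3,
    "amazon.nova-pro-v1:0": 2,
    "deepseek.r1-v1:0": 1,
    "deepseek.v3.2": 2,
    "openai.gpt-oss-120b-1:0": 3,
    "qwen.qwen3-32b-v1:0": 2,
    "qwen.qwen3-next-80b-a3b": 1,
}


def models_within_credits(max_credits: int) -> List[str]:
    """Return the listed model IDs whose Credits value is at most max_credits, sorted by ID."""
    return sorted(
        model_id for model_id, credits in MODEL_CREDITS.items() if credits <= max_credits
    )


def is_listed_model(ai_model: str) -> bool:
    """Check whether ai_model appears in the table snapshot above."""
    return ai_model in MODEL_CREDITS
```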

Next Steps

Recall to fetch raw memories without generating an AI answer
Remember to add more context to the agent
