AI Generation
Generate AI Answer
Generate an answer grounded in memory by querying the agent’s context and synthesizing an LLM response.
POST
Overview
Answer questions using stored agent memories. This operation retrieves relevant context from the agent’s Moorcheh namespace (scoped by the active session) and calls Moorcheh answer generation to produce a grounded reply.
Authentication
API clients do not send an API key or Authorization header.
Session token from Activate Agent. Must match agent_id.
Must be application/json.
Path Parameters
The unique identifier of the agent.
Body
The question to answer using retrieved memories as context.
Maximum memories to use as context (top_k). Range 1–100. If omitted, the server default applies (see deployment configuration).
LLM temperature, 0.0–2.0. If omitted, the server default applies.
Model identifier for answer generation (snake_case field name: ai_model). If omitted, the server default applies.
When kiosk_mode is true, relevance filtering uses a confidence threshold. When false (default), threshold is ignored and not sent to Moorcheh.
Confidence threshold (0.0–1.0). Only used when kiosk_mode is true. If kiosk_mode is true and threshold is omitted, the server uses 0.15.
Temperature Guide
- 0.0–0.5: Conservative, factual responses; best for technical documentation
- 0.5–1.0: Balanced creativity; good for general Q&A
- 1.0–2.0: More creative and varied responses; use carefully for factual content
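The body fields and ranges above can be checked on the client before a request is sent. The following is a minimal sketch, not the official client: the helper name `build_answer_payload` is hypothetical, and the JSON keys `query` (for the question) and `temperature` are assumptions, since this page does not show those field names. The keys `top_k`, `ai_model`, `kiosk_mode`, and `threshold` are taken from the descriptions above.

```python
from typing import Any, Dict, Optional


def build_answer_payload(
    question: str,
    top_k: Optional[int] = None,
    temperature: Optional[float] = None,
    ai_model: Optional[str] = None,
    kiosk_mode: bool = False,
    threshold: Optional[float] = None,
    question_field: str = "query",  # ASSUMPTION: real field name is not shown on this page
) -> Dict[str, Any]:
    """Build a request body, omitting unset fields so server defaults apply."""
    if not question.strip():
        raise ValueError("question must be a non-empty string")
    payload: Dict[str, Any] = {question_field: question}

    if top_k is not None:
        if not 1 <= top_k <= 100:
            raise ValueError("top_k must be in the range 1-100")
        payload["top_k"] = top_k

    if temperature is not None:
        if not 0.0 <= temperature <= 2.0:
            raise ValueError("temperature must be in the range 0.0-2.0")
        payload["temperature"] = temperature  # ASSUMPTION: key name "temperature"

    if ai_model is not None:
        payload["ai_model"] = ai_model

    if kiosk_mode:
        payload["kiosk_mode"] = True
        # threshold only matters in kiosk mode; if omitted, the server uses 0.15.
        if threshold is not None:
            if not 0.0 <= threshold <= 1.0:
                raise ValueError("threshold must be in the range 0.0-1.0")
            payload["threshold"] = threshold
    # When kiosk_mode is False the threshold is ignored, so it is not sent.

    return payload
```

For a factual lookup, a call such as `build_answer_payload("What changed?", top_k=5, temperature=0.2)` stays in the conservative temperature band and leaves the model choice to the server default.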
Available Models
| Model ID | Name | Provider | Description | Credits |
|---|---|---|---|---|
| anthropic.claude-sonnet-4-6 | Claude Sonnet 4.6 | Anthropic | Fast flagship: coding, tools, long docs and RAG (~1M context). | 3 |
| anthropic.claude-opus-4-6-v1 | Claude Opus 4.6 | Anthropic | Deepest reasoning and hardest tasks; pick when quality matters most (~1M context). | 3 |
| meta.llama4-maverick-17b-instruct-v1:0 | Llama 4 Maverick 17B | Meta | Long context, summarization, function calling, fine-tuning friendly. | 3 |
| amazon.nova-pro-v1:0 | Amazon Nova Pro | Amazon | Chat, math, and structured answers for AWS-style workloads. | 2 |
| deepseek.r1-v1:0 | DeepSeek R1 | DeepSeek | Step-by-step reasoning; math, logic, and technical explanations. | 1 |
| deepseek.v3.2 | DeepSeek V3.2 | DeepSeek | Efficient general Q&A, multilingual, everyday RAG (~164K context). | 2 |
| openai.gpt-oss-120b-1:0 | OpenAI GPT OSS 120B | OpenAI | Large generalist: research-style answers and long-form writing. | 3 |
| qwen.qwen3-32b-v1:0 | Qwen 3 32B | Qwen | Code and bilingual (EN/ZH) tasks in a smaller footprint. | 2 |
| qwen.qwen3-next-80b-a3b | Qwen3 Next 80B A3B | Qwen | MoE model for long chats, docs, and code at scale (~256K context). | 1 |
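As a sketch of how the table above might be used client-side, the snippet below copies the Model ID and Credits columns into a lookup and filters by a credit ceiling. The helper names are hypothetical, and the code assumes only that a lower Credits value means a cheaper call; this page does not state how credits are billed.

```python
from typing import Dict, List

# Credits per model ID, copied from the Available Models table.
MODEL_CREDITS: Dict[str, int] = {
    "anthropic.claude-sonnet-4-6": 3,
    "anthropic.claude-opus-4-6-v1": 3,
    "meta.llama4-maverick-17b-instruct-v1:0": 3,
    "amazon.nova-pro-v1:0": 2,
    "deepseek.r1-v1:0": 1,
    "deepseek.v3.2": 2,
    "openai.gpt-oss-120b-1:0": 3,
    "qwen.qwen3-32b-v1:0": 2,
    "qwen.qwen3-next-80b-a3b": 1,
}


def models_within_budget(max_credits: int) -> List[str]:
    """Return model IDs whose listed credit cost is at most max_credits, sorted by ID."""
    return sorted(m for m, cost in MODEL_CREDITS.items() if cost <= max_credits)


def validate_model(ai_model: str) -> str:
    """Reject IDs that are not in the table before they are sent as ai_model."""
    if ai_model not in MODEL_CREDITS:
        raise ValueError(f"unknown ai_model: {ai_model!r}")
    return ai_model
```

Checking the ID locally catches typos early, but the server remains the source of truth: the deployed model list may differ from this snapshot of the table.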