LlamaIndex + MEMANTO
How It Works
MEMANTO is exposed as three FunctionTool instances (remember, recall, answer) that your LlamaIndex agent can call during reasoning. The agent decides when to store something, when to search raw memories, and when to get a synthesized answer directly from memory.
Prerequisites
- Python 3.8+
- Moorcheh API key
- MEMANTO server running locally
Install
Step 1: Start MEMANTO Server
Step 2: Create the Memory Tools
Create memanto_tools.py:
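A minimal sketch of what memanto_tools.py might contain, under stated assumptions. The InMemoryStore class is a hypothetical stand-in for a MEMANTO client, whose real API is not shown in this guide; replace its add and search methods with the actual client calls. The function names remember, recall, and answer follow the tool names used above, and FunctionTool.from_defaults is LlamaIndex's standard way to wrap a plain function as a tool.

```python
"""Sketch of memanto_tools.py.

InMemoryStore is a hypothetical stand-in for a MEMANTO client (not the
real API). Swap its methods for real client calls against your server.
"""
from typing import List


class InMemoryStore:
    """Hypothetical stand-in for a MEMANTO client."""

    def __init__(self) -> None:
        self._items: List[str] = []

    def add(self, text: str) -> None:
        self._items.append(text)

    def search(self, query: str, limit: int = 5) -> List[str]:
        # Naive keyword overlap; a real server would do semantic search.
        words = set(query.lower().split())
        hits = [t for t in self._items if words & set(t.lower().split())]
        return hits[:limit]


store = InMemoryStore()


def remember(text: str) -> str:
    """Store a fact or preference in long-term memory."""
    store.add(text)
    return f"Stored: {text}"


def recall(query: str) -> str:
    """Search memory and return the raw matching items, one per line."""
    hits = store.search(query)
    return "\n".join(hits) if hits else "No matching memories."


def answer(question: str) -> str:
    """Return a single direct answer drawn from memory."""
    # Stand-in only: returns the top hit. MEMANTO synthesizes server-side.
    hits = store.search(question, limit=1)
    return hits[0] if hits else "I don't know."


def build_tools():
    """Wrap the three functions as LlamaIndex FunctionTool instances."""
    # Imported lazily so the functions above work without llama-index.
    from llama_index.core.tools import FunctionTool

    remember_tool = FunctionTool.from_defaults(fn=remember)
    recall_tool = FunctionTool.from_defaults(fn=recall)
    answer_tool = FunctionTool.from_defaults(fn=answer)
    return [remember_tool, recall_tool, answer_tool]
```

The docstrings matter here: FunctionTool.from_defaults uses each function's name and docstring as the tool description, which is what the agent reads when deciding which tool to call.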
Step 3: Build the Agent
Create agent.py:
Step 4: Run
Getting Synthesized Answers from Memory
The answer_tool calls MEMANTO's built-in RAG: it synthesizes a direct response from stored memories using MEMANTO's native model, so no extra LLM tokens are consumed on your side.
When to use answer_tool vs recall_tool
- Use recall_tool when the agent needs to reason over multiple raw memory items.
- Use answer_tool when the agent (or user) needs a clean, direct response from memory.
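The difference between the two tools is mostly one of output shape, which the following sketch illustrates. Everything in it is an assumption made for illustration: MEMORIES is made-up sample data, and recall_raw and answer_direct are hypothetical local functions that mimic the two behaviors with simple keyword scoring. They are not MEMANTO's implementation, which performs retrieval and synthesis on the server.

```python
import re
from typing import List

# Made-up sample data standing in for stored memories.
MEMORIES = [
    "User's favorite language is Python",
    "User is learning Rust this year",
    "User's dog is named Biscuit",
]


def _score(query: str, text: str) -> int:
    """Count lowercase word tokens shared by query and text."""
    tokenize = lambda s: set(re.findall(r"[a-z0-9']+", s.lower()))
    return len(tokenize(query) & tokenize(text))


def recall_raw(query: str) -> List[str]:
    """recall-style: return every matching item for the agent to reason over."""
    scored = [(m, _score(query, m)) for m in MEMORIES]
    matches = [(m, s) for m, s in scored if s > 0]
    # Stable sort: ties keep their original storage order.
    return [m for m, _ in sorted(matches, key=lambda pair: -pair[1])]


def answer_direct(question: str) -> str:
    """answer-style: return one ready-to-use string."""
    hits = recall_raw(question)
    return hits[0] if hits else "No relevant memory found."


if __name__ == "__main__":
    # Several raw items: the agent still has to combine them itself.
    print(recall_raw("programming language Rust Python"))
    # One string: can be shown to the user as-is.
    print(answer_direct("What is the dog named?"))
```

With the recall-style call the agent receives a list and spends its own reasoning steps combining the items; with the answer-style call it receives a single string it can pass along unchanged.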