CrewAI + MEMANTO

Give your CrewAI agents persistent, cross-session memory powered by MEMANTO. By default, CrewAI agents lose context when a crew run ends. With MEMANTO, agents can store facts, decisions, and preferences, then recall them in future runs.

How It Works

CrewAI Agent → MEMANTO Tools (remember / recall / answer) → MEMANTO Server → Moorcheh.ai
Each agent in your crew gets access to three tools: one to store memories, one to search them, and one to get a synthesized answer directly from memory. MEMANTO handles the semantic layer — no vector DB setup required.

Prerequisites

  • Python with pip
  • A Moorcheh API key, exported as MOORCHEH_API_KEY
  • A running MEMANTO server (started in Step 1 below)

Install

pip install memanto crewai crewai-tools httpx

Step 1: Start MEMANTO Server

memanto serve

Step 2: Create the Memory Tools

Create memanto_tools.py:
import os
import httpx
from crewai.tools import tool

MEMANTO_URL = "http://localhost:8000"
API_KEY = os.environ["MOORCHEH_API_KEY"]
AGENT_ID = "crewai-agent"

# Call once at startup — reuse the session token across your crew run
def activate_session() -> str:
    response = httpx.post(
        f"{MEMANTO_URL}/api/v2/agents/{AGENT_ID}/activate",
        headers={"Authorization": f"Bearer {API_KEY}"}
    )
    response.raise_for_status()
    return response.json()["session_token"]

SESSION_TOKEN = activate_session()

HEADERS = {
    "Authorization": f"Bearer {API_KEY}",
    "X-Session-Token": SESSION_TOKEN
}

@tool("Remember")
def remember(content: str) -> str:
    """Store a piece of information in long-term memory."""
    response = httpx.post(
        f"{MEMANTO_URL}/api/v2/agents/{AGENT_ID}/remember",
        params={"memory_type": "fact", "content": content},
        headers=HEADERS
    )
    response.raise_for_status()
    return f"Memory stored: {response.json()['memory_id']}"

@tool("Recall")
def recall(query: str) -> str:
    """Search long-term memory for relevant information."""
    response = httpx.get(
        f"{MEMANTO_URL}/api/v2/agents/{AGENT_ID}/recall",
        params={"query": query, "limit": 5},
        headers=HEADERS
    )
    response.raise_for_status()
    memories = response.json().get("memories", [])
    if not memories:
        return "No relevant memories found."
    return "\n".join(f"- {m['content']}" for m in memories)
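If you want to unit-test the tool layer without a running server, the list-formatting logic inside recall can be factored into a pure helper. This is a sketch: the helper name format_memories is illustrative, not part of MEMANTO or CrewAI.

```python
def format_memories(memories: list[dict]) -> str:
    """Render recalled memory items as a bulleted list.

    Mirrors the formatting used by the recall tool above, so the
    behavior can be tested with plain dicts instead of live API responses.
    """
    if not memories:
        return "No relevant memories found."
    return "\n".join(f"- {m['content']}" for m in memories)
```

The recall tool body then reduces to the HTTP call plus `return format_memories(response.json().get("memories", []))`.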

Step 3: Build Your Crew

Create crew.py:
from crewai import Agent, Task, Crew
from memanto_tools import remember, recall

# Research agent stores findings in memory
researcher = Agent(
    role="Research Analyst",
    goal="Gather and store key facts about the topic",
    backstory="You are thorough and always save important findings for future reference.",
    tools=[remember, recall],
    verbose=True
)

# Writer agent recalls stored findings to produce output
writer = Agent(
    role="Content Writer",
    goal="Write a report using previously stored research",
    backstory="You rely on stored research to write accurate, grounded content.",
    tools=[recall],
    verbose=True
)

research_task = Task(
    description="Research the latest trends in AI agents and store 5 key findings.",
    expected_output="Confirmation that 5 findings have been stored in memory.",
    agent=researcher
)

write_task = Task(
    description="Recall the stored AI agent findings and write a concise summary report.",
    expected_output="A 3-paragraph summary report grounded in recalled memory.",
    agent=writer
)

crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, write_task],
    verbose=True
)

result = crew.kickoff()
print(result)

Step 4: Run

export MOORCHEH_API_KEY=mk_your_api_key
python crew.py

Getting AI Answers from Memory

The answer tool uses MEMANTO’s built-in RAG to synthesize a response directly from stored memories — no extra LLM call needed. This is useful when an agent needs a clean, readable answer rather than a raw list of memory items. Add it to memanto_tools.py alongside remember and recall:
@tool("Answer")
def answer(question: str) -> str:
    """Get a synthesized answer from long-term memory using MEMANTO's built-in RAG.
    Use this when you need a direct, grounded answer rather than a raw list of memories.
    No external LLM call is made — MEMANTO answers using its native model.
    """
    response = httpx.post(
        f"{MEMANTO_URL}/api/v2/agents/{AGENT_ID}/answer",
        params={"question": question},
        headers=HEADERS
    )
    response.raise_for_status()
    return response.json().get("answer", "No answer found.")
Then give it to agents that need synthesized responses:
from crewai import Agent, Task, Crew
from memanto_tools import remember, recall, answer

analyst = Agent(
    role="Customer Analyst",
    goal="Answer questions about customers using stored memory",
    backstory="You have access to long-term memory about every customer.",
    tools=[remember, recall, answer],
    verbose=True
)

task = Task(
    description="What communication preferences does Alice have, and when is the best time to reach her?",
    expected_output="A clear, grounded answer based on stored customer memory.",
    agent=analyst
)

crew = Crew(agents=[analyst], tasks=[task], verbose=True)
result = crew.kickoff()
print(result)
When to use answer vs recall
  • Use recall when the agent needs raw memory items to reason over.
  • Use answer when the agent needs a ready-to-use response, such as for a final task output or a direct reply to a user.

Using Memory Types

Store richer context by specifying a memory type:
@tool("RememberWithType")
def remember_typed(content: str, memory_type: str = "fact") -> str:
    """Store information with a specific memory type.
    Types: fact, preference, decision, goal, commitment, event, error
    """
    response = httpx.post(
        f"{MEMANTO_URL}/api/v2/agents/{AGENT_ID}/remember",
        params={"memory_type": memory_type, "content": content},
        headers=HEADERS
    )
    response.raise_for_status()
    return f"Stored as {memory_type}: {response.json()['memory_id']}"
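Because the LLM chooses the memory_type argument at runtime, it may occasionally pass a value outside the documented set. A small guard keeps requests well-formed; this is a sketch, and falling back to "fact" is one possible policy, not documented MEMANTO behavior.

```python
# Types listed in the RememberWithType docstring above.
VALID_MEMORY_TYPES = {"fact", "preference", "decision", "goal", "commitment", "event", "error"}

def validate_memory_type(memory_type: str) -> str:
    """Return memory_type if it is a known type; otherwise fall back to 'fact'."""
    return memory_type if memory_type in VALID_MEMORY_TYPES else "fact"
```

Call `validate_memory_type(memory_type)` before building the request params in remember_typed.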

Persistent Memory Across Runs

Because memories live in MEMANTO (not in-process), they persist between separate crew runs:
# Run 1: researcher stores findings
crew.kickoff()

# Run 2 (next day): writer recalls those same findings
crew.kickoff()
No extra configuration needed — the agent ID ties memories together across runs.
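Since the agent ID is what scopes memories, one way to isolate memory per user or project is to derive the ID from an identifier instead of hard-coding "crewai-agent". This is a sketch under the assumption that agent IDs accept lowercase alphanumerics and hyphens; the helper name is illustrative.

```python
def make_agent_id(base: str, user_id: str) -> str:
    """Derive a per-user agent ID so each user gets an isolated memory space.

    Normalizes user_id to lowercase and strips characters other than
    alphanumerics and hyphens (an assumed ID format, not a MEMANTO rule).
    """
    safe = "".join(c for c in user_id.lower() if c.isalnum() or c == "-")
    return f"{base}-{safe}"
```

You would then set AGENT_ID = make_agent_id("crewai-agent", current_user) before activating a session in memanto_tools.py.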

Next Steps