Two helpers built during the beast deployment to migrate the legacy
Neo4j knowledge graph (decommissioned in the v3 cutover) into mem0 v2
as natural-language memories.
scripts/import_neo4j_to_mem0.py
- Connects to Neo4j via Bolt, iterates per-user relationships,
  POSTs each as a /memories request.
- Two modes:
    raw:           "humanize(src) verb humanize(dest)." (snake_case → spaces)
    --llm-rewrite: minimax-m2 via OpenAI-compat proxy rewrites each
                   tuple into a grammatical English sentence; the LLM
                   may also output SKIP for non-meaningful tuples
                   (postal codes, timezone offsets, self-refs).
- Tags every imported memory with metadata.source="neo4j_legacy_import"
  plus neo4j_rel_type + import_timestamp for traceability/cleanup.
- Caches LLM rewrites by (source, rel, dest, user_id).
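
A minimal sketch of the raw mode, the metadata tags, and the rewrite cache
described above. The function names, the choice to derive the verb from the
lowercased relationship type, and the rewrite callable's signature are
assumptions for illustration, not the script's actual code.

```python
from datetime import datetime, timezone

IMPORT_SOURCE = "neo4j_legacy_import"


def humanize(name: str) -> str:
    """Turn a snake_case identifier into space-separated words."""
    return name.replace("_", " ").strip()


def raw_sentence(src: str, rel: str, dest: str) -> str:
    # Assumption: the verb is the relationship type, humanized and lowercased.
    return f"{humanize(src)} {humanize(rel).lower()} {humanize(dest)}."


def import_metadata(rel: str) -> dict:
    """Tags that make imported memories traceable and easy to clean up."""
    return {
        "source": IMPORT_SOURCE,
        "neo4j_rel_type": rel,
        "import_timestamp": datetime.now(timezone.utc).isoformat(),
    }


def rewrite_cached(cache: dict, key: tuple, rewrite) -> "str | None":
    """key is (source, rel, dest, user_id); returns None when the LLM says SKIP."""
    if key not in cache:
        # Hypothetical callable standing in for the LLM proxy call.
        cache[key] = rewrite(*key[:3])
    text = cache[key]
    return None if text.strip().upper() == "SKIP" else text
```

Keying the cache on user_id as well as the tuple means a re-run for the same
user makes no repeat LLM calls, while identical tuples for different users are
still rewritten separately.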
scripts/cleanup_neo4j_imports.py
- Finds and DELETEs all memories with source="neo4j_legacy_import"
  for given users, via the /memories DELETE endpoint (per-user API
  key, so the deletes go through mem0's normal auth + cleanup path).
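
A rough sketch of the selection step in the cleanup script. The record shape
(a dict with "id" and "metadata") and the delete callable are assumptions; the
real script issues DELETE requests with each user's API key.

```python
IMPORT_SOURCE = "neo4j_legacy_import"


def select_import_ids(memories: list) -> list:
    """Return IDs of memories tagged as coming from the Neo4j import."""
    return [
        m["id"]
        for m in memories
        if (m.get("metadata") or {}).get("source") == IMPORT_SOURCE
    ]


def cleanup_user(memories: list, delete) -> int:
    """Delete every imported memory via the supplied callable; return the count.

    `delete` is a hypothetical stand-in for the per-user DELETE request.
    """
    ids = select_import_ids(memories)
    for memory_id in ids:
        delete(memory_id)
    return len(ids)
```

Memories with no metadata, or with a different source tag, are left untouched.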
Run on beast (2026-05-23): 2007 Neo4j edges → 615 net new memories in
mem0_v3 (30.6% yield after LLM SKIPs + mem0 fact-extraction dedup).
mem0's fact extractor correctly deduplicated edges that restated
facts already in vector memory (e.g., manju's 9 existing memories
absorbed all 17 of her Neo4j edges).
Backend startup needs ~30-60s (spaCy NLP models load, mem0 v2 init,
MCP session manager, 4 workers). The Dockerfile's 5s start-period was
too short, causing willfarrell/autoheal (running on the host with
AUTOHEAL_CONTAINER_LABEL=all) to kill the container before it finished
booting. Overriding the healthcheck in compose with a longer start_period
keeps failures from counting until the app is actually ready.
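
A simplified model of why the short start period led to restarts. It assumes
Docker's documented behaviour: probe failures inside the start period do not
count toward retries until a probe has passed once. The probe interval (10s)
and retry count (3) in the usage note are illustrative assumptions, not values
taken from the Dockerfile.

```python
def health_status(probes, start_period: float, retries: int) -> str:
    """probes: time-ordered list of (seconds_since_start, passed)."""
    consecutive_failures = 0
    started = False  # True once any probe has passed
    for t, passed in probes:
        if passed:
            started = True
            consecutive_failures = 0
            continue
        if not started and t < start_period:
            continue  # grace period: this failure is not counted
        consecutive_failures += 1
        if consecutive_failures >= retries:
            return "unhealthy"  # autoheal would restart the container here
    return "healthy" if started else "starting"


# App becomes ready at ~45s; probes run every 10s.
PROBES = [(10, False), (20, False), (30, False), (40, False), (50, True)]
```

With these probes, health_status(PROBES, start_period=5, retries=3) reports
"unhealthy" at the 30s probe, while start_period=90 ignores the early failures
and reports "healthy" once the 50s probe passes.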
Existing prod storage was written by a newer Qdrant (qdrant/qdrant:latest
resolved to 1.18.1 on 2026-05-23). Pinning to 1.12.4 caused a shard-holder
deserialization panic on startup because Qdrant's storage format is
backward-compatible (a newer binary reads older storage) but not
forward-compatible (an older binary cannot read newer storage).
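
A small pre-flight check along these lines could catch a bad pin before the
container starts. This only encodes the rule stated above (the binary must be
at least as new as whatever wrote the storage); the helper names are
hypothetical, and Qdrant's real upgrade policy may impose further constraints.

```python
def parse_version(version: str) -> tuple:
    """'1.18.1' -> (1, 18, 1)"""
    return tuple(int(part) for part in version.split("."))


def can_open_storage(binary_version: str, storage_written_by: str) -> bool:
    # Newer binaries can read older storage; older binaries cannot read newer.
    return parse_version(binary_version) >= parse_version(storage_written_by)
```

Comparing tuples of integers avoids the string-comparison trap where "1.9.0"
would sort after "1.18.1".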
The custom OpenAI-compatible endpoint (LiteLLM) serves the same
qwen3-embedding model and is reachable from the container in all
deployments; direct Ollama may not be. Vectors stay compatible because
the underlying model is the same.
Captured from a beast production hotfix.
Pin mem0ai[nlp]==2.0.2 and fastembed for the new hybrid-search pipeline.
Drop OSS graph memory (removed upstream in 2.0.0, PR #4805): remove Neo4j
service, env vars, volumes, and driver deps; mark /graph/relationships
deprecated. Rewrite Memory.search/get_all/chat/health call sites to use
the v2 filters={} + top_k API (entity IDs at top level now raise
ValueError). Tighten MCP remove_memory ownership check to O(1)
verify_memory_ownership so it doesn't silently truncate at the new
top_k=20 default. Downgrade base image to python:3.12-slim for spaCy.
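
To illustrate the ownership fix, here is a sketch comparing a listing-based
check with a direct lookup. FakeStore and owns_by_listing are hypothetical
stand-ins; only the name verify_memory_ownership and the top_k=20 default come
from the change itself, and its body here is an assumption.

```python
class FakeStore:
    """In-memory stand-in for the memory backend (illustration only)."""

    def __init__(self, records):
        self._by_id = {r["id"]: r for r in records}

    def get(self, memory_id):
        return self._by_id.get(memory_id)

    def get_all(self, user_id, top_k=20):
        owned = [r for r in self._by_id.values() if r["user_id"] == user_id]
        return owned[:top_k]  # results past top_k are silently dropped


def owns_by_listing(store, memory_id, user_id) -> bool:
    # Old approach: list the user's memories and scan for the ID.
    return any(r["id"] == memory_id for r in store.get_all(user_id))


def verify_memory_ownership(store, memory_id, user_id) -> bool:
    # O(1): fetch the single record by ID and compare its owner.
    record = store.get(memory_id)
    return record is not None and record.get("user_id") == user_id
```

For a user with 25 memories, the listing-based check wrongly rejects the 21st
through 25th, while the direct lookup accepts them and still rejects other
users' IDs.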
Adds scripts/migrate_qdrant_to_v3.py (scroll+upsert with per-user count
parity check) and docs/MIGRATION_RUNBOOK.md covering snapshot, dump,
collection rebuild, cutover, and rollback procedures.
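
The per-user count parity check can be sketched as follows. The point shape
(a dict with a "payload" carrying "user_id") is an assumption about how the
scrolled records look, not the migration script's actual code.

```python
from collections import Counter


def count_by_user(points) -> Counter:
    """Count points per user_id found in each point's payload."""
    return Counter(p["payload"]["user_id"] for p in points)


def parity_mismatches(source_points, dest_points) -> dict:
    """Return {user_id: (source_count, dest_count)} for users whose counts differ."""
    src = count_by_user(source_points)
    dst = count_by_user(dest_points)
    return {
        user: (src[user], dst[user])
        for user in sorted(set(src) | set(dst))
        if src[user] != dst[user]
    }
```

An empty result means every user has the same number of points in both
collections; any entry identifies a user to re-migrate before cutover.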
- Make Ollama URL configurable via OLLAMA_BASE_URL env var
- Add version: v1.1 to Mem0 config (required for latest features)
- Make embedding model and dimensions configurable
- Fix ownership check: O(1) lookup instead of fetching 10k records
- Add tenacity retry logic for database operations
- Add auth to /models and /users endpoints
- Add rate limiting to all endpoints (10-120/min based on operation type)
- Fix 11 info disclosure issues (detail=str(e) -> generic message)
- Fix 2 silent except blocks with proper logging
- Fix 7 raise e -> raise so re-raised exceptions keep their original traceback
- Fix health check to not expose exception details
- Update tests with X-API-Key headers and security tests
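
The info-disclosure and re-raise fixes follow a pattern that can be sketched
without any web framework. ApiError is a hypothetical stand-in for the
framework's HTTP exception, and run_operation is an illustrative wrapper, not
a function from the codebase.

```python
import logging

logger = logging.getLogger("mem0_server")


class ApiError(Exception):
    """Hypothetical stand-in for an HTTP exception with a client-visible detail."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def run_operation(operation):
    try:
        return operation()
    except ApiError:
        # Bare raise re-raises the active exception with its traceback intact.
        raise
    except Exception:
        # Full details go to the server log only, never to the client.
        logger.exception("operation failed")
        raise ApiError(500, "Internal server error")
```

The client sees only the generic message, while logger.exception records the
original exception and traceback for debugging.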
Exposes memory operations as MCP tools over the /mcp endpoint:
- add_memory, search_memory, remove_memory, chat
- API key auth via x-api-key or Authorization header
- User isolation enforced via contextvars
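
A minimal sketch of user isolation through contextvars, under stated
assumptions: the key-to-user mapping, the Bearer scheme for the Authorization
header, and all function names are illustrative rather than taken from the
server code.

```python
import contextvars

# Holds the authenticated user for the current request context.
current_user_id = contextvars.ContextVar("current_user_id", default=None)


def authenticate(headers: dict, api_keys: dict) -> str:
    """Resolve a user from x-api-key or an Authorization: Bearer header."""
    key = headers.get("x-api-key")
    if key is None:
        auth = headers.get("authorization", "")
        if auth.lower().startswith("bearer "):
            key = auth[len("bearer "):].strip()
    user = api_keys.get(key)
    if user is None:
        raise PermissionError("invalid API key")
    return user


def handle_tool_call(headers: dict, api_keys: dict, tool):
    user = authenticate(headers, api_keys)
    token = current_user_id.set(user)
    try:
        return tool()
    finally:
        current_user_id.reset(token)  # never leak the user into later calls


def search_memory_tool() -> str:
    user = current_user_id.get()
    if user is None:
        raise PermissionError("no authenticated user in context")
    return f"searching as {user}"
```

Because the tool reads the user from the context variable rather than from
its arguments, a caller cannot pass another user's ID to reach their memories.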
Security Enhancement:
- Remove external port exposure for PostgreSQL and Neo4j databases
- Replace 'ports' with 'expose' for internal-only database access
- Maintain full internal connectivity while eliminating external attack vectors
- Follow container security best practices
Benchmarking Framework:
- Add agent1.md: Professional Manager persona testing protocol
- Add agent2.md: Creative Researcher persona testing protocol
- Add benchmark1.md: Baseline test results and analysis
Benchmark Results Summary:
- Core engine quality: 4.9/5 average across both agent personas
- Memory intelligence: Exceptional context retention and relationship inference
- Automatic relationship generation: 50+ meaningful connections from minimal inputs
- Multi-project context management: Seamless switching with persistent context
- Cross-domain synthesis: AI-native capabilities for knowledge work enhancement
Key Findings:
- Core memory technology provides strong competitive moat
- Memory-enhanced conversations unique in market
- Ready for frontend wrapper development
- Establishes quality baseline for future model comparisons
Future Use: Framework enables systematic comparison across different
LLM endpoints, models, and configurations using identical test protocols.