4 commits

Author SHA1 Message Date
Pratik Narola
fd109ef892 fix: backup_qdrant.sh uses Qdrant's snapshot path (/qdrant/snapshots/)
The Qdrant snapshots directory is /qdrant/snapshots/{collection}/, not
/qdrant/storage/collections/{collection}/snapshots/ as the script assumed.
Verified against running mem0-qdrant container on beast.
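The corrected layout can be sketched as a small path builder. This is only an illustration in Python, not the contents of backup_qdrant.sh; the helper name `snapshot_path` and the example file name are hypothetical, and only the `/qdrant/snapshots/{collection}/` root comes from the message above.

```python
from pathlib import PurePosixPath

# Root taken from the commit message; the helper itself is hypothetical.
SNAPSHOT_ROOT = PurePosixPath("/qdrant/snapshots")


def snapshot_path(collection: str, snapshot_name: str) -> str:
    """Return the in-container path of a snapshot file for one collection."""
    for part in (collection, snapshot_name):
        # Reject empty names and path separators so the result stays under the root.
        if not part or "/" in part or part in (".", ".."):
            raise ValueError(f"invalid path component: {part!r}")
    return str(SNAPSHOT_ROOT / collection / snapshot_name)
```

A backup script would pass the resulting path to its copy step (for example `docker cp`) instead of a path under `/qdrant/storage/collections/`.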
2026-05-23 19:56:38 +05:30
Pratik Narola
06875473b2 feat: enable reranker, automate backups, tune extraction prompt
Three small-effort (S) wins from the post-migration audit:

#1 Enable Cohere reranker on both Memory.search call sites
   (rerank=True), over-fetch top_k=max(limit*3, 30) to give the
   reranker a 30-50 candidate pool, then truncate to the caller's
   limit. Bump reranker config to rerank-v3.5 (4096 ctx, multilingual
   — matters for Hindi/Hinglish traffic) and top_n 10 → 50 so the
   output cap doesn't truncate below typical over-fetch sizes. Cohere
   was configured but never invoked; this is the single biggest
   quality lift the audit surfaced.

#2 Add scripts/backup_qdrant.sh and scripts/restore_test.sh. Daily
   snapshot of both collections back-to-back, docker cp to local
   YYYY-MM-DD dir, optional rclone off-host, prune local >14d, emit
   Prometheus textfile metric. Weekly restore_test.sh restores into a
   transient collection and asserts point count parity. Closes the
   zero-automated-backup gap.

#3 Add CUSTOM_FACT_EXTRACTION_INSTRUCTIONS, wired via MemoryConfig's
   custom_instructions field. mem0 appends this as its own
   '## Custom Instructions' section in the additive-extraction user
   prompt (verified against generate_additive_extraction_prompt) —
   does not replace mem0's role/format guidance. Re-prioritizes the
   default consumer-organizer few-shots toward work/projects/
   relationships/recurring context, the actual usage pattern here.
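The over-fetch-then-truncate flow from item #1 can be sketched as follows. This is a minimal illustration, not the code at the actual Memory.search call sites: `fetch` and `rerank` are hypothetical stand-ins for the vector search and the Cohere reranker, and only the `max(limit*3, 30)` formula and the final truncation come from the message.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def overfetch_top_k(limit: int) -> int:
    """Candidate pool size: three times the caller's limit, but never below 30."""
    return max(limit * 3, 30)


def search_with_rerank(
    fetch: Callable[[int], Sequence[T]],        # hypothetical: returns top_k candidates
    rerank: Callable[[Sequence[T]], Sequence[T]],  # hypothetical: returns best-first order
    limit: int,
) -> List[T]:
    """Over-fetch, rerank the whole pool, then truncate to the caller's limit."""
    candidates = fetch(overfetch_top_k(limit))
    return list(rerank(candidates))[:limit]
```

With this formula a caller limit of 10 yields a pool of 30 and a limit of 15 yields 45, which is why a reranker output cap of 10 would have cut results below the pool size.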
2026-05-23 19:53:59 +05:30
Pratik Narola
3a10b72051 scripts: Neo4j → mem0 v2 graph relationship import
Two helpers built during the beast deployment to migrate the legacy
Neo4j knowledge graph (decommissioned in the v3 cutover) into mem0 v2
as natural-language memories.

scripts/import_neo4j_to_mem0.py
  - Connects to Neo4j via Bolt, iterates per-user relationships,
    POSTs each as a /memories request.
  - Two modes:
      raw:           "humanize(src) verb humanize(dest)." (snake_case → spaces)
      --llm-rewrite: minimax-m2 via OpenAI-compat proxy rewrites each
                     tuple into a grammatical English sentence; the LLM
                     may also output SKIP for non-meaningful tuples
                     (postal codes, timezone offsets, self-refs).
  - Tags every imported memory with metadata.source="neo4j_legacy_import"
    plus neo4j_rel_type + import_timestamp for traceability/cleanup.
  - Caches LLM rewrites by (source, rel, dest, user_id).
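The raw mode and the rewrite cache described above can be sketched like this. It is an assumption-laden illustration, not the script itself: the message only says snake_case becomes spaces, so deriving the verb by lowercasing the relationship type is a guess, and `raw_sentence` and `cached_rewrite` are hypothetical names.

```python
from typing import Callable, Dict, Tuple

CacheKey = Tuple[str, str, str, str]  # (source, rel, dest, user_id), as in the commit message


def humanize(name: str) -> str:
    """Turn a snake_case identifier into space-separated words."""
    return name.replace("_", " ").strip()


def raw_sentence(src: str, rel: str, dest: str) -> str:
    """Raw mode: "humanize(src) verb humanize(dest)."

    Assumption: the verb is the relationship type, humanized and lowercased.
    """
    verb = humanize(rel).lower()
    return f"{humanize(src)} {verb} {humanize(dest)}."


def cached_rewrite(
    cache: Dict[CacheKey, str],
    key: CacheKey,
    rewrite: Callable[[str, str, str], str],  # hypothetical stand-in for the LLM call
) -> str:
    """Call the rewriter only once per (source, rel, dest, user_id) key."""
    if key not in cache:
        src, rel, dest, _user_id = key
        cache[key] = rewrite(src, rel, dest)
    return cache[key]
```

Including user_id in the key keeps one user's rewrite from being reused for another user's identical tuple.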

scripts/cleanup_neo4j_imports.py
  - Finds and DELETEs all memories with source="neo4j_legacy_import"
    for given users, via the /memories DELETE endpoint (per-user API
    key, so the deletes go through mem0's normal auth + cleanup path).

Run on beast (2026-05-23): 2007 Neo4j edges → 615 net new memories in
mem0_v3 (30.6% yield after LLM SKIPs + mem0 fact-extraction dedup).
The V3 pipeline's fact extractor correctly deduplicated edges that restated
facts already in vector memory (e.g., manju's 9 existing memories
absorbed all 17 of her Neo4j edges).
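The yield figure above is plain arithmetic on the two counts from the run; a minimal check, with `yield_percent` as a hypothetical helper name:

```python
def yield_percent(net_new: int, total_edges: int) -> float:
    """Share of source edges that ended up as net new memories, in percent."""
    if total_edges <= 0:
        raise ValueError("total_edges must be positive")
    return round(100 * net_new / total_edges, 1)


# Counts reported for the beast run: 2007 Neo4j edges, 615 net new memories.
BEAST_YIELD = yield_percent(615, 2007)
```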
2026-05-23 18:27:00 +05:30
Pratik Narola
0f0addb36b chore: migrate to mem0ai v2.0.2 (V3 memory pipeline)
Pin mem0ai[nlp]==2.0.2 and fastembed for the new hybrid-search pipeline.
Drop OSS graph memory (removed upstream in 2.0.0, PR #4805): remove Neo4j
service, env vars, volumes, and driver deps; mark /graph/relationships
deprecated. Rewrite Memory.search/get_all/chat/health call sites to use
the v2 filters={} + top_k API (entity IDs at top level now raise
ValueError). Tighten MCP remove_memory ownership check to O(1)
verify_memory_ownership so it doesn't silently truncate at the new
top_k=20 default. Downgrade base image to python:3.12-slim for spaCy.

Adds scripts/migrate_qdrant_to_v3.py (scroll+upsert with per-user count
parity check) and docs/MIGRATION_RUNBOOK.md covering snapshot, dump,
collection rebuild, cutover, and rollback procedures.
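A per-user count parity check of the kind mentioned for migrate_qdrant_to_v3.py might look like the sketch below. It is not the script's code: it assumes each scrolled point is a dict whose payload carries a `user_id` field, and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping, Tuple


def per_user_counts(points: Iterable[Mapping]) -> Counter:
    """Count points per user; points without a user_id are grouped under '<missing>'."""
    return Counter(p.get("payload", {}).get("user_id", "<missing>") for p in points)


def parity_mismatches(
    source_points: Iterable[Mapping],
    target_points: Iterable[Mapping],
) -> Dict[str, Tuple[int, int]]:
    """Return {user_id: (source_count, target_count)} for every user whose counts differ."""
    src = per_user_counts(source_points)
    dst = per_user_counts(target_points)
    # Counter returns 0 for absent keys, so users missing on one side are reported too.
    return {u: (src[u], dst[u]) for u in src.keys() | dst.keys() if src[u] != dst[u]}
```

An empty result means every user has the same number of points in both collections; any entry is a reason to stop before cutover.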
2026-05-23 14:49:45 +05:30