# Migration Runbook: mem0 v0.1.x/v1.x → v2.0.2 (the "V3 pipeline")

This runbook covers the **operational** half of the migration: backups, the Qdrant collection rebuild for BM25, the Neo4j dump, cutover and rollback. The **code-level** half (Dockerfile, requirements, mem0_manager rewrites, etc.) is already committed on this branch; this document is what to follow when taking those code changes to a stack that has live data.

## TL;DR

1. Snapshot Qdrant + dump Neo4j (Phase 2).
2. Deploy v2 backend to a scratch stack (Phase 3).
3. Rebuild the Qdrant collection with BM25 by warm-up-add → scroll/upsert → swap (Phase 4).
4. Run integration tests (Phase 5).
5. Cutover production with the same steps inside a maintenance window (Phase 6).

## Phase 1 — Pre-flight

```bash
# 1. Capture the exact running mem0ai version (record for the ticket)
docker compose exec backend pip show mem0ai

# 2. Tag the pre-migration commit
git tag pre-mem0-v3-migration && git push --tags

# 3. Verify free disk on the volumes (snapshots can be sizeable)
docker system df -v | grep -E "qdrant_data|neo4j_data"
```

## Phase 2 — Backups (read-only on production)

### Qdrant snapshot

```bash
# Create snapshot of the live mem0 collection (qdrant runs inside the network as 'qdrant')
docker compose exec backend curl -X POST \
  "http://qdrant:6333/collections/mem0/snapshots?wait=true"
# Returns JSON with the snapshot filename, e.g. mem0-XXXXXXXX.snapshot.

# Copy it off the qdrant container's volume:
docker compose exec qdrant ls /qdrant/storage/collections/mem0/snapshots/
mkdir -p ./backups/qdrant
docker cp mem0-qdrant:/qdrant/storage/collections/mem0/snapshots/ ./backups/qdrant/
```

### Neo4j offline dump (decommission path)

Neo4j 5.x requires the database to be stopped to dump it.
```bash
mkdir -p ./backups/neo4j
docker compose stop neo4j
docker run --rm \
  --volumes-from mem0-neo4j \
  -v "$(pwd)/backups/neo4j:/dumps" \
  neo4j:5.26.4 \
  neo4j-admin database dump neo4j --to-path=/dumps
# (No need to restart neo4j; it is being decommissioned.)
```

Keep both backups for **at least 30 days** post-cutover. Calendar a reminder.

### Pre-cutover per-user memory counts

```bash
# Iterate API_KEYS users, hit /stats/{user_id}, save the count. Adjust per your auth.
# `[.[]]` gathers the values into an array first; jq's `unique` rejects a bare object.
for user in $(jq -r '[.[]] | unique[]' <<< "$API_KEYS"); do
  echo -n "$user: "
  docker compose exec backend curl -s -H "X-API-Key: " \
    "http://localhost:8000/stats/$user" | jq -r '.memory_count // 0'
done > pre-cutover-counts.txt
```

## Phase 3 — Deploy v2 backend (scratch stack first)

Use a developer or staging machine with a **restored copy** of the prod snapshot, not prod itself.

```bash
# Restore the prod snapshot onto the scratch Qdrant
docker compose exec backend curl -X POST \
  "http://qdrant:6333/collections/mem0_legacy/snapshots/upload?priority=snapshot" \
  -H "Content-Type: multipart/form-data" \
  -F "snapshot=@/backups/qdrant/"
# (or restore as 'mem0' if you want to start with the legacy name)

# Build + start the v2 backend
docker compose build --no-cache backend
docker compose up -d
docker compose logs -f backend
```

Watch for:

- `Applied Claude/OpenAI-compatible patch: cleared top_p (and store)`: patch loaded.
- `Initialized ultra-minimal Mem0Manager with custom endpoint`: startup OK.
- No errors mentioning `graph_store` or `enable_graph` (we removed them).
- On any search/get_all: no `ValueError` from filters.

## Phase 4 — Rebuild the Qdrant collection for BM25

Pre-v2 collections lack the `bm25` sparse-vector slot. mem0 v2 silently downgrades to semantic-only search on them, so to get full hybrid search you must recreate the collection.

```bash
# 1. Set the env var so the v2 backend creates a NEW collection with the right schema
docker compose exec backend sh -c 'QDRANT_COLLECTION_NAME=mem0_v3 \
  python -c "from mem0_manager import mem0_manager; \
  mem0_manager.memory.add([{\"role\":\"user\",\"content\":\"warm-up\"}], user_id=\"__warmup__\")"'
# This lazy-creates mem0_v3 + mem0_v3_entities with the bm25 slot.
# (Delete the warm-up memory after if you care.)

# 2. Run the migration script: preserves id + vector + payload, no re-embed
docker compose exec backend python /app/../scripts/migrate_qdrant_to_v3.py \
  --source mem0 --target mem0_v3 \
  --qdrant-host qdrant --qdrant-port 6333 --dry-run
# Inspect the per-user counts. If OK, run for real:
docker compose exec backend python /app/../scripts/migrate_qdrant_to_v3.py \
  --source mem0 --target mem0_v3 \
  --qdrant-host qdrant --qdrant-port 6333

# 3. Swap names. Qdrant has no in-place rename; use snapshot+upload.
#    Snapshot mem0_v3, upload as mem0_swap, then snapshot mem0 as mem0_legacy, then
#    upload mem0_swap as mem0. Or simply point QDRANT_COLLECTION_NAME at mem0_v3 in
#    docker-compose.yml and keep `mem0` around as the legacy backup.
```

Easiest path: **leave the legacy collection alone** and update `QDRANT_COLLECTION_NAME` to `mem0_v3` in `.env` / `docker-compose.yml`. The legacy `mem0` collection sits there as an extra backup until you delete it.

## Phase 5 — Integration tests

```bash
MEM0_API_KEY= python test_integration.py -v
```

The test script generates a fresh `TEST_USER` per run; make sure the supplied API key maps to that user (see CLAUDE.md "There are no unit tests..." note).

Expected: all pass. The `/graph/relationships/{user_id}` test should accept the new `deprecated: true` payload.

## Phase 6 — Production cutover

Maintenance window ~30 min.

1. Communicate the window.
2. Re-snapshot Qdrant immediately before the deploy (so the rollback snapshot is the freshest possible).
3. `git pull` the migration branch (or merge to main first).
4. `docker compose build --no-cache backend && docker compose up -d backend`.
5. Run the Phase 4 collection rebuild on prod.
6. Smoke test: `/health`, one `/chat` round-trip, one `/memories` write, one `/memories/search` read.
7. Verify per-user counts match `pre-cutover-counts.txt` (use the same loop).

## Rollback

### Before the first v2 write hits prod (fully safe)

```bash
git revert
docker compose build --no-cache backend
docker compose up -d backend
```

### After cutover but snapshot still on disk (loses post-cutover writes)

```bash
# Stop the backend so no more writes land on the v2 collection
docker compose stop backend

# Restore the pre-cutover Qdrant snapshot to a fresh name, then swap
docker compose exec qdrant curl -X POST \
  "http://qdrant:6333/collections/mem0_rollback/snapshots/upload?priority=snapshot" \
  -H "Content-Type: multipart/form-data" \
  -F "snapshot=@/qdrant/snapshots/"
# Update QDRANT_COLLECTION_NAME=mem0_rollback or rename via snapshot+upload.

# Restore Neo4j if needed
docker run --rm \
  --volumes-from mem0-neo4j \
  -v "$(pwd)/backups/neo4j:/dumps" \
  neo4j:5.26.4 \
  neo4j-admin database load neo4j --from-path=/dumps --overwrite-destination=true

# Revert code and restart
git revert
docker compose build --no-cache backend
docker compose up -d backend
```

### After snapshot retention expires

**Irreversible.** Keep the pre-cutover snapshot and Neo4j dump for ≥30 days.
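
Before deleting anything from `./backups`, it may help to check mechanically which files are actually past the 30-day window. The following is a minimal sketch, not part of the committed migration code: the function name `backups_past_retention` is hypothetical, and it assumes file modification time is an acceptable stand-in for the cutover date. Because the backups are taken shortly before cutover, a file can be flagged slightly earlier than 30 days post-cutover, so treat the output as a prompt to double-check rather than a go-ahead to delete.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path

# Retention period stated in this runbook.
RETENTION = timedelta(days=30)


def backups_past_retention(root, now=None, retention=RETENTION):
    """Return files under `root` whose modification time is older than `retention`.

    Assumption: mtime approximates when the backup was taken.
    """
    now = now or datetime.now(timezone.utc)
    root = Path(root)
    if not root.is_dir():
        return []
    expired = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        if now - modified > retention:
            expired.append(path)
    return expired


if __name__ == "__main__":
    # ./backups is the directory used by the Phase 2 commands.
    for path in backups_past_retention("./backups"):
        print(f"past 30-day retention: {path}")
```

An empty result means every backup is still inside the retention window and the "snapshot still on disk" rollback path remains available.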