pratik/knowledge-base

Commit graph

Author

SHA1

Message

Date

Pratik Narola

e99b382b16

fix: restore_test.sh uses upload endpoint and tightens source-file glob

file:// snapshot recovery is disabled by default in recent Qdrant
(returns 403 on /collections/.../snapshots/recover with a file:// URL).
Switched to POST /snapshots/upload with multipart form-data which
doesn't need an allowlist.
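
A minimal sketch of what the multipart upload request might look like, built with the standard library only and never sent. The base URL, the collection name `restore_test_tmp`, the collection-scoped path, and the form field name `snapshot` are assumptions for illustration and are not taken from restore_test.sh.

```python
import urllib.request
import uuid


def build_snapshot_upload_request(base_url, collection, filename, payload):
    """Build (but do not send) a multipart POST carrying a snapshot file.

    Assumes the upload endpoint is collection-scoped and expects the file
    in a form field called "snapshot"; both are illustrative assumptions.
    """
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        (
            'Content-Disposition: form-data; name="snapshot"; '
            f'filename="{filename}"\r\n'
        ).encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        payload,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    url = f"{base_url.rstrip('/')}/collections/{collection}/snapshots/upload"
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


# Hypothetical values; the request object is only constructed here.
req = build_snapshot_upload_request(
    "http://localhost:6333/", "restore_test_tmp", "mem0_v3.snapshot", b"\x00\x01"
)
```

Because the snapshot bytes travel in the request body, the server never has to read a local path, which is why no file:// allowlist is involved.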

Also tightened the find -name glob from "${SOURCE_COLLECTION}_*" to
"${SOURCE_COLLECTION}_[0-9]*" so a source named "mem0_v3" does not
accidentally match "mem0_v3_entities_*" files in the same dir.
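
The difference between the two globs can be checked with Python's `fnmatch`, which follows the same shell-style pattern rules as `find -name`. The file names below are hypothetical; only the two patterns come from the commit message.

```python
from fnmatch import fnmatchcase

source_collection = "mem0_v3"

# Hypothetical snapshot files sitting in the same backup directory.
files = [
    "mem0_v3_2026-05-23.snapshot",
    "mem0_v3_entities_2026-05-23.snapshot",
]

loose_pattern = f"{source_collection}_*"       # old glob
tight_pattern = f"{source_collection}_[0-9]*"  # new glob

loose_matches = [f for f in files if fnmatchcase(f, loose_pattern)]
tight_matches = [f for f in files if fnmatchcase(f, tight_pattern)]

print(loose_matches)  # both files match the old glob
print(tight_matches)  # only the mem0_v3 snapshot matches the new glob
```

The tightened glob relies on the character right after the collection name and underscore being a digit, so it would need revisiting if file names ever stopped starting with a date or timestamp.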

2026-05-23 19:57:41 +05:30

Pratik Narola

06875473b2

feat: enable reranker, automate backups, tune extraction prompt

Three S-effort wins from the post-migration audit:

#1 Enable Cohere reranker on both Memory.search call sites
   (rerank=True), over-fetch top_k=max(limit*3, 30) to give the
   reranker a 30-50 candidate pool, then truncate to the caller's
   limit. Bump reranker config to rerank-v3.5 (4096 ctx, multilingual
   — matters for Hindi/Hinglish traffic) and top_n 10 → 50 so the
   output cap doesn't truncate below typical over-fetch sizes. Cohere
   was configured but never invoked; this is the single biggest
   quality lift the audit surfaced.
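
The over-fetch and truncate step described in #1 can be sketched as follows. `search_fn` is a hypothetical stand-in for the reranked search call, not mem0's actual signature; only the `max(limit*3, 30)` sizing rule is taken from the commit message.

```python
from typing import Callable, List


def candidate_pool_size(limit: int) -> int:
    """Number of candidates to request so the reranker has a useful pool."""
    return max(limit * 3, 30)


def search_with_overfetch(
    query: str,
    limit: int,
    search_fn: Callable[[str, int], List[object]],
) -> List[object]:
    """Over-fetch, let search_fn return reranked hits, then cut to limit.

    search_fn(query, top_k) is assumed to return hits already ordered by
    reranker score, best first.
    """
    hits = search_fn(query, candidate_pool_size(limit))
    return hits[:limit]


def fake_search(query: str, top_k: int) -> List[int]:
    # Stand-in backend that returns top_k integer "hits" in ranked order.
    return list(range(top_k))


top_five = search_with_overfetch("project deadlines", 5, fake_search)
```

With this rule a caller asking for 5 or 10 results gets a pool of 30, and a caller asking for 15 gets 45, which is why a reranker output cap of 10 would have cut the pool short.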

#2 Add scripts/backup_qdrant.sh and scripts/restore_test.sh. Daily
   snapshot of both collections back-to-back, docker cp to local
   YYYY-MM-DD dir, optional rclone off-host, prune local >14d, emit
   Prometheus textfile metric. Weekly restore_test.sh restores into a
   transient collection and asserts point count parity. Closes the
   zero-automated-backup gap.
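
The 14-day local retention in #2 could be expressed as below. This is a sketch under the assumption that age is judged from the YYYY-MM-DD directory name; backup_qdrant.sh may instead use file modification times, and the function name is hypothetical.

```python
from datetime import date, timedelta
from typing import Iterable, List


def dirs_to_prune(dir_names: Iterable[str], today: date, keep_days: int = 14) -> List[str]:
    """Return dated backup directories strictly older than keep_days."""
    cutoff = today - timedelta(days=keep_days)
    stale = []
    for name in dir_names:
        try:
            stamp = date.fromisoformat(name)
        except ValueError:
            # Not a dated backup directory; leave it alone.
            continue
        if stamp < cutoff:
            stale.append(name)
    return sorted(stale)


stale_dirs = dirs_to_prune(
    ["2026-05-08", "2026-05-09", "2026-05-23", "notes"],
    today=date(2026, 5, 23),
)
```

Directories whose names do not parse as dates are skipped rather than deleted, which keeps the prune step from touching anything it did not create.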

#3 Add CUSTOM_FACT_EXTRACTION_INSTRUCTIONS, wired via MemoryConfig's
   custom_instructions field. mem0 appends this as its own
   '## Custom Instructions' section in the additive-extraction user
   prompt (verified against generate_additive_extraction_prompt) —
   does not replace mem0's role/format guidance. Re-prioritizes the
   default consumer-organizer few-shots toward work/projects/
   relationships/recurring context, the actual usage pattern here.
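
The append behaviour described in #3 can be illustrated with a toy prompt builder. This is not mem0's code: the function, the base prompt text, and the instruction string are all hypothetical, and only the '## Custom Instructions' heading comes from the commit message.

```python
def append_custom_instructions(base_prompt: str, custom_instructions: str) -> str:
    """Append custom instructions as their own section, leaving the base intact."""
    custom = custom_instructions.strip()
    if not custom:
        return base_prompt
    return f"{base_prompt.rstrip()}\n\n## Custom Instructions\n{custom}\n"


# Hypothetical stand-ins for the library's prompt and the configured constant.
base = "## Role\nExtract durable facts from the conversation.\n"
custom = "Prioritize work, projects, relationships, and recurring context."

prompt = append_custom_instructions(base, custom)
```

The point of the sketch is that the base guidance survives unchanged and the custom text only adds a trailing section.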

2026-05-23 19:53:59 +05:30

2 commits