
Mem0 Interface Backend

A production-ready FastAPI backend that provides intelligent memory integration with Mem0, featuring comprehensive memory management, real-time monitoring, and enterprise-grade observability.

🏗️ Architecture Overview

Core Components

  1. Mem0Manager (mem0_manager.py)

    • Central orchestration of memory operations
    • Integration with custom OpenAI-compatible endpoint
    • Memory persistence across PostgreSQL and Neo4j
    • Performance timing and operation tracking
  2. Configuration System (config.py)

    • Environment-based configuration management
    • Database connection management
    • Security and CORS settings
    • API endpoint configuration
  3. API Layer (main.py)

    • RESTful endpoints for all memory operations
    • Request middleware with correlation IDs
    • Real-time statistics and monitoring endpoints
    • Enhanced error handling and logging
  4. Data Models (models.py)

    • Pydantic models for request/response validation
    • Statistics and monitoring response models
    • Type safety and automatic documentation
  5. Monitoring System (monitoring.py)

    • Thread-safe statistics collection
    • Performance timing decorators
    • Correlation ID generation
    • Real-time analytics tracking

Database Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   PostgreSQL    │    │     Neo4j       │    │  Custom LLM     │
│   (pgvector)    │    │   (APOC)        │    │   Endpoint      │
├─────────────────┤    ├─────────────────┤    ├─────────────────┤
│ • Vector Store  │    │ • Graph Store   │    │ • claude-sonnet-4│
│ • Embeddings    │    │ • Relationships │    │ • Gemini Embed │
│ • Memory History│    │ • Entity Links  │    │ • Single Model  │
│ • Metadata      │    │ • APOC Functions│    │ • Reliable API  │
└─────────────────┘    └─────────────────┘    └─────────────────┘
        │                       │                       │
        └───────────────────────┼───────────────────────┘
                                │
                    ┌─────────────────┐
                    │   Mem0 Manager  │
                    │                 │
                    │ • Memory Ops    │
                    │ • Performance   │
                    │ • Monitoring    │
                    │ • Analytics     │
                    └─────────────────┘
                                │
                    ┌─────────────────┐
                    │  Monitoring     │
                    │                 │
                    │ • Correlation   │
                    │ • Timing        │
                    │ • Statistics    │
                    │ • Health        │
                    └─────────────────┘
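
In code, this wiring corresponds roughly to the Mem0 configuration below. This is a sketch assuming Mem0's standard Memory.from_config API; exact provider keys vary between Mem0 versions, and the real configuration is assembled in config.py and mem0_manager.py.

from mem0 import Memory

# Illustrative wiring of the three stores shown in the diagram above.
config = {
    "llm": {
        "provider": "openai",  # OpenAI-compatible custom endpoint
        "config": {
            "model": "claude-sonnet-4",
            "openai_base_url": "https://your-openai-compatible-endpoint.com/v1",
        },
    },
    "embedder": {
        "provider": "gemini",
        "config": {"model": "models/gemini-embedding-001", "embedding_dims": 1536},
    },
    "vector_store": {
        "provider": "pgvector",
        "config": {"host": "postgres", "port": 5432, "dbname": "mem0_db"},
    },
    "graph_store": {
        "provider": "neo4j",
        "config": {"url": "bolt://neo4j:7687", "username": "neo4j"},
    },
}

memory = Memory.from_config(config)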

Production Monitoring

@timed("operation_name")  # Automatic timing and logging
async def memory_operation():
    # Operation with full observability
    pass

# Request tracing with correlation IDs
correlation_id = generate_correlation_id()  # 8-char unique ID

# Real-time statistics collection
stats.record_api_call(user_id, response_time_ms)
stats.record_memory_operation("add")

Monitoring Features:

  • Request correlation IDs for end-to-end tracing
  • Performance timing for all operations
  • Real-time API usage statistics
  • Memory operation breakdown
  • Error tracking with context
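
For illustration, a decorator like timed above could be implemented along these lines (a hypothetical sketch, not the actual code in monitoring.py):

import functools
import logging
import time

logger = logging.getLogger(__name__)

def timed(operation_name: str):
    """Time an async operation and log its duration (illustrative sketch)."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return await func(*args, **kwargs)
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000
                logger.info("%s took %.2f ms", operation_name, elapsed_ms)
        return wrapper
    return decorator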

🚀 Features

Core Memory Operations

  • Intelligent Chat: Context-aware conversations with memory
  • Memory CRUD: Add, search, update, delete memories (see the call sketch after this list)
  • Semantic Search: Vector-based similarity search
  • Graph Relationships: Entity extraction and relationship mapping
  • User Isolation: Separate memory spaces per user_id
  • Memory History: Track memory evolution over time
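
In Mem0 terms, these operations map to library calls along the following lines (a sketch; exact signatures can differ across Mem0 versions, and mem0_manager.py wraps them with timing and monitoring):

# Assumes a `memory` instance as in the configuration sketch above.
memory.add(
    [{"role": "user", "content": "My name is Alice"}],
    user_id="alice",  # user isolation: each user_id has its own memory space
)
results = memory.search("What is my name?", user_id="alice")  # semantic search
all_memories = memory.get_all(user_id="alice")
memory.update(memory_id="<memory-id>", data="Alice prefers dark mode")
history = memory.history(memory_id="<memory-id>")  # memory evolution over time
memory.delete(memory_id="<memory-id>")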

Production Features

  • Custom Endpoint Integration: Full support for OpenAI-compatible endpoints
  • Real-time Monitoring: Performance timing and usage analytics
  • Request Tracing: Correlation IDs for end-to-end debugging
  • Statistics APIs: Global and user-specific metrics endpoints
  • Graph Memory: Neo4j-powered relationship tracking
  • Health Monitoring: Comprehensive service health checks
  • Structured Logging: Enhanced error tracking and debugging

📁 File Structure

backend/
├── main.py              # FastAPI application and endpoints
├── mem0_manager.py      # Core Mem0 integration with timing
├── monitoring.py        # Observability and statistics system
├── config.py            # Configuration management
├── models.py            # Pydantic data models (including stats)
├── requirements.txt     # Python dependencies
├── Dockerfile           # Container configuration
└── README.md            # This file

🔧 Configuration

Environment Variables

Variable               Description                            Default
OPENAI_COMPAT_API_KEY  Your custom endpoint API key           -
OPENAI_BASE_URL        Custom endpoint URL                    -
EMBEDDER_API_KEY       Google Gemini API key for embeddings   -
POSTGRES_HOST          PostgreSQL host                        postgres
POSTGRES_PORT          PostgreSQL port                        5432
POSTGRES_DB            Database name                          mem0_db
POSTGRES_USER          Database user                          mem0_user
POSTGRES_PASSWORD      Database password                      -
NEO4J_URI              Neo4j connection URI                   bolt://neo4j:7687
NEO4J_USERNAME         Neo4j username                         neo4j
NEO4J_PASSWORD         Neo4j password                         -
DEFAULT_MODEL          Default model for general use          claude-sonnet-4
LOG_LEVEL              Logging level                          INFO
CORS_ORIGINS           Allowed CORS origins                   http://localhost:3000

Variables without a default (-) are required.
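
As an illustration, config.py could consume these variables with pydantic settings along the following lines (a sketch assuming the pydantic-settings package; the actual field set lives in config.py):

from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    # Field names mirror the environment variables above (illustrative subset).
    openai_compat_api_key: str
    openai_base_url: str
    postgres_host: str = "postgres"
    postgres_port: int = 5432
    postgres_db: str = "mem0_db"
    neo4j_uri: str = "bolt://neo4j:7687"
    default_model: str = "claude-sonnet-4"
    log_level: str = "INFO"
    cors_origins: str = "http://localhost:3000"

settings = Settings()  # values are read from the environment / .env file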

Production Configuration

Current production setup uses a simplified, reliable architecture:

# Single model approach for stability
DEFAULT_MODEL = "claude-sonnet-4"

# Embeddings via Google Gemini
EMBEDDER_CONFIG = {
    "provider": "gemini",
    "model": "models/gemini-embedding-001",
    "embedding_dims": 1536
}

# Monitoring configuration
MONITORING_CONFIG = {
    "correlation_ids": True,
    "operation_timing": True,
    "statistics_collection": True,
    "slow_request_threshold": 2000  # ms
}

🔗 API Endpoints

Core Chat

  • POST /chat - Enhanced chat with memory integration

Memory Management

  • POST /memories - Add memories manually
  • POST /memories/search - Search memories semantically
  • GET /memories/{user_id} - Get user memories
  • PUT /memories - Update specific memory
  • DELETE /memories/{memory_id} - Delete memory
  • DELETE /memories/user/{user_id} - Delete all user memories

Graph Operations

  • GET /graph/relationships/{user_id} - Get user relationship graph

Monitoring & Analytics

  • GET /stats - Global application statistics
  • GET /stats/{user_id} - User-specific metrics
  • GET /health - Service health check
  • GET /models - Available models and configuration

🏃‍♂️ Quick Start

  1. Set up environment:
cp ../.env.example ../.env
# Edit .env with your configuration
  2. Install dependencies:
pip install -r requirements.txt
  3. Run the server:
uvicorn main:app --reload --host 0.0.0.0 --port 8000
  4. Check health:
curl http://localhost:8000/health
  5. View API docs: open http://localhost:8000/docs in your browser (FastAPI's interactive Swagger UI)

🧪 Testing

Verified Test Results (2025-01-05)

All core functionality has been tested and verified as working:

1. Health Check

curl http://localhost:8000/health
# Result: All memory services show "healthy" status
# - openai_endpoint: healthy
# - memory_o4-mini: healthy  
# - memory_gemini-2.5-pro: healthy
# - memory_claude-sonnet-4: healthy
# - memory_o3: healthy

2. Memory Operations

# Add Memory
curl -X POST http://localhost:8000/memories \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"My name is Alice"}],"user_id":"alice"}'
# Result: Memory successfully extracted and stored with graph relationships

# Search Memory  
curl -X POST http://localhost:8000/memories/search \
  -H "Content-Type: application/json" \
  -d '{"query":"Alice","user_id":"alice"}'
# Result: Returns stored memory with similarity score 0.097

3. Memory-Enhanced Chat

curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"message":"What do you remember about me?","user_id":"alice"}'
# Result: "I remember that your name is Alice" 
# memories_used: 1 (successfully retrieved and used stored memory)

4. Intelligent Model Routing

# Simple task → o4-mini
curl -X POST http://localhost:8000/chat \
  -d '{"message":"Hi there","user_id":"test"}'
# Result: model_used: "o4-mini", complexity: "simple"

# Expert task → o3  
curl -X POST http://localhost:8000/chat \
  -d '{"message":"Please analyze the pros and cons of microservices","user_id":"test"}'
# Result: model_used: "o3", complexity: "expert"

5. Neo4j Vector Functions

# Verified Neo4j 5.18 vector.similarity.cosine() function
docker exec mem0-neo4j cypher-shell -u neo4j -p mem0_neo4j_password \
  "RETURN vector.similarity.cosine([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]) AS similarity;"
# Result: similarity = 1.0 (perfect match)

Key Integration Points Verified

  • Ollama Embeddings: nomic-embed-text:latest working for vector generation
  • PostgreSQL + pgvector: Vector storage and similarity search operational
  • Neo4j 5.18: Graph relationships and native vector functions working
  • Custom LLM Endpoint: All 4 models accessible and routing correctly
  • Memory Persistence: Data survives container restarts via Docker volumes

🔍 Production Monitoring

Structured Logging with Correlation IDs

  • JSON-formatted logs with correlation IDs for request tracing (see the middleware sketch below)
  • Performance timing for all operations
  • Enhanced error tracking with operation context
  • Slow request detection (>2 seconds)
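
A correlation-ID middleware of this kind might look roughly like the following (a sketch; the actual middleware lives in main.py):

import uuid
from fastapi import FastAPI, Request

app = FastAPI()

@app.middleware("http")
async def add_correlation_id(request: Request, call_next):
    # 8-character unique ID attached to every request for end-to-end tracing
    correlation_id = uuid.uuid4().hex[:8]
    request.state.correlation_id = correlation_id
    response = await call_next(request)
    response.headers["X-Correlation-ID"] = correlation_id
    return response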

Real-time Statistics Endpoints

Global Statistics (GET /stats)

{
  "total_memories": 0,
  "total_users": 1,
  "api_calls_today": 5,
  "avg_response_time_ms": 7106.26,
  "memory_operations": {
    "add": 1,
    "search": 2,
    "update": 0,
    "delete": 0
  },
  "uptime_seconds": 137.1
}

User Analytics (GET /stats/{user_id})

{
  "user_id": "alice",
  "memory_count": 2,
  "relationship_count": 2,
  "last_activity": "2025-08-10T11:01:45.887157+00:00",
  "api_calls_today": 1,
  "avg_response_time_ms": 23091.93
}

Health Monitoring (GET /health)

{
  "status": "healthy",
  "services": {
    "openai_endpoint": "healthy",
    "mem0_memory": "healthy"
  },
  "timestamp": "2025-08-10T11:01:05.734615"
}

Performance Tracking Features

  • Correlation IDs: 8-character unique identifiers for request tracing
  • Operation Timing: Automatic timing for all memory operations
  • Statistics Collection: Thread-safe in-memory analytics (sketched below)
  • Error Context: Enhanced error messages with operation details
  • Slow Request Alerts: Automatic logging of requests >2 seconds
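
The thread-safe collector could be structured along these lines (a hypothetical sketch matching the record_* calls shown earlier, not the actual monitoring.py code):

import threading
from collections import defaultdict

class StatsCollector:
    """In-memory, thread-safe analytics (illustrative sketch)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._api_calls = defaultdict(int)       # user_id -> call count
        self._response_times_ms: list[float] = []
        self._memory_ops = defaultdict(int)      # "add"/"search"/... -> count

    def record_api_call(self, user_id: str, response_time_ms: float) -> None:
        with self._lock:
            self._api_calls[user_id] += 1
            self._response_times_ms.append(response_time_ms)

    def record_memory_operation(self, operation: str) -> None:
        with self._lock:
            self._memory_ops[operation] += 1

    def snapshot(self) -> dict:
        with self._lock:
            count = len(self._response_times_ms)
            avg = sum(self._response_times_ms) / count if count else 0.0
            return {
                "total_users": len(self._api_calls),
                "avg_response_time_ms": round(avg, 2),
                "memory_operations": dict(self._memory_ops),
            }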

🐛 Troubleshooting

Resolved Issues & Solutions

1. Neo4j Vector Function Error RESOLVED

  • Problem: Unknown function 'vector.similarity.cosine'
  • Root Cause: Neo4j 5.15 doesn't support vector functions (introduced in 5.18)
  • Solution: Upgraded to Neo4j 5.18-community
  • Fix Applied: Updated docker-compose.yml: image: neo4j:5.18-community

2. Environment Variable Override RESOLVED

  • Problem: Shell environment variables overriding .env file
  • Root Cause: ~/.zshrc exports took precedence over Docker Compose .env
  • Solution: Set values directly in docker-compose.yml environment section
  • Fix Applied: Hard-coded API keys in docker-compose.yml

3. Model Availability Issues RESOLVED

  • Problem: gemini-2.5-pro showing as unavailable
  • Root Cause: Incorrect API endpoint configuration
  • Solution: Verified models with /v1/models endpoint, updated API keys
  • Fix Applied: All models (o4-mini, gemini-2.5-pro, claude-sonnet-4, o3) are now operational

4. Memory Initialization Failures RESOLVED

  • Problem: "No memory instance available" errors
  • Root Cause: Neo4j container starting after backend, vector functions missing
  • Solution: Sequential startup + Neo4j 5.18 upgrade
  • Fix Applied: All memory instances now healthy

Current Known Working Configuration

Docker Compose Settings

neo4j:
  image: neo4j:5.18-community  # Critical: Must be 5.18+ for vector functions
  
backend:
  environment:
    OPENAI_API_KEY: sk-your-api-key-here  # Set in docker-compose.yml
    OPENAI_BASE_URL: https://your-openai-compatible-endpoint.com/v1

Dependency Requirements

  • Neo4j 5.18+ (for vector.similarity.cosine function)
  • Ollama running locally with nomic-embed-text:latest
  • PostgreSQL with pgvector extension
  • Valid API keys for custom LLM endpoint

Debugging Commands

# Check Neo4j vector function availability
docker exec mem0-neo4j cypher-shell -u neo4j -p mem0_neo4j_password \
  "RETURN vector.similarity.cosine([1.0, 0.0], [1.0, 0.0]) AS test;"

# Verify API endpoint models
curl -H "Authorization: Bearer $API_KEY" $BASE_URL/v1/models | jq '.data[].id'

# Check Ollama embeddings
curl http://host.docker.internal:11434/api/tags

# Monitor backend logs  
docker logs mem0-backend --tail 20 -f

🔒 Security Considerations

  • API keys are loaded from environment variables
  • CORS is configured for specified origins
  • Input validation via Pydantic models
  • Structured logging excludes sensitive data
  • Database connections use authentication

📈 Performance

Optimizations Implemented

  • Model-specific parameter tuning
  • Intelligent routing to reduce costs
  • Memory caching within Mem0
  • Efficient database queries
  • Structured logging for monitoring

Expected Performance

  • Sub-50ms memory retrieval (with optimized setup)
  • 90% token reduction through smart context injection
  • Intelligent model routing for cost efficiency

🔄 Development

Adding New Models

  1. Add model configuration in config.py
  2. Update routing logic in mem0_manager.py
  3. Add model-specific parameters
  4. Test with health checks

Adding New Endpoints

  1. Define Pydantic models in models.py (see the skeleton after this list)
  2. Implement logic in mem0_manager.py
  3. Add FastAPI endpoint in main.py
  4. Update documentation and tests
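
In skeleton form, steps 1-3 might look like this (all names below are illustrative, not actual project code):

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# 1. models.py -- hypothetical request/response models
class ExportRequest(BaseModel):
    user_id: str

class ExportResponse(BaseModel):
    user_id: str
    memories: list[dict]

# 2. mem0_manager.py -- hypothetical business logic
async def export_memories(user_id: str) -> list[dict]:
    # Fetch and serialize the user's memories via Mem0
    return []

# 3. main.py -- endpoint wiring the model and logic together
@app.post("/memories/export", response_model=ExportResponse)
async def export_endpoint(request: ExportRequest) -> ExportResponse:
    memories = await export_memories(request.user_id)
    return ExportResponse(user_id=request.user_id, memories=memories)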

📄 License

This POC is designed for demonstration and evaluation purposes.