add concept of code_knowledge and code_index

2025-06-12 15:41:17 +05:30 · 2025-06-12 15:41:17 +05:30 · 5c6a983087
commit 5c6a983087
parent fe30478f1f
2 changed files with 33 additions and 0 deletions
--- a/latest/Global.md
+++ b/latest/Global.md
@@ -92,6 +92,17 @@ This section is a comprehensive reference for each file in my Memory Bank, detai
  - **Update Frequency**: Continuously during a task.
  - **Update Triggers**: At the start, during, and end of every task.
 - **`code_index.json` - The "Code Skeleton" (Automated)**:
  - **Purpose**: An automatically generated, disposable index containing only a list of file names and the function names within them. It provides a fresh, accurate map of what exists and where.
  - **Update Frequency**: On-demand or periodically.
  - **CRITICAL RULE**: This file **MUST NOT** be edited manually. It is a cache to be overwritten.
 - **`code_knowledge.json` - The "Code Flesh" (AI-Managed)**:
  - **Purpose**: A persistent knowledge base of granular details and subtleties for specific code elements. It is a key-value store in which each key is a stable identifier (e.g., `filePath::functionName`) derived directly from an entry in `code_index.json`.
  - **Update Frequency**: Constantly, as new insights are discovered.
  - **CRITICAL RULE**: To find knowledge about a function, first locate it in `code_index.json` to get its structure, then use its stable identifier as a key to look up the corresponding deep knowledge in this file.
 *I am free to create additional files whenever I find them useful. Each specialized mode is free to create any number of files for the Memory Bank.*
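
As a rough illustration of how the two JSON files described above could relate, here is a minimal Python sketch. The file layout, the sample paths, the function names, and the note text are all hypothetical; the source only states that the index lists file names with their functions and that the knowledge store is keyed by an identifier such as `filePath::functionName`.

```python
# Hypothetical contents of code_index.json: file path -> function names.
code_index = {
    "src/payments.py": ["process_payment", "refund"],
    "src/users.py": ["create_user"],
}

# Hypothetical contents of code_knowledge.json: stable identifier -> notes.
code_knowledge = {
    "src/payments.py::process_payment": [
        "Example note: callers are expected to pass a request id."
    ],
}


def stable_id(file_path, function_name):
    """Build the assumed `filePath::functionName` identifier."""
    return f"{file_path}::{function_name}"


def lookup_knowledge(index, knowledge, function_name):
    """Locate the function in the index first, then fetch its notes by key.

    Returns a dict of stable identifier -> list of notes. A function that
    exists in the index but has no recorded knowledge maps to an empty list;
    a function missing from the index yields an empty dict.
    """
    results = {}
    for file_path, functions in index.items():
        if function_name in functions:
            key = stable_id(file_path, function_name)
            results[key] = knowledge.get(key, [])
    return results
```

Going through the index first means a stale entry in the knowledge store (for a function that no longer exists) is never returned, which is one possible reading of the lookup rule above.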
 ---
@@ -103,6 +114,11 @@ This section provides practical guidelines for applying my core doctrine and pro
 ### 4.1. Practical Workflow Blueprints
 - **Debugging (Audit Trail Approach)**: A systematic investigation process: Observe -> Hypothesize -> Execute & Document -> Iterate -> Synthesize.
 - **Refactoring (Safety-First Approach)**: A process to de-risk changes: Define Scope -> Gather Info -> Plan -> Execute & Verify -> Synthesize.
 - **Granular Code Analysis (Symbex Model)**: The standard method for linking conceptual knowledge to specific code.
    1.  **Consult the Skeleton**: Use `code_index.json` to get an up-to-date map of the code structure and find the stable identifier for a target function or class.
    2.  **Consult the Flesh**: Use the stable identifier to look up any existing granular knowledge, subtleties, or past observations in `code_knowledge.json`.
     3.  **Synthesize and Act**: Combine the structural awareness from the index with the deep knowledge from the knowledge base to inform the next action.
     4.  **Update the Flesh**: If a new, valuable, fine-grained insight is discovered, add it to `code_knowledge.json` under the appropriate stable identifier.
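
The four steps above could be sketched in Python roughly as follows. This is an assumption-laden outline, not the workflow's actual tooling: the helper names (`load_json`, `find_identifier`, `recall`, `record_insight`) and the on-disk JSON shapes are invented for illustration. Step 3 is left to the caller, since it is a judgment step rather than a file operation.

```python
import json
from pathlib import Path


def load_json(path):
    """Read a JSON file, returning an empty dict if it does not exist yet."""
    path = Path(path)
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def find_identifier(index, function_name):
    """Step 1: consult the skeleton and return the first matching stable id."""
    for file_path, functions in index.items():
        if function_name in functions:
            return f"{file_path}::{function_name}"
    return None


def recall(knowledge, identifier):
    """Step 2: consult the flesh for notes already recorded under the id."""
    return knowledge.get(identifier, [])


def record_insight(knowledge_path, identifier, insight):
    """Step 4: append a new insight and persist the knowledge file.

    Only the knowledge file is written; the index is never touched here.
    Duplicate insights are skipped.
    """
    knowledge = load_json(knowledge_path)
    notes = knowledge.setdefault(identifier, [])
    if insight not in notes:
        notes.append(insight)
    Path(knowledge_path).write_text(
        json.dumps(knowledge, indent=2, sort_keys=True), encoding="utf-8"
    )
    return notes
```

A caller would typically run `find_identifier`, then `recall`, act on the combined picture, and finish with `record_insight` only when something genuinely new was learned.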
 ### 4.2. Task Management Guidelines
 - **Creating a Task**: Update `currentTask.md` with objectives, a detailed plan, and an "Impact Analysis" for refactors.
--- a/memory_bank_best_practices.md
+++ b/memory_bank_best_practices.md
@@ -51,6 +51,23 @@ A proven architecture for structuring this knowledge consists of the following c
 This structured approach ensures that when the AI needs to perform a task, it can consult a specific, relevant document rather than parsing a massive, undifferentiated blob of text, leading to more accurate and context-aware actions.
 ### Distinguishing Between a Knowledge Base and a Code Index
 While the seven-file architecture provides a robust framework for conceptual knowledge, a mature system benefits from explicitly distinguishing between two types of information stores:
 *   **The Knowledge Base (e.g., `techContext.md`, `systemPatterns.md`)**: This is the source of truth for the *why* behind the project. It contains conceptual, synthesized information like architectural decisions, rationales, and approved patterns. It is resilient to minor code changes and is curated through disciplined workflows.
 *   **The Code Index (e.g., an auto-generated `code_index.json`)**: This is a disposable, automated map of the codebase. It answers the question *what* is *where*. It is highly precise but brittle, and should be treated as a cache that can be regenerated at any time. It should **never** be edited manually.
 **The Hybrid Model Best Practice**:
 The most effective approach is a hybrid model that leverages both:
 1.  **Maintain the Conceptual Knowledge Base**: Continue using the core memory bank files to document high-level, resilient knowledge.
 2.  **Introduce an Automated Code Index**: Use tools to periodically parse the codebase and generate a detailed index of files, classes, and functions. This index is used for fast, precise lookups.
 3.  **Bridge the Gap**: The AI uses the **Code Index** for discovery (e.g., "Where is the `processPayment` function?") and the **Knowledge Base** for understanding (e.g., "What is our standard pattern for payment processing?"). Insights gained during a task are synthesized and added to the Knowledge Base, not the temporary index.
 This separation of concerns provides the precision of a detailed index without the overhead of maintaining one by hand, while preserving the deep, conceptual knowledge that is crucial for long-term development.
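
To make the "automated code index" idea concrete, the following is a minimal sketch of one way such an index might be generated for a Python codebase using the standard-library `ast` module. The source does not name a specific tool or schema, so the function names (`index_source`, `build_index`, `write_index`) and the output shape (relative file path mapped to a sorted list of names) are assumptions.

```python
import ast
import json
from pathlib import Path


def index_source(source):
    """Return the sorted names of all functions and classes in Python source.

    ast.walk visits nested nodes too, so methods and inner functions are
    included alongside top-level definitions.
    """
    tree = ast.parse(source)
    names = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.append(node.name)
    return sorted(names)


def build_index(root):
    """Map each .py file under root (by POSIX relative path) to its names."""
    root = Path(root)
    index = {}
    for path in sorted(root.rglob("*.py")):
        try:
            names = index_source(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be parsed
        index[path.relative_to(root).as_posix()] = names
    return index


def write_index(root, out_path="code_index.json"):
    """Regenerate the index from scratch, overwriting any previous file."""
    index = build_index(root)
    Path(out_path).write_text(json.dumps(index, indent=2), encoding="utf-8")
    return index
```

Because `write_index` always rebuilds from the source tree and overwrites the output, the file behaves like the disposable cache described above: manual edits would simply be lost on the next run.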
 ## 2. Contextual Retrieval for Development Tasks
 Retrieval-Augmented Generation (RAG) is the process of fetching relevant information from a knowledge base to augment the AI's context before it generates a response. For software development, this is not a one-size-fits-all problem. The optimal retrieval strategy depends heavily on the specific task (e.g., debugging vs. refactoring).