Building Human-friendly RAG: Benchmarking Six Major LLMs on Structured Knowledge Extraction


Tags: RAG, LLM, Claude, ChatGPT, Qwen, Doubao, Grok, Gemini, Logseq, Benchmark
Ian Chou
[Figure 1: Claude RAG Benchmark]

Introduction: When RAG Meets Knowledge Graphs

When building a personal knowledge base (like Logseq) or developing a Human-friendly RAG (Retrieval-Augmented Generation) system, the biggest challenge we face is often not “information acquisition,” but rather “information structuring” and “precision.”

Recently, I conducted an experiment: I asked six mainstream AI models (Claude Opus 4.5, ChatGPT 5.1, Qwen 3 Max, Doubao 1.6, Grok 4.1, and Gemini 3.0 Pro) to generate structured knowledge cards suitable for Logseq, focusing on one of the most misunderstood concepts in React—useEffect.

The results were surprising. While all models could generate correct code, there was a significant disparity in their ability to “construct mental models” and “structure knowledge.”

The Winner: Why Claude Opus 4.5 is the Best Choice (9.5/10)

Among all models, Claude demonstrated “Senior Architect” level thinking. It didn’t just explain the syntax; it reshaped our understanding of the technology.

1. Precision of Definitions

This is where Claude pulled ahead. Look at its description of the essence of useEffect:

Claude: “The essence of useEffect is not a lifecycle hook, but a ‘Reactive Data Pipeline’, used to synchronize external systems with React state.”

Other Models: Mostly stuck at the level of “similar to componentDidMount” or “handling side effects.”

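This phrasing matters for RAG systems: a precise, distinctive definition gives semantic retrieval a clearer target than a generic one such as “handling side effects,” and a card that states its claim explicitly makes it easier to explain why it was retrieved.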

2. Reshaping the Mental Model

Claude’s explanation of the Dependency Array was brilliant:

  • Common Misconception: It is a tool for “optimizing performance.”
  • Claude’s Definition: It is not an “optimization” tool, but a “Correctness” tool.

This reframing helps learners build the correct mental model, which is the core of a high-quality knowledge graph.
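
To make the "correctness, not optimization" point concrete, here is a minimal sketch in TypeScript. It does not use React: `createEffectRunner` and `simulate` are hypothetical names, and the runner is a toy that mimics only one behavior, re-running an effect when its dependency list changes. It shows how a callback registered by an effect keeps reading the value captured on the first render when `count` is left out of the dependency list.

```typescript
type Effect = () => void;

// Toy stand-in for a dependency-array check: runs the effect on the first
// call, then only when some dependency differs (Object.is) from last time.
// This is an illustration, not how React is implemented.
function createEffectRunner(): (effect: Effect, deps: readonly unknown[]) => void {
  let prevDeps: readonly unknown[] | undefined;
  return (effect, deps) => {
    const prev = prevDeps;
    const changed =
      prev === undefined ||
      deps.length !== prev.length ||
      deps.some((dep, i) => !Object.is(dep, prev[i]));
    if (changed) {
      prevDeps = deps;
      effect();
    }
  };
}

// Simulates three "renders" with count = 0, 1, 2. Each render creates a
// fresh closure over its own `count`; the effect registers a callback
// (think of a timer tick) that reads it later.
function simulate(listCountAsDependency: boolean): number {
  const runEffect = createEffectRunner();
  let readCount: () => number = () => -1;
  for (const count of [0, 1, 2]) {
    runEffect(() => {
      readCount = () => count;
    }, listCountAsDependency ? [count] : []);
  }
  return readCount();
}

const withDependency = simulate(true); // 2: the effect re-ran on every change
const withoutDependency = simulate(false); // 0: the callback is stuck on the first render
```

Omitting the dependency did not make anything faster in a useful sense; it made the callback wrong, which is the distinction the card draws.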

3. Perfect Instructional Structure

Claude’s output came with an inherent logical hierarchy: Basic Concepts → Anti-patterns → Solutions → Advanced Analogies (Synchronizer). This can be almost directly converted into a hierarchy in Logseq without manual reorganization.
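
As a rough illustration of that conversion, the sketch below groups knowledge points by `type` and emits an indented outline. It assumes Logseq's Markdown outline format (each block starts with `- `, nesting is expressed by indentation) and the JSON fields shown in the appendices. `toLogseqOutline`, the section order, and the "Why it matters" / "Context" labels are my own choices for the example, not something the models produced.

```typescript
type KpType = "Concept" | "Procedure" | "Anti-Pattern" | "Analogy" | "Example" | "Gap";

interface KnowledgePoint {
  id: string;
  type: KpType;
  topic: string;
  core_statement: string;
  implication: string;
  context_scenario: string;
  tags: string[];
}

// Assumed section order, loosely following the hierarchy described above.
const SECTION_ORDER: KpType[] = ["Concept", "Anti-Pattern", "Procedure", "Example", "Analogy", "Gap"];

function toLogseqOutline(kps: KnowledgePoint[]): string {
  const lines: string[] = [];
  for (const type of SECTION_ORDER) {
    const group = kps.filter((kp) => kp.type === type);
    if (group.length === 0) continue; // skip empty sections
    lines.push(`- ${type}`);
    for (const kp of group) {
      lines.push(`  - ${kp.topic}`);
      lines.push(`    - ${kp.core_statement}`);
      if (kp.implication) lines.push(`    - Why it matters: ${kp.implication}`);
      if (kp.context_scenario) lines.push(`    - Context: ${kp.context_scenario}`);
      if (kp.tags.length > 0) lines.push(`    - ${kp.tags.map((tag) => `#${tag}`).join(" ")}`);
    }
  }
  return lines.join("\n");
}

// One card taken from Appendix 1 (kp_14).
const sampleOutline = toLogseqOutline([
  {
    id: "kp_14",
    type: "Concept",
    topic: "Core function of Dependency Array",
    core_statement:
      "The dependency array tells React when to re-run the effect to ensure values inside the effect are synchronized with the latest state.",
    implication: "The dependency array is not an 'optimization' tool, but a 'correctness' tool.",
    context_scenario: "When understanding the basic mechanism of useEffect",
    tags: ["React", "useEffect", "dependency-array", "fundamentals"],
  },
]);
```

The resulting string can be pasted into a Logseq page; because every field maps to its own block, no manual regrouping is needed.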

Comprehensive Review: Positioning and Strengths

Although Claude took the crown, other models have irreplaceable advantages in specific scenarios. Here is a detailed horizontal comparison:

| Model | Positioning | Score | Characteristics | Best Use Case |
|---|---|---|---|---|
| Claude | The Architect | 9.5 | Deep understanding, rigorous structure, strong instructional design | Deep learning, system architecture, building knowledge skeletons |
| ChatGPT | The Veteran | 9.0 | Rich scenarios, broad coverage, high detail | Solving specific bugs, code review, supplementary examples |
| Qwen | The Innovator | 8.5 | Unique “Gap” analysis, novel perspectives | Finding technical blind spots, inspiring new ideas |
| Doubao | The Pragmatist | 8.0 | Grounded, natural phrasing (in Chinese) | Quick start, basic team documentation |
| Grok | The Minimalist | 8.0 | Extremely concise, straight to the point | Quick memos, Cheatsheets |
| Gemini | The Generalist | 7.0 | Core accuracy, standard performance | Concept shorthand |

  • ChatGPT (The Veteran): Although slightly behind Claude in conceptual depth, it provided the richest real-world scenarios and error cases, making it a great helper for solving specific engineering problems.
  • Qwen (The Innovator): A pleasant surprise was Qwen’s concept of the “Educational Gap,” pointing out the disconnect between developer intuition and the React model. This was a valuable and unique perspective.

[Figure 2: Claude RAG Benchmark]

Practical Advice: How to Build the Perfect RAG Knowledge Base

Based on this benchmark, for friends looking to build a high-quality knowledge base in Logseq, I recommend a “Hybrid Strategy”:

Step 1: Use Claude to Build the Skeleton

Leverage Claude’s extremely strong structuring capabilities to establish the core hierarchy of the knowledge graph.

useEffect Knowledge Graph/
├── Core Concepts (Claude)
│ └── Reactive Data Pipeline
├── Mental Models (Claude)
│ └── Synchronizer Analogy
...

Step 2: Use ChatGPT to Fill in the Flesh

Attach the practical scenarios (edge cases) and common errors (such as concrete stale closure examples) provided by ChatGPT to the corresponding nodes.

Step 3: Use Qwen to Find Blind Spots

Refer to Qwen’s “Gap” analysis to annotate “why this is error-prone” in your notes, increasing the depth of the knowledge.
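
The three steps can be sketched as a small merge routine. This is one possible approach under stated assumptions, not the procedure used in the benchmark: `buildHybrid` is a hypothetical helper, and case-insensitive tag overlap is a crude stand-in for whatever similarity measure you prefer (embeddings, manual review). Skeleton cards become nodes, other cards attach to the node they share the most tags with, and cards of type `Gap` are kept in a separate list so they can be rendered as "why this is error-prone" notes.

```typescript
interface Kp {
  id: string;
  type: string;
  topic: string;
  tags: string[];
}

interface SkeletonNode {
  kp: Kp;
  details: Kp[]; // practical scenarios and error cases (Step 2)
  gaps: Kp[]; // "why this is error-prone" annotations (Step 3)
}

// Counts tags of `b` that also appear on `a`, ignoring case.
function sharedTagCount(a: Kp, b: Kp): number {
  const tagsA = new Set(a.tags.map((tag) => tag.toLowerCase()));
  return b.tags.filter((tag) => tagsA.has(tag.toLowerCase())).length;
}

function buildHybrid(
  skeleton: Kp[],
  extras: Kp[],
  minShared = 2, // assumed threshold; tune for your own tag vocabulary
): { nodes: SkeletonNode[]; unattached: Kp[] } {
  const nodes: SkeletonNode[] = skeleton.map((kp) => ({ kp, details: [], gaps: [] }));
  const unattached: Kp[] = [];
  for (const extra of extras) {
    let best: SkeletonNode | undefined;
    let bestScore = 0;
    for (const node of nodes) {
      const score = sharedTagCount(node.kp, extra);
      if (score > bestScore) {
        best = node;
        bestScore = score;
      }
    }
    if (best !== undefined && bestScore >= minShared) {
      (extra.type === "Gap" ? best.gaps : best.details).push(extra);
    } else {
      unattached.push(extra); // left for manual placement
    }
  }
  return { nodes, unattached };
}

// Skeleton card from Appendix 1 (kp_02), detail card from Appendix 2 (kp_05).
const hybrid = buildHybrid(
  [{ id: "kp_02", type: "Concept", topic: "Stale Closure Issue", tags: ["React", "useEffect", "closure", "JavaScript", "bug"] }],
  [{ id: "kp_05", type: "Concept", topic: "Stale Closure Effect", tags: ["react", "bug", "closure", "state"] }],
);
// hybrid.nodes[0].details now holds the ChatGPT card.
```

Cards that end up in `unattached` are a useful signal in themselves: they mark material the skeleton has no place for yet.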

Conclusion

In the era of AI-assisted learning and knowledge management, the choice of model determines the quality of knowledge.

If you are pursuing “quick problem solving,” ChatGPT remains the king; but if you are pursuing “profound understanding of the essence” and wish to build a long-term reusable structured knowledge base, Claude is currently the undisputed Best Architect.


Appendix 1: Knowledge Points Generated by Claude Opus 4.5

[
  {
    "id": "kp_01",
    "type": "Concept",
    "topic": "The essence of useEffect is a Reactive Data Pipeline",
    "core_statement": "The essence of useEffect is not a lifecycle hook, but a 'Reactive Data Pipeline' used to synchronize external systems with React state.",
    "implication": "Developers transitioning from class components need to shift their mental model and should not apply old lifecycle thinking.",
    "context_scenario": "When designing useEffect logic",
    "tags": ["React", "useEffect", "mental-model", "reactive-programming"]
  },
  {
    "id": "kp_02",
    "type": "Concept",
    "topic": "Stale Closure Issue",
    "core_statement": "Stale closure refers to when variables inside an effect retain old values rather than the latest state due to closure characteristics.",
    "implication": "JavaScript closures 'capture' the variable value at that moment; if the effect does not re-run, the variable will not update.",
    "context_scenario": "When an effect uses state or props but they are not added to the dependency array",
    "tags": ["React", "useEffect", "closure", "JavaScript", "bug"]
  },
  {
    "id": "kp_03",
    "type": "Anti-Pattern",
    "topic": "Intentionally omitting dependencies to avoid re-execution",
    "core_statement": "Intentionally omitting dependencies in the array to prevent the effect from re-running causes stale closure issues.",
    "implication": "The correct solution is to use useCallback, useMemo, or refactor logic, not to hide dependencies.",
    "context_scenario": "When designing the dependency array for useEffect",
    "tags": ["React", "useEffect", "anti-pattern", "dependency-array"]
  },
  {
    "id": "kp_04",
    "type": "Anti-Pattern",
    "topic": "Stuffing too much logic into a single effect",
    "core_statement": "Putting all side-effect logic into a single useEffect makes code hard to maintain and violates the Single Responsibility Principle.",
    "implication": "Mixed effects make it hard to track which dependency triggered which part of the logic, and hard to test independently.",
    "context_scenario": "When a component has multiple different types of side effects",
    "tags": ["React", "useEffect", "anti-pattern", "single-responsibility"]
  },
  {
    "id": "kp_05",
    "type": "Procedure",
    "topic": "Splitting multiple effects by logical unit",
    "core_statement": "Side effects should be split into multiple independent useEffects based on 'logical units,' where each effect is responsible for only one thing.",
    "implication": "After splitting, each effect has an independent dependency array, making it easier to reason about and maintain.",
    "context_scenario": "When designing the side-effect architecture of a React component",
    "tags": ["React", "useEffect", "best-practice", "separation-of-concerns"]
  },
  {
    "id": "kp_06",
    "type": "Example",
    "topic": "Effect Splitting Example: Data Fetching vs. Event Subscription",
    "core_statement": "One effect specifically handles data fetching, and another specifically handles event subscription, managed separately.",
    "implication": "Data fetching usually depends on ID or query parameters, while event subscription usually depends on handler functions; their lifecycles differ.",
    "context_scenario": "In a component that needs to fetch an API and listen to WebSocket or DOM events simultaneously",
    "tags": [
      "React",
      "useEffect",
      "example",
      "data-fetching",
      "event-subscription"
    ]
  },
  {
    "id": "kp_07",
    "type": "Anti-Pattern",
    "topic": "Using useEffect to simulate componentDidMount",
    "core_statement": "You should not use useEffect to simulate the componentDidMount lifecycle method of class components.",
    "implication": "This thinking leads developers to misunderstand the purpose of useEffect, resulting in problematic side-effect logic design.",
    "context_scenario": "When migrating from class components to function components",
    "tags": ["React", "useEffect", "anti-pattern", "lifecycle", "migration"]
  },
  {
    "id": "kp_08",
    "type": "Procedure",
    "topic": "Using empty dependency array for initialization",
    "core_statement": "If you only want logic to run once at component initialization, use an empty dependency array [], but ensure the effect contains no changing values.",
    "implication": "If the effect uses changing values but is given an empty array, it will produce a stale closure.",
    "context_scenario": "When one-time initialization is needed (e.g., analytics tracking, third-party SDK init)",
    "tags": ["React", "useEffect", "dependency-array", "initialization"]
  },
  {
    "id": "kp_09",
    "type": "Concept",
    "topic": "Side effects should be as pure as possible",
    "core_statement": "Side effects in useEffect should be as 'pure' as possible: predictable, cleanable, and without unexpected external influence.",
    "implication": "Pure side effects are easier to test and debug, and issues are more easily detected by React's Strict Mode.",
    "context_scenario": "When writing any useEffect logic",
    "tags": ["React", "useEffect", "purity", "best-practice"]
  },
  {
    "id": "kp_10",
    "type": "Anti-Pattern",
    "topic": "Fetch API without handling race conditions",
    "core_statement": "Using fetch API in useEffect without handling race conditions can cause old requests to overwrite new ones or memory leaks when users switch pages quickly.",
    "implication": "When the async result returns, the component may have already unmounted or dependencies may have changed.",
    "context_scenario": "When making API requests inside useEffect",
    "tags": ["React", "useEffect", "fetch", "race-condition", "async"]
  },
  {
    "id": "kp_11",
    "type": "Procedure",
    "topic": "Using AbortController to solve race conditions",
    "core_statement": "Cancel ongoing fetch requests via AbortController in the useEffect cleanup function to solve race conditions and memory leaks.",
    "implication": "AbortController is part of the Web API standard and can be used with the fetch signal option.",
    "context_scenario": "Handling fetch request cancellation in useEffect cleanup",
    "tags": ["React", "useEffect", "AbortController", "cleanup", "fetch"]
  },
  {
    "id": "kp_12",
    "type": "Anti-Pattern",
    "topic": "Heavy use of isMounted boolean flags in effects",
    "core_statement": "If you find yourself heavily using boolean switches (like isMounted) to control logic inside an effect, it indicates a fundamental problem with data flow design.",
    "implication": "Boolean switches are a patch, not a cure; the real issue lies in side-effect logic not correctly corresponding to data flow.",
    "context_scenario": "When developers try to 'fix' useEffect execution issues using flag variables",
    "tags": ["React", "useEffect", "anti-pattern", "code-smell", "data-flow"]
  },
  {
    "id": "kp_13",
    "type": "Procedure",
    "topic": "Rethink data flow instead of patching",
    "core_statement": "When useEffect control logic becomes complex, step back and rethink the overall data flow design instead of continuing to patch with workarounds.",
    "implication": "Good React design makes data flow clear and side effects simple; complex effects are usually symptoms of design problems.",
    "context_scenario": "When useEffect becomes hard to understand or maintain",
    "tags": ["React", "useEffect", "refactoring", "data-flow", "architecture"]
  },
  {
    "id": "kp_14",
    "type": "Concept",
    "topic": "Core function of Dependency Array",
    "core_statement": "The dependency array tells React when to re-run the effect to ensure values inside the effect are synchronized with the latest state.",
    "implication": "The dependency array is not an 'optimization' tool, but a 'correctness' tool.",
    "context_scenario": "When understanding the basic mechanism of useEffect",
    "tags": ["React", "useEffect", "dependency-array", "fundamentals"]
  },
  {
    "id": "kp_15",
    "type": "Analogy",
    "topic": "Effect as a Synchronizer for external systems",
    "core_statement": "useEffect is like a 'Synchronizer', responsible for keeping React state consistent with external systems (DOM, API, subscriptions).",
    "implication": "This analogy helps understand why effects need cleanup: a synchronizer needs to disconnect the old sync before establishing a new one when state changes.",
    "context_scenario": "When explaining useEffect design philosophy to others",
    "tags": ["React", "useEffect", "analogy", "synchronization", "mental-model"]
  }
]

Appendix 2: Knowledge Points Generated by ChatGPT 5.1

[
  {
    "id": "kp_01",
    "type": "Concept",
    "topic": "useEffect power and risk",
    "core_statement": "React's useEffect Hook is a very powerful feature, but it is also very prone to misuse.",
    "implication": "Powerful abstractions usually come with high costs for misuse, requiring more rigorous usage patterns.",
    "context_scenario": "When handling side-effect logic in React functional components",
    "tags": ["react", "useEffect", "side-effects", "frontend"]
  },
  {
    "id": "kp_02",
    "type": "Concept",
    "topic": "Importance of Dependency Array",
    "core_statement": "The dependency array in useEffect is used to tell React: re-run the effect when these values change.",
    "implication": "Correctly filling the dependency array is key to avoiding errors and stale closures.",
    "context_scenario": "When writing useEffect and deciding its dependencies",
    "tags": ["react", "useEffect", "dependency-array"]
  },
  {
    "id": "kp_03",
    "type": "Anti-Pattern",
    "topic": "Intentionally omitting dependencies",
    "core_statement": "Many developers intentionally omit actually used dependencies in the dependency array out of fear that useEffect will re-run repeatedly.",
    "implication": "Omitting dependencies to avoid re-execution creates hidden bugs rather than solving performance issues.",
    "context_scenario": "Incorrectly removing dependencies while optimizing useEffect execution frequency",
    "tags": ["react", "useEffect", "anti-pattern", "dependency-array"]
  },
  {
    "id": "kp_04",
    "type": "Concept",
    "topic": "Stale Closure Problem",
    "core_statement": "When useEffect's dependency array lacks necessary dependencies, the closure inside the effect continues to capture old values, leading to stale closure issues.",
    "implication": "Stale closures make state or props appear unable to update, leading to hard-to-trace logic errors.",
    "context_scenario": "When using state or props inside an effect without correctly listing them in the dependency array",
    "tags": ["react", "useEffect", "closure", "bug"]
  },
  {
    "id": "kp_05",
    "type": "Concept",
    "topic": "Stale Closure Effect",
    "core_statement": "Stale closure causes variables used in useEffect to stay at old values forever, not updating with the latest state or props.",
    "implication": "The screen looks correct, but the logic uses old values, causing subtle behavioral anomalies.",
    "context_scenario": "When performing calculations, API calls, or registering handlers in an effect that depends on state values",
    "tags": ["react", "bug", "closure", "state"]
  },
  {
    "id": "kp_06",
    "type": "Procedure",
    "topic": "Splitting effects by logical unit",
    "core_statement": "The correct way to use useEffect is to split it into multiple effects based on 'logical units,' rather than stuffing all logic into one.",
    "implication": "Splitting effects improves readability, testability, and prevents dependencies from tangling.",
    "context_scenario": "When a component has multiple different side-effect logics to handle",
    "tags": ["react", "useEffect", "design", "refactoring"]
  },
  {
    "id": "kp_07",
    "type": "Example",
    "topic": "Effect Splitting Example",
    "core_statement": "For example, use one useEffect specifically for data fetching, and another specifically for event subscription.",
    "implication": "Separating different concerns into different effects describes data flow and lifecycle more clearly.",
    "context_scenario": "When fetching remote data and subscribing to window or DOM events in the same component",
    "tags": ["react", "example", "data-fetching", "events"]
  },
  {
    "id": "kp_08",
    "type": "Concept",
    "topic": "The Essence of useEffect",
    "core_statement": "The essence of useEffect is not traditional lifecycle, but a 'Reactive Data Pipeline'.",
    "implication": "Developers should think about useEffect from the perspective of data dependency and reactive updates, not transplanting thoughts from class lifecycles.",
    "context_scenario": "When designing side-effect logic during the transition from class to function components",
    "tags": ["react", "useEffect", "reactive", "mental-model"]
  },
  {
    "id": "kp_09",
    "type": "Anti-Pattern",
    "topic": "Simulating componentDidMount with useEffect",
    "core_statement": "One should not use useEffect to simulate class lifecycle methods like componentDidMount.",
    "implication": "Forcing simulation of old lifecycles hinders the adoption of the correct mental model for React Hooks and leads to incorrect dependency management.",
    "context_scenario": "When migrating logic from class components, attempting a 1-to-1 mapping to useEffect syntax",
    "tags": ["react", "anti-pattern", "lifecycle", "hooks-migration"]
  },
  {
    "id": "kp_10",
    "type": "Concept",
    "topic": "Semantics of empty dependency array",
    "core_statement": "If you only want logic to run once at initialization, use an empty dependency array, indicating the effect runs only once after the first render.",
    "implication": "An empty dependency array is a strong guarantee that the effect will not re-run due to any value changes.",
    "context_scenario": "Setup, data preloading, or initialization logic required only when component mounts",
    "tags": ["react", "useEffect", "dependency-array", "initialization"]
  },
  {
    "id": "kp_11",
    "type": "Anti-Pattern",
    "topic": "Initialization effect containing changing values",
    "core_statement": "When using an empty dependency array, the effect should not contain values that will change over time, otherwise logic and actual execution count will mismatch.",
    "implication": "Using changing values in an empty dependency array effectively turns the stale closure problem into a design feature.",
    "context_scenario": "Using props or state that will update later in an initialization effect without re-running the effect",
    "tags": ["react", "anti-pattern", "closure", "dependency-array"]
  },
  {
    "id": "kp_12",
    "type": "Concept",
    "topic": "Side effects should be as pure as possible",
    "core_statement": "Side-effect logic in useEffect should be as pure as possible, avoiding unnecessary external dependencies and hidden states.",
    "implication": "Pure side effects are easier to reason about, refactor, and test, and less prone to race conditions or memory leaks.",
    "context_scenario": "When designing side-effect logic for data fetching, event subscription, DOM manipulation, etc.",
    "tags": ["react", "side-effects", "purity", "design"]
  },
  {
    "id": "kp_13",
    "type": "Concept",
    "topic": "Fetch Race Condition Risk",
    "core_statement": "When using fetch API, race conditions must be handled to prevent wrong updates or memory leaks when users switch pages or states quickly.",
    "implication": "Unhandled race conditions may cause unmounted components to still attempt setState or process stale responses.",
    "context_scenario": "When performing async data fetching in useEffect and updating state based on the result",
    "tags": ["react", "fetch", "race-condition", "memory-leak"]
  },
  {
    "id": "kp_14",
    "type": "Procedure",
    "topic": "Using AbortController for fetch",
    "core_statement": "A common practice to handle fetch API race conditions and avoid memory leaks is to use AbortController in useEffect to abort requests that are no longer needed.",
    "implication": "Aborting stale requests avoids unnecessary network costs and incorrect UI updates.",
    "context_scenario": "When using useEffect to send fetch requests and the component might unmount or conditions might change before completion",
    "tags": ["react", "fetch", "AbortController", "cleanup"]
  },
  {
    "id": "kp_15",
    "type": "Anti-Pattern",
    "topic": "Heavy use of boolean flags in effect",
    "core_statement": "If boolean switches (e.g., isMounted) are heavily used inside useEffect to control flow, it indicates a problem with the program logic design.",
    "implication": "Excessive boolean flags are usually a code smell of poor data flow design, not a problem that should be covered by more patches.",
    "context_scenario": "Using multiple boolean states in an effect to avoid duplicate calls or judge if initialized",
    "tags": ["react", "anti-pattern", "code-smell", "state-management"]
  },
  {
    "id": "kp_16",
    "type": "Procedure",
    "topic": "Action after detecting boolean code smells",
    "core_statement": "When finding yourself relying on many boolean switches in useEffect, you should rethink and refactor the data flow instead of adding more conditional checks to patch it.",
    "implication": "Redesigning data flow can usually fundamentally eliminate race conditions, duplicate execution, and initialization judgment issues.",
    "context_scenario": "When maintaining a piece of useEffect internal logic that becomes complex and full of isXXX flags",
    "tags": ["react", "refactoring", "architecture", "state-management"]
  }
]

Appendix 3: System Prompt Used for Testing

You are an expert Knowledge Engineer and card-designer for spaced repetition.

Your job:
Given an article or note, decompose it into **atomic Knowledge Points (KPs)** and output them as a **JSON array**.  
Do NOT summarize the text. Instead, extract individual, reusable knowledge units.

---

## CORE RULES

1. **Atomicity**

   - One KP = one main idea.
   - Do NOT mix multiple ideas into one KP.

2. **Self-contained**

   - Each KP must be understandable without seeing the original text.
   - Avoid pronouns like "this", "it", "they". Replace them with explicit nouns.

3. **Full spectrum of knowledge**

   - Do NOT only extract definitions and facts.
   - Always look for procedures, pitfalls, examples, analogies, and open questions.

4. **Max count**
   - Extract up to 30 KPs, focusing on the most useful ones for learning and thinking.

---

## KNOWLEDGE TYPE CLASSIFICATION

For each KP, choose exactly ONE `type` from:

1. **Concept** – Definitions, facts, properties. (“What is it?”)
2. **Procedure** – Step-by-step methods, algorithms, workflows. (“How to do it?”)
3. **Anti-Pattern** – Common mistakes, traps, things to avoid. (“What goes wrong?”)
4. **Analogy** – Metaphors, comparisons to other domains. (“X is like Y because…”)
5. **Example** – Concrete cases, scenarios, code examples, historical events. (“For example…”)
6. **Gap** – Limitations, open problems, future work, unknowns. (“We still don’t know…”)

---

## INFERENCE RULES (IMPLICIT KNOWLEDGE)

Besides what is explicitly written, infer short implications when they are **obvious to a domain expert**:

- If the text describes a **solution**, the hidden KP may be the **problem** it solves.
- If the text describes a **benefit**, the hidden KP may be the **trade-off or cost**.
- If a specific tool/term is used, infer its broader **category or role** when helpful.
- If nothing meaningful can be inferred, use an empty string "" for `implication`.

---

## OUTPUT FORMAT (JSON ONLY)

Return ONLY valid JSON. No explanations, no comments.

Each KP must have this structure:

[
{
"id": "kp_01",
"type": "Concept | Procedure | Anti-Pattern | Analogy | Example | Gap",
"topic": "Short 2-6 word title of the idea",
"core_statement": "One atomic statement capturing the essence of this KP.",
"implication": "Very short hidden context, trade-off, or 'why this matters'. Empty string if none.",
"context_scenario": "Where or when this KP applies (e.g., 'In React useEffect hooks'). Empty string if not needed.",
"tags": ["tag1", "tag2", "tag3"]
}
]

- Use English for `type` and `tags`.
- `core_statement` and `context_scenario` can be in the same language as the input text.
- If some fields are not applicable, set them to an empty string "" (except `id`, `type`, `topic`, `core_statement` which are required).
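
(The prompt ends above; what follows is not part of it.) Because the prompt demands JSON only, a model's reply can be checked mechanically before it is imported into a knowledge base. The sketch below is a minimal validator written for illustration: `validateKnowledgePoints` is a hypothetical helper name, and its checks mirror only the rules the prompt states explicitly (top-level array, at most 30 items, four required non-empty fields, one of six `type` values, string `tags`).

```typescript
const ALLOWED_TYPES = new Set<string>([
  "Concept", "Procedure", "Anti-Pattern", "Analogy", "Example", "Gap",
]);
const REQUIRED_FIELDS = ["id", "type", "topic", "core_statement"];
const OPTIONAL_TEXT_FIELDS = ["implication", "context_scenario"];

// Returns a list of human-readable problems; an empty list means the reply
// satisfies the structural rules checked here.
function validateKnowledgePoints(raw: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return ["output is not valid JSON"];
  }
  if (!Array.isArray(parsed)) return ["top-level value must be a JSON array"];

  const errors: string[] = [];
  if (parsed.length > 30) errors.push(`expected at most 30 KPs, got ${parsed.length}`);

  parsed.forEach((item: unknown, index: number) => {
    if (typeof item !== "object" || item === null || Array.isArray(item)) {
      errors.push(`item ${index}: must be an object`);
      return;
    }
    const kp = item as Record<string, unknown>;
    for (const field of REQUIRED_FIELDS) {
      const value = kp[field];
      if (typeof value !== "string" || value === "") {
        errors.push(`item ${index}: "${field}" must be a non-empty string`);
      }
    }
    for (const field of OPTIONAL_TEXT_FIELDS) {
      if (typeof kp[field] !== "string") {
        errors.push(`item ${index}: "${field}" must be a string (may be empty)`);
      }
    }
    const type = kp["type"];
    if (typeof type === "string" && type !== "" && !ALLOWED_TYPES.has(type)) {
      errors.push(`item ${index}: unknown type "${type}"`);
    }
    const tags = kp["tags"];
    if (!Array.isArray(tags) || !tags.every((tag) => typeof tag === "string")) {
      errors.push(`item ${index}: "tags" must be an array of strings`);
    }
  });
  return errors;
}
```

A reply that passes returns an empty list; anything else can be logged or sent back to the model for a retry. Softer rules in the prompt, such as atomicity or the 2-6 word title, still need human or model review.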

Appendix 4: Sample Article Used for Testing

React's useEffect Hook is one of its most powerful features, but it is also frequently misused. The most common issues arise with the dependency array. Many developers, fearing that the effect will re-run repeatedly, intentionally omit dependencies, leading to stale closure problems. This causes variables inside the effect to forever use old values.

Another common mistake is stuffing all logic into the same effect. The correct approach is to split multiple effects based on "logical units." For example: one effect handles data fetching, another handles event subscriptions.

The essence of useEffect is actually not a lifecycle hook, but a "reactive data pipeline." So do not use it to simulate componentDidMount. If you only want logic to execute once at initialization, you should use an empty dependency array, but ensure there are no gradually changing values inside the effect.

Furthermore, side effects should be as pure as possible. For example: fetch APIs need to handle race conditions to avoid memory leaks when users quickly switch pages. This is usually solved via AbortController.

Finally, if you find yourself heavily using boolean switches (e.g., isMounted) inside an effect, it means your program logic design is flawed. You should rethink the data flow instead of continuing to patch it.