Enhancing RAG with Knowledge Graphs

Jul 18, 2024

Introduction

  • Retrieval Augmented Generation (RAG) is a common pattern in LLM applications.
  • Core idea: let the LLM draw on a private corpus for domain-specific knowledge.
  • Shortcoming of RAG: it fails on global questions about the corpus as a whole.

Problems with RAG

  • RAG retrieves document snippets relevant to each query.
  • Answer quality is limited to the text within those snippets.
  • It struggles with queries about themes or concepts spanning entire documents.
  • Example problem: a query about the major themes across a document corpus, which no handful of snippets can answer.

Introduction of Knowledge Graphs

  • Aim: provide a deeper understanding of the concepts in the corpus and the relations among them.
  • Sense-making: understanding the core connections among entities (people, places, events, concepts).
  • Extracting entities and their relationships enables better answers to global queries.

Process Overview

  1. Offline Steps (Indexing Time)

    • Chunk documents.
    • Extract element instances (entities and relationships).
    • Summarize entities and relationships.
    • Cluster similar entities into communities.
    • Summarize these communities.
  2. Query Time (Lookup Time)

    • Use community summaries for initial query processing.
    • Generate intermediate answers from community summaries.
    • Rank and score intermediate answers with an LLM.
    • Concatenate top-ranked answers for final global answer.
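The two phases above can be sketched as a small pipeline. This is a minimal illustration, not the paper's implementation: `chunk_document` splits on words rather than tokens, the function names are my own, and the LLM call is left as a stub.

```python
def llm(prompt: str) -> str:
    """Stub for an LLM call; swap in a real chat-completion client."""
    raise NotImplementedError


def chunk_document(text: str, chunk_size: int = 600, overlap: int = 100) -> list[str]:
    """Split a document into overlapping chunks (words stand in for tokens)."""
    words = text.split()
    chunks, step = [], chunk_size - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks


def build_index(documents: list[str]) -> list[str]:
    """Offline phase: chunk, extract elements, summarize (community steps omitted)."""
    chunks = [c for doc in documents for c in chunk_document(doc)]
    elements = [llm("Extract entities and relationships:\n" + c) for c in chunks]
    return [llm("Summarize these entities and relationships:\n" + e) for e in elements]
```

Overlapping chunks reduce the chance that an entity mention is cut in half at a chunk boundary, at the cost of some duplicated extraction work.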

Detailed Steps

Offline Steps

  1. Document Chunking

    • Experiments with various chunk sizes.
  2. Extracting Concepts and Relationships

    • The most computationally expensive step: entities and relationships are extracted with few-shot LLM prompting.
  3. Summarizing Entities and Relationships

    • Summarize extracted concepts and their connections through LLM prompts.
  4. Community Clustering

    • Nodes (entities) are clustered based on strong relationship edges.
    • The result is a mutually exclusive, collectively exhaustive hierarchy of communities.
    • Summarize these clusters (community summaries).
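The paper uses hierarchical Leiden clustering for this step; as a simple stand-in, the sketch below (assumed names, not the paper's code) groups entities by keeping only strong edges and taking connected components, which illustrates the idea that strongly related entities end up in the same community.

```python
from collections import defaultdict


def communities(edges: dict[tuple[str, str], float], min_weight: float = 2.0) -> list[set[str]]:
    """Group entities into communities: connected components over strong edges."""
    adj = defaultdict(set)
    nodes: set[str] = set()
    for (a, b), w in edges.items():
        nodes.update((a, b))
        if w >= min_weight:  # keep only strong relationship edges
            adj[a].add(b)
            adj[b].add(a)
    seen, result = set(), []
    for node in nodes:
        if node in seen:
            continue
        stack, comp = [node], set()
        while stack:  # depth-first search for one component
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        result.append(comp)
    return result
```

Unlike this flat sketch, Leiden produces clusters at multiple levels of granularity, which is what makes community summaries available at several zoom levels.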

Query Answering

  1. Using Community Summaries
    • Chunk and shuffle community summaries.
    • Generate and rank intermediate answers for each chunk.
    • Top-ranked answers are used for final comprehensive answer.
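This map-reduce step can be sketched as follows. The map side (an LLM producing an intermediate answer and a helpfulness score per summary chunk) is assumed to have already run; the reduce helper below, with an illustrative name and a word-count budget standing in for a token budget, keeps the top-scored answers until the context window is full.

```python
def reduce_answers(scored: list[tuple[str, int]], budget_words: int = 50) -> str:
    """Concatenate intermediate answers in descending score order within a budget."""
    kept, used = [], 0
    for answer, score in sorted(scored, key=lambda p: p[1], reverse=True):
        if score == 0:  # zero-scored (unhelpful) answers are filtered out
            continue
        n = len(answer.split())
        if used + n > budget_words:
            break
        kept.append(answer)
        used += n
    return " ".join(kept)
```

The concatenated text is then handed to the LLM one final time to be synthesized into the global answer.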

Evaluation of Graph RAG

  • Datasets Used: Podcast transcripts, news articles.
  • Evaluation questions requiring global understanding of the corpus were generated with an LLM.

Metrics for Answer Evaluation

  • Comprehensiveness: Broadly covering the topic/query.
  • Diversity: Varied and covering different ideas.
  • Empowerment: Helps the reader understand the topic.
  • Directness: Specificity and relevance to the question.
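The paper evaluates answers head-to-head with an LLM judge on these metrics. The helper below is hypothetical (the exact prompt wording and metric phrasings are my paraphrases of the list above); it only constructs the pairwise-comparison prompt, and the judge call itself is omitted.

```python
# Paraphrased metric definitions; not the paper's exact wording.
METRICS = {
    "comprehensiveness": "How broadly does the answer cover all aspects of the question?",
    "diversity": "How varied is the answer in the different ideas it covers?",
    "empowerment": "How well does the answer help the reader understand the topic?",
    "directness": "How specifically and relevantly does the answer address the question?",
}


def judge_prompt(question: str, answer_a: str, answer_b: str, metric: str) -> str:
    """Build a pairwise LLM-as-judge prompt for one evaluation metric."""
    return (
        f"Metric: {metric} -- {METRICS[metric]}\n"
        f"Question: {question}\n"
        f"Answer A: {answer_a}\n"
        f"Answer B: {answer_b}\n"
        "Which answer is better on this metric? Reply A, B, or tie."
    )
```

Pairwise comparison sidesteps the difficulty of getting an LLM to assign stable absolute scores: it only has to pick a winner per metric.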

Results and Comparison

  • Graph RAG vs. Naive RAG: Better performance in comprehensiveness and diversity.
  • Example Comparison: Public figures in entertainment articles.
    • Graph RAG showed deeper understanding by categorizing figures rather than listing frequently mentioned names.

Conclusion

  • Graph RAG enhances traditional RAG with deeper understanding via entity graphs.
  • Produces more comprehensive, direct answers to complex, corpus-wide queries.
  • The trend points toward integrating knowledge graphs into RAG techniques.

Summary

  • Enhanced RAG with knowledge graphs solves major traditional RAG issues.
  • Knowledge graphs used to understand and summarize document-wide concepts.
  • Promising results in producing higher quality, globally aware answers.