Lecture Notes: Building Knowledge Graphs with LLMs
Introduction
- Speaker: Tomaz Bratanic
- Topic: Building Knowledge Graphs with LLMs
- Related work: the speaker is co-writing a book on the topic with Oscar
Overview
- Discussion on building knowledge graphs using LLMs
- Addressing limitations of text embedding approaches
Limitations of Text Embedding Approach
- Common pipeline: PDFs → chunking → indexing with embedding models
- Works well for documentation-style content, but not for every domain
- Problems:
  - Legal documents prompt specific questions (e.g., total contract value) that a handful of similar chunks cannot answer
  - Naive vector similarity search returns only the top-k most similar chunks, not a complete result set
  - Metadata filtering can help but has limits
  - Some queries need aggregation, counting, and filtering
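To make the aggregation problem concrete, here is a minimal sketch of the kind of query that needs structured records rather than top-k chunk retrieval. The `Contract` fields, the sample values, and the helper names are all hypothetical and invented for illustration; they are not from the lecture.

```python
from dataclasses import dataclass


@dataclass
class Contract:
    # Hypothetical structured record extracted from one contract document.
    party: str
    status: str
    value_usd: float


# Toy data, invented purely for illustration.
contracts = [
    Contract("Acme", "active", 120_000.0),
    Contract("Acme", "expired", 45_000.0),
    Contract("Globex", "active", 80_000.0),
    Contract("Initech", "active", 30_000.0),
]


def total_value(records, status=None, party=None):
    """Filter on structured fields, then sum over every matching record."""
    return sum(
        r.value_usd
        for r in records
        if (status is None or r.status == status)
        and (party is None or r.party == party)
    )


def count_by_party(records):
    """Count contracts per party across the whole collection."""
    counts = {}
    for r in records:
        counts[r.party] = counts.get(r.party, 0) + 1
    return counts


active_total = total_value(contracts, status="active")
acme_total = total_value(contracts, party="Acme")
party_counts = count_by_party(contracts)
```

The point of the sketch is that the answer depends on every matching record, whereas a similarity search only ever sees the few chunks it retrieved.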
Structured Data and Knowledge Graphs
- Introduction to structured data in knowledge graphs
- Knowledge graphs integrate structured and unstructured data
Building Knowledge Graphs with LLMs
- Importance of representing structured and unstructured data
- Knowledge graphs allow for combining information from multiple documents
- Example: multi-hop questions become simpler to answer with a knowledge graph
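A multi-hop question can be sketched as a chain of relationship lookups over triples that may originate from different documents. The entities, relationship names, and helper functions below are assumptions made up for this example; a real system would typically run such a query in a graph database.

```python
from collections import defaultdict

# Toy (subject, relationship, object) triples; each could come from a different document.
triples = [
    ("Alice", "BOARD_MEMBER_OF", "Acme"),   # e.g. from document A
    ("Acme", "HAS_SUPPLIER", "Globex"),     # e.g. from document B
    ("Acme", "HAS_SUPPLIER", "Initech"),    # e.g. from document C
    ("Bob", "BOARD_MEMBER_OF", "Globex"),
]


def build_index(triples):
    """Index objects by (subject, relationship) for fast one-hop lookups."""
    index = defaultdict(list)
    for subject, relationship, obj in triples:
        index[(subject, relationship)].append(obj)
    return index


def follow(index, start_nodes, relationship):
    """Take one hop along `relationship` from every node in `start_nodes`."""
    result = []
    for node in start_nodes:
        for target in index.get((node, relationship), []):
            if target not in result:
                result.append(target)
    return result


def multihop(index, start, path):
    """Follow a sequence of relationships, one hop per entry in `path`."""
    frontier = [start]
    for relationship in path:
        frontier = follow(index, frontier, relationship)
    return frontier


index = build_index(triples)
# "Which suppliers serve the companies where Alice is a board member?"
suppliers = multihop(index, "Alice", ["BOARD_MEMBER_OF", "HAS_SUPPLIER"])
```

Because the facts are already linked, the two-hop question reduces to two lookups instead of hoping that one retrieved chunk mentions both facts.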
Information Extraction with LLMs
- Information extraction traditionally complex, now simplified with LLMs
- Different approaches:
  - Generic: no predefined schema, inconsistent results
  - Middle ground: define node labels and relationship types, better consistency
  - Domain-specific: detailed schema definition, best results
- Tools: LangChain, OpenAI JSON outputs, etc.
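The "middle ground" approach can be sketched as a prompt that lists the allowed node labels and relationship types, followed by a validation step that discards anything outside that schema. This is only an illustration: the schema, the prompt wording, the JSON shape, and the function names are assumptions, and `sample_response` is a hand-written stand-in for a model reply, not real model output. No LLM library is called here.

```python
import json

# Hypothetical schema for the "middle ground" approach.
ALLOWED_NODE_LABELS = {"Person", "Organization", "Contract"}
ALLOWED_RELATIONSHIPS = {"SIGNED", "PARTY_TO", "WORKS_FOR"}


def build_prompt(text):
    """Build an extraction prompt that states the schema explicitly."""
    return (
        "Extract a knowledge graph from the text below.\n"
        f"Allowed node labels: {sorted(ALLOWED_NODE_LABELS)}\n"
        f"Allowed relationship types: {sorted(ALLOWED_RELATIONSHIPS)}\n"
        'Return JSON with keys "nodes" and "relationships".\n\n'
        f"Text: {text}"
    )


def validate_extraction(raw_json):
    """Keep only nodes and relationships that conform to the schema."""
    data = json.loads(raw_json)
    nodes = [
        n for n in data.get("nodes", [])
        if "id" in n and n.get("label") in ALLOWED_NODE_LABELS
    ]
    node_ids = {n["id"] for n in nodes}
    relationships = [
        r for r in data.get("relationships", [])
        if r.get("type") in ALLOWED_RELATIONSHIPS
        and r.get("source") in node_ids
        and r.get("target") in node_ids
    ]
    return {"nodes": nodes, "relationships": relationships}


# Hand-written stand-in for a model response (assumed JSON shape).
sample_response = json.dumps({
    "nodes": [
        {"id": "Alice", "label": "Person"},
        {"id": "Acme", "label": "Organization"},
        {"id": "Paris", "label": "City"},  # label not in the schema
    ],
    "relationships": [
        {"source": "Alice", "type": "WORKS_FOR", "target": "Acme"},
        {"source": "Alice", "type": "LIVES_IN", "target": "Paris"},  # type not in the schema
    ],
})

prompt = build_prompt("Alice works for Acme and lives in Paris.")
graph = validate_extraction(sample_response)
```

Validating after the fact is one possible design choice; structured-output features in LLM tooling can enforce a schema at generation time instead.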
Practical Observations
- Post-processing: Entity resolution to merge duplicated nodes
- Extraction consistency affected by document ambiguity and chunk size
- Multiple passes for more thorough extraction
- Extraction methods differ, and results vary by model
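One way to combine several extraction passes is to take the union of the triples and record how many passes produced each one. The lecture only notes that multiple passes give more thorough extraction; counting agreement between passes is an added assumption here, and the triples are invented sample data.

```python
from collections import Counter


def merge_passes(passes):
    """Union triples from several passes and count how many passes found each."""
    support = Counter()
    for triples in passes:
        support.update(set(triples))  # set() so a pass counts each triple once
    return support


# Invented results of two extraction passes over the same chunk.
pass_1 = [("Alice", "WORKS_FOR", "Acme"), ("Acme", "PARTY_TO", "Contract-7")]
pass_2 = [("Alice", "WORKS_FOR", "Acme"), ("Alice", "SIGNED", "Contract-7")]

support = merge_passes([pass_1, pass_2])
all_triples = sorted(support)  # everything found in any pass
stable_triples = sorted(t for t, n in support.items() if n == 2)  # found in both passes
```

The union gives the more thorough result; the agreement count is an optional signal for which triples were extracted consistently.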
Challenges and Solutions
- Entity resolution and merging duplicated nodes:
  - Use text embeddings and additional logic to find candidate duplicates
  - An LLM can act as a judge of whether candidates should be merged
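The resolution step can be sketched as follows: embed node names, treat pairs above a similarity threshold as merge candidates, apply extra logic (here, matching labels), and let a judge make the final call. Everything in this block is an assumption for illustration: `embed` uses character trigram counts as a stand-in for a real text embedding model, the 0.6 threshold is arbitrary, and `judge` is a placeholder callable where an LLM decision could be plugged in.

```python
import math
from collections import Counter
from itertools import combinations


def embed(text, n=3):
    """Stand-in embedding: character trigram counts (not a real embedding model)."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def resolve(nodes, threshold=0.6, judge=lambda a, b: True):
    """Group nodes judged to be duplicates, using union-find to merge them."""
    parent = {n["id"]: n["id"] for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    vectors = {n["id"]: embed(n["id"]) for n in nodes}
    for a, b in combinations(nodes, 2):
        if a["label"] != b["label"]:
            continue  # additional logic: never merge across labels
        similar = cosine(vectors[a["id"]], vectors[b["id"]]) >= threshold
        if similar and judge(a, b):
            parent[find(b["id"])] = find(a["id"])

    groups = {}
    for n in nodes:
        groups.setdefault(find(n["id"]), []).append(n["id"])
    return list(groups.values())


nodes = [
    {"id": "Acme Corp", "label": "Organization"},
    {"id": "Acme Corporation", "label": "Organization"},
    {"id": "Globex", "label": "Organization"},
]
groups = resolve(nodes)
```

Passing a stricter `judge` (for example, one that asks an LLM whether two names refer to the same entity) only narrows the merges proposed by the similarity step.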
Q&A Session
- Updating knowledge graphs:
  - Adding new information is easy, but it requires entity resolution
- Graph disambiguation:
  - Use text embeddings and additional logic to identify and merge duplicates
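An incremental update might look like the sketch below, assuming the graph is held as a set of triples and that an earlier entity-resolution step has produced an alias table. Both the in-memory representation and the `aliases` mapping are hypothetical simplifications.

```python
def add_triples(graph, new_triples, aliases):
    """Add triples to `graph` (a set), mapping known aliases to canonical names first."""
    added = 0
    for subject, relationship, obj in new_triples:
        triple = (aliases.get(subject, subject), relationship, aliases.get(obj, obj))
        if triple not in graph:
            graph.add(triple)
            added += 1
    return added


graph = {("Acme Corp", "HAS_SUPPLIER", "Globex")}
aliases = {"Acme Corporation": "Acme Corp"}  # assumed output of entity resolution

new_triples = [
    ("Acme Corporation", "HAS_SUPPLIER", "Globex"),   # duplicate once canonicalized
    ("Acme Corporation", "HAS_SUPPLIER", "Initech"),  # genuinely new information
]
added = add_triples(graph, new_triples, aliases)
```

Without the alias lookup, both incoming triples would be stored under a second "Acme Corporation" node, which is the duplication that entity resolution is meant to prevent.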
Conclusion
- Resources: for practical examples and code, refer to Tomaz's blog
- Encouragement to explore presentations and learn more about knowledge graphs