Building Knowledge Graphs with LLMs Overview

Nov 13, 2024

Lecture Notes: Building Knowledge Graphs with LLMs

Introduction

  • Speaker: Tomaz Bratanic
  • Topic: Building Knowledge Graphs with LLMs
  • Related Work: Writing a book on the topic with Oscar

Overview

  • Discussion on building knowledge graphs using LLMs
  • Addressing limitations of text embedding approaches

Limitations of Text Embedding Approach

  • Common pipeline: PDFs → chunking → indexing → embedding models
  • Works well for documentation-style content but not for every domain
  • Problems:
    • Legal domains raise specific questions (e.g., total contract values) that retrieval alone cannot answer
    • Naive vector similarity search returns similar chunks, not computed answers
    • Metadata filtering can help but has limits
    • Some queries need aggregation, counting, and filtering
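
To make the aggregation point concrete, here is a minimal sketch of the kind of query that structured data answers directly. The contract records, field names, and the helper `total_value` are all hypothetical and invented for illustration; they are not from the lecture.

```python
# Hypothetical structured records, as if extracted from contracts (values invented).
contracts = [
    {"id": "c1", "party": "Acme", "value_usd": 120_000, "active": True},
    {"id": "c2", "party": "Acme", "value_usd": 80_000, "active": False},
    {"id": "c3", "party": "Globex", "value_usd": 50_000, "active": True},
]


def total_value(records, party, active_only=True):
    """Filter on structured fields, then sum over every matching record."""
    return sum(
        r["value_usd"]
        for r in records
        if r["party"] == party and (r["active"] or not active_only)
    )


def count_active(records):
    """Count records whose 'active' flag is set."""
    return sum(1 for r in records if r["active"])


print(total_value(contracts, "Acme"))
print(count_active(contracts))
```

A similarity search over text chunks would return a handful of passages that mention Acme, but it would not guarantee that every contract is retrieved or that the values are summed.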

Structured Data and Knowledge Graphs

  • Introduction to structured data in knowledge graphs
  • Knowledge graphs integrate structured and unstructured data

Building Knowledge Graphs with LLMs

  • Importance of representing structured and unstructured data
  • Knowledge graphs allow for combining information from multiple documents
  • Example: Multihop questions simplified with knowledge graphs
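
As a rough illustration of why multihop questions get simpler, the sketch below stores triples from two hypothetical documents in one in-memory graph and follows a chain of relationships. The entity names, relationship types, and the `follow` helper are assumptions made for this example; a real system would typically use a graph database instead.

```python
from collections import defaultdict

# Hypothetical triples, as if extracted from two different documents.
triples = [
    ("Alice", "WORKS_AT", "Acme"),         # from document A
    ("Acme", "HEADQUARTERED_IN", "Oslo"),  # from document B
]

# Index outgoing edges by (source node, relationship type).
graph = defaultdict(list)
for subject, relation, obj in triples:
    graph[(subject, relation)].append(obj)


def follow(start, relations):
    """Follow a chain of relationship types from a start node; return the nodes reached."""
    frontier = {start}
    for relation in relations:
        frontier = {
            nxt for node in frontier for nxt in graph.get((node, relation), [])
        }
    return frontier


# "In which city is Alice's employer headquartered?" becomes a two-hop traversal.
print(follow("Alice", ["WORKS_AT", "HEADQUARTERED_IN"]))
```

Neither document alone answers the question; the answer appears only once both facts sit in the same graph.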

Information Extraction with LLMs

  • Information extraction was traditionally complex; LLMs now simplify it
  • Different approaches:
    • Generic: No predefined schema, inconsistent results
    • Middle Ground: Define node labels and relationships, better consistency
    • Domain Specific: Detailed schema definition, best results
  • Tools: LangChain, OpenAI JSON outputs, etc.
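
The "middle ground" approach can be sketched as a post-filter on the model's output. In the example below, the allowed labels, the relationship types, the JSON shape, and the `enforce_schema` helper are all hypothetical; `raw_output` stands in for what an LLM might return, and no real LLM client or LangChain API is called.

```python
import json

# Hypothetical schema: the only node labels and relationship types we accept.
ALLOWED_LABELS = {"Person", "Organization"}
ALLOWED_RELATIONSHIPS = {"WORKS_AT"}

# Stand-in for raw model output; a real pipeline would get this from an LLM call.
raw_output = json.dumps({
    "nodes": [
        {"id": "Alice", "label": "Person"},
        {"id": "Acme", "label": "Organization"},
        {"id": "Tuesday", "label": "Weekday"},
    ],
    "relationships": [
        {"source": "Alice", "type": "WORKS_AT", "target": "Acme"},
        {"source": "Alice", "type": "LIKES", "target": "Tuesday"},
    ],
})


def enforce_schema(payload):
    """Keep only nodes and relationships that match the predefined schema."""
    data = json.loads(payload)
    nodes = [n for n in data["nodes"] if n["label"] in ALLOWED_LABELS]
    kept_ids = {n["id"] for n in nodes}
    relationships = [
        r for r in data["relationships"]
        if r["type"] in ALLOWED_RELATIONSHIPS
        and r["source"] in kept_ids
        and r["target"] in kept_ids
    ]
    return {"nodes": nodes, "relationships": relationships}


result = enforce_schema(raw_output)
print(result)
```

In practice the schema would usually also be stated in the prompt or enforced through structured output, so that the filter acts as a safety net rather than the main mechanism.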

Practical Observations

  • Post-processing: Entity resolution to merge duplicated nodes
  • Extraction consistency affected by document ambiguity and chunk size
  • Multiple passes for more thorough extraction
  • Several extraction methods exist, and results vary by model
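
One possible way to combine multiple passes is to take the union of the extracted triples and record how many passes produced each one. This is only a sketch: the triples are invented, and `merge_passes` is a hypothetical helper, not a method described in the lecture.

```python
def merge_passes(passes):
    """Union triples from several extraction passes, counting supporting passes."""
    counts = {}
    for triples in passes:
        for triple in set(triples):  # count each triple at most once per pass
            counts[triple] = counts.get(triple, 0) + 1
    return counts


# Hypothetical output of two extraction passes over the same chunk.
pass_1 = [("Alice", "WORKS_AT", "Acme")]
pass_2 = [("Alice", "WORKS_AT", "Acme"), ("Acme", "HEADQUARTERED_IN", "Oslo")]

support = merge_passes([pass_1, pass_2])
for triple, n in support.items():
    print(triple, n)
```

The counts give a crude consistency signal: a triple seen in every pass is likely more reliable than one seen only once.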

Challenges and Solutions

  • Entity resolution and merging duplicated nodes
    • Use text embeddings and additional logic to resolve
    • LLM used as a judge for entity merging
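
The "embeddings plus additional logic" idea might look roughly like the sketch below. Note the assumptions: `toy_embed` uses character trigram counts as a stand-in for a real text-embedding model, the 0.5 threshold is arbitrary, and the "additional logic" is simply a requirement that both nodes share a label. None of these specifics come from the lecture.

```python
import math
from collections import Counter


def toy_embed(name):
    """Stand-in for a real embedding model: character trigram counts."""
    text = f"  {name.lower()} "
    return Counter(text[i:i + 3] for i in range(len(text) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def candidate_pairs(nodes, threshold=0.5):
    """Return same-label node pairs whose name vectors are similar enough."""
    pairs = []
    for i, (name_a, label_a) in enumerate(nodes):
        for name_b, label_b in nodes[i + 1:]:
            if label_a != label_b:
                continue  # extra logic: never merge across labels
            if cosine(toy_embed(name_a), toy_embed(name_b)) >= threshold:
                pairs.append((name_a, name_b))
    return pairs


# Hypothetical extracted nodes as (name, label) tuples.
nodes = [
    ("Acme Corp", "Organization"),
    ("Acme Corporation", "Organization"),
    ("Alice", "Person"),
]
print(candidate_pairs(nodes))
```

The resulting pairs are only candidates; each one could then be passed to an LLM acting as a judge before the nodes are actually merged.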

Q&A Session

  • Updating Knowledge Graphs:
    • Easy to add new information but requires entity resolution
  • Graph Disambiguation:
    • Use text embeddings and logic to identify and merge duplicates

Conclusion

  • Resources: For practical examples and code, refer to Tomaz's blog
  • Encouragement to explore presentations and learn more about knowledge graphs

Note: These notes aim to summarize key points from the lecture and facilitate easier review and understanding of the topic discussed.