Creating Your Own LLM Classification System

Jul 8, 2024

Creating Your Own LLM Classification System

Introduction

  • Presenter: Dave Abar
  • Background: Founder of Data Lumina, 5 years of experience in custom data and AI solutions
  • Goal: Help viewers build their own LLM classification systems in 5 steps

Context and Objectives

  • Example: Classifying customer care tickets
  • General Applicability: Techniques can classify text, images, audio, etc.
  • Problem Definition: Classify customer care tickets to categorize them and add relevant metadata
  • Challenges:
    • No structured outputs initially
    • No validation leading to inconsistent categorization
    • Limited information extracted
    • No confidence score
    • Model creativity generating incorrect outputs

Step-by-Step Guide

Step 1: Define Objectives

  • Objectives:
    • Accurate category classification
    • Assess urgency and sentiment
    • Extract key information for resolution
    • Confidence score for human review
  • Business Impact:
    • Reduce average response time
    • Improve customer satisfaction
    • Increase efficiency
    • Optimize workforce allocation

Step 2: Use the Instructor Library

  • Library: Instructor Library to get structured data (e.g., JSON) from large language models
  • Installation: pip install instructor
  • Advantage: Easy integration and validation using Pydantic data models

Step 3: Define Data Models

  • Enumerations (Enums): Predefined categories for validation
    • Example: Order Issue, Incorrect Item Received
  • Pydantic Models: Define expected data structure and validation rules
    • Example attributes:
      • category (enum)
      • urgency (enum)
      • sentiment (enum)
      • confidence_score (float, 0 to 1)
      • key_information (list of strings)
      • suggested_action (string)

Step 4: Integration

  • Combine Data Models and LLM
  • Patch OpenAI Client with Instructor: Ensure response matches Pydantic data model
  • Validation and Error Handling: Handles incorrect data by retrying with corrected context
  • Practical Example: A customer ordered a laptop, received a tablet; urgency high, sentiment angry
  • Output:
    • Metadata extracted
    • Suggested actions
    • Suitable for customer care systems
  • Benefits: Reduces manual sorting and routing, improves efficiency

Step 5: Optimize and Experiment

  • Prompt Optimization: Define clear instructions, contexts, roles, and examples in prompts
  • Model Selection: Experiment with different models (e.g., GPT-3.5 Turbo, GPT-4)
    • Smaller models for simple tasks to reduce costs and increase speed
  • System Extension: Further automate responses for simpler queries based on classification
  • Results: Efficient classification with high confidence and accuracy

Conclusion

  • Building a robust classification system involves:
    • Clear objectives and business understanding
    • Utilizing libraries like Instructor for structured responses
    • Defining detailed and validated data models
    • Effective integration for error handling and retries
    • Continuous optimization and experimentation

Additional Resources

  • Freelancing Help: Link to Dave Abar's company offering help for developers looking to freelance
  • Further Learning: Video link on Data Lumina's complete process for building AI applications

Call to Action

  • Like and Subscribe: If the video was helpful