
Hacker's Guide to Language Models

Jul 1, 2024


Introduction

  • Presenter: Jeremy Howard from fast.ai
  • Purpose: Code-first approach for understanding and using language models
  • Prerequisite: Basics of deep learning (recommended course: course.fast.ai)

What is a Language Model?

  • Predicts the next word in a sentence or fills in missing words.
  • Examples:
    • text-davinci-003 by OpenAI
    • Nat.dev platform for experimenting with language models
  • Tokens: The sub-word units, whole words, punctuation marks, or numbers that language models actually read and predict.
  • Tokenization: Process of converting text into tokens using tools like 'tiktoken'.
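
As a concrete illustration of tokenization, here is a minimal sketch using the tiktoken library; the example sentence and printed output are illustrative:

```python
# Minimal sketch: turn text into tokens with tiktoken and inspect the pieces.
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
tokens = enc.encode("They are splashing")
print(tokens)                               # a short list of integer token ids
print([enc.decode([t]) for t in tokens])    # the sub-word pieces they map to
```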

ULMFit Algorithm

  • Steps:
    1. Language Model Training (Pre-training): Predict the next word in sentences using a large corpus (e.g., Wikipedia).
    2. Language Model Fine-Tuning: Adjust model using a dataset closer to the final task.
    3. Classifier Fine-Tuning: Optimize for the end task using methods like reinforcement learning from human feedback (RLHF).
  • Purpose: Compress world knowledge into neural network parameters.

Instruction Tuning and Fine-Tuning

  • Instruction Tuning: Uses datasets like OpenOrca and Flan for question-answer pairs.
  • Classifier Fine-Tuning: Involves human feedback to improve model responses.
  • Importance: Pre-trained models are fine-tuned to perform specific tasks better.

Using GPT-4

  • Recommendation: GPT-4 was the strongest generally available language model at the time of the talk (September 2023).
  • Usage: Pay for GPT-4 access through OpenAI for higher-quality outputs.

Common Misconceptions

  • Widely shared claims that GPT-4 can't reason or can't solve specific problems are often incorrect.
  • Empirical re-tests show GPT-4 performing well on many of the very tasks presented as its limitations.

Custom Instructions

  • Purpose: Enhance accuracy by priming the model for high-quality information.
  • Examples: Contextual instructions to improve reasoning and output quality.

Limitations

  • GPT-4 can't reliably answer questions about itself, can't fetch URLs, and (at the time of the talk) knew nothing beyond its September 2021 training cutoff.
  • Hallucinations: Model confidently provides incorrect information when it lacks knowledge.

Advanced Features and Tools

  • Advanced Data Analysis: GPT-4 can write and test code, though with some limitations.
  • Google's Bard: can OCR text from images pasted directly into the prompt.

Practical Implementations

  • API Usage: Using OpenAI API for repetitive and programmatic tasks.
  • Rate Limit Handling: wrap API calls in retry/back-off logic (Jeremy used Bing to generate this handling code); see the sketch below.
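
A minimal sketch of both points, assuming the pre-1.0 `openai` Python package (whose `openai.ChatCompletion` interface was current at the time of the talk) and an API key in the environment; the helper name and back-off strategy are illustrative:

```python
# Minimal sketch: call the chat API and retry with exponential back-off
# when a rate-limit error is raised. Assumes the pre-1.0 `openai` package
# and that OPENAI_API_KEY is set in the environment.
import time
import openai

def ask(prompt, model="gpt-3.5-turbo", retries=5):
    for attempt in range(retries):
        try:
            resp = openai.ChatCompletion.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
            )
            return resp.choices[0].message.content
        except openai.error.RateLimitError:
            time.sleep(2 ** attempt)        # back off, then try again
    raise RuntimeError("gave up after repeated rate-limit errors")

print(ask("What is money?"))
```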

Building a Code Interpreter

  • Function Calling: Pass functions to GPT-4 using JSON schema for custom tasks.
  • Example: Python function to calculate factorials.
  • Enhanced Functionality: Create custom functions for complex queries and tasks.
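
A minimal sketch of function calling, again assuming the pre-1.0 `openai` package; the `factorial` function and its JSON schema are illustrative, the point being that the model returns a structured `function_call` which your own code then executes:

```python
# Minimal sketch: describe a local function to GPT-4 as JSON schema, let the
# model request it, then run the call ourselves and use the result.
import json, math, openai

def factorial(n: int) -> int:
    "Compute n! for a non-negative integer n."
    return math.factorial(n)

functions = [{
    "name": "factorial",
    "description": "Compute the factorial of a non-negative integer.",
    "parameters": {
        "type": "object",
        "properties": {"n": {"type": "integer"}},
        "required": ["n"],
    },
}]

resp = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What is 12 factorial?"}],
    functions=functions,
)
call = resp.choices[0].message.get("function_call")
if call:
    args = json.loads(call["arguments"])
    print(factorial(**args))        # 479001600
```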

Running Language Models Locally

  • GPU Requirements: use a GPU via Kaggle, Colab, or a rented server if you don't have suitable hardware locally.
  • Library: Hugging Face’s 'transformers' library for model implementation.
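
A minimal sketch of loading and sampling from a model with `transformers`; the model id, dtype, and prompt are assumptions (Llama 2 weights require accepting Meta's license on the Hugging Face Hub, and `device_map="auto"` needs the `accelerate` package installed):

```python
# Minimal sketch: load a causal LM on a GPU and generate a short completion.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tok("Jeremy Howard is a", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=20)
print(tok.decode(out[0], skip_special_tokens=True))
```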

Working with Hugging Face Models

  • Model Examples: Llama 2, plus TheBloke's GPTQ quantizations for faster, lower-memory inference.
  • Instruction Tuning: Importance of following specific prompt formats for different models.
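
As one concrete example of such a prompt format, the Llama-2-chat models expect `[INST]` / `<<SYS>>` markers; other instruction-tuned models use different templates, so always check the model card. A minimal sketch:

```python
# Minimal sketch: build a prompt in the Llama-2-chat instruction format.
def llama2_chat_prompt(user_msg, system_msg="You are a helpful assistant."):
    return f"<s>[INST] <<SYS>>\n{system_msg}\n<</SYS>>\n\n{user_msg} [/INST]"

print(llama2_chat_prompt("Who is Jeremy Howard?"))
```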

Retrieval Augmented Generation (RAG)

  • Overview: Enhances response quality by providing contextual information to the model from external documents.
  • Vector Databases: embed documents with a sentence-transformers model so queries can be matched to the most relevant documents.
  • Examples: h2oGPT and privateGPT setups.
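
A minimal sketch of the retrieval step: embed a question and some documents with a sentence-transformers model, pick the most similar document, and paste it into the prompt as context. The model name and toy documents are illustrative:

```python
# Minimal sketch: nearest-document retrieval with sentence-transformers.
from sentence_transformers import SentenceTransformer, util

emb = SentenceTransformer("BAAI/bge-small-en-v1.5")     # illustrative choice
docs = [
    "Jeremy Howard is a founding researcher at fast.ai.",
    "Python is a general-purpose programming language.",
]
question = "Who is Jeremy Howard?"

doc_vecs = emb.encode(docs, convert_to_tensor=True)
q_vec = emb.encode(question, convert_to_tensor=True)
scores = util.cos_sim(q_vec, doc_vecs)[0]
context = docs[int(scores.argmax())]
print(context)   # this would be prepended to the prompt as context
```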

Fine-Tuning Custom Models

  • Use Cases: Create a model fine-tuned to specific tasks (e.g., converting natural language to SQL queries).
  • Libraries: Hugging Face’s 'datasets' and fine-tuning tools like 'Axolotl'.
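
A minimal sketch of preparing such data with the `datasets` library; the dataset name and column names here are assumptions, and the actual training run would typically be configured and launched through Axolotl:

```python
# Minimal sketch: load a text-to-SQL dataset and format rows into prompts.
from datasets import load_dataset

ds = load_dataset("knowrohit07/know_sql")["train"]   # assumed dataset/fields

def to_prompt(row):
    return (f"Context: {row['context']}\n"
            f"Question: {row['question']}\n"
            f"Answer: {row['answer']}")

print(to_prompt(ds[0]))
```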

Running Models on Macs

  • Tools: MLC and llama.cpp for running models on Apple hardware.
  • Performance: quantized 7B models run at usable speeds on Apple silicon.
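
A minimal sketch using the `llama-cpp-python` bindings with a quantized GGUF model file; the file name is illustrative and the model must already be downloaded locally:

```python
# Minimal sketch: run a quantized 7B model locally via llama.cpp bindings.
from llama_cpp import Llama

llm = Llama(model_path="llama-2-7b-chat.Q4_K_M.gguf", n_gpu_layers=-1)
out = llm("Name the planets of the solar system.", max_tokens=64)
print(out["choices"][0]["text"])
```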

Community and Support

  • Resources: fast.ai Discord channel for generative AI discussions and support.
  • Conclusion: Exciting but complex field; collaboration is crucial for navigating challenges.

Thank you for listening!