Axolotl and Fine-Tuning Lecture Notes

Jul 27, 2024


Introduction

  • Agenda:
    • Overview of Axolotl
    • Honeycomb example
    • Q&A with Wing
    • Zach's discussion on parallelism and Hugging Face Accelerate
    • Fine-tuning on Modal
    • Final Q&A

Fine-Tuning Basics

  • Model Capacity Questions:
    • What model to fine-tune?
    • Should I use LoRA or a full fine-tune?

Choosing the Base Model

  • Model Size:
    • Options: 7B, 13B, 70B parameter models.
    • 7B is often preferred due to:
      • Faster training times
      • Easier to find compatible GPUs
      • The most widely used size in the community
  • Model Family:
    • Examples: Llama 2, Llama 3, Mistral, Zephyr, Gemma.
    • Let trending models on hubs like Hugging Face guide your selection.

LoRA vs. Full Fine-Tuning

  • LoRA (Low-Rank Adaptation):
    • Instead of updating every weight, learns low-rank updates that are added to selected layers of the frozen base model.
    • For a transformer layer whose inputs and outputs are on the order of 4,000 dimensions, the weight update is factored into two much smaller matrices, sharply reducing the trainable parameter count (a code sketch follows this list).
    • The majority of fine-tuning in practice uses either LoRA or QLoRA.
    • Recommended starting point for most users.
  • QLoRA (Quantized LoRA):
    • Stores the frozen base-model weights in fewer bits (typically 4-bit) while training LoRA adapters, which substantially reduces memory requirements.
    • The impact on quality is less noticeable than you might expect.
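
The low-rank idea is easiest to see in code. Below is a minimal sketch in plain PyTorch, assuming a roughly 4096-dimensional layer of a 7B-class model; the class, rank, and alpha values are illustrative, not Axolotl's actual implementation (Axolotl delegates LoRA to the PEFT library).

```python
# A toy LoRA layer in plain PyTorch (illustrative, not Axolotl's internals).
# The full d_out x d_in weight stays frozen; only the small A and B matrices train.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():      # freeze the pretrained weights
            p.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r                # the lora_alpha / lora_r scaling

    def forward(self, x):
        # frozen full-rank path + scaled low-rank update
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

# For a 4096-dim layer, r=8 trains 2 * 4096 * 8 ≈ 65K parameters
# instead of 4096 * 4096 ≈ 16.8M for a full-rank update.
layer = LoRALinear(nn.Linear(4096, 4096), r=8, alpha=16)
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # 65536
```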

Axolotl Usage

  • Getting Started:

    • Start with example YAML configurations for simplicity.
    • Important flags include:
      • lora_r: rank (size) of the LoRA matrices.
      • lora_alpha: scaling factor applied to the LoRA update (see the config sketch after this section's bullets).
  • Pre-processing Data:

    • Run Axolotl's preprocessing step to put your data into the format the trainer expects.
    • Review the preprocessed output for correct formatting, especially checking that tokenization looks right.
  • Training Process:

    • Launch training jobs with accelerate launch.
    • Pass the options the run needs, such as logging to Weights & Biases.
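
For reference, Axolotl's lora_r and lora_alpha settings map onto the same knobs in Hugging Face PEFT, which Axolotl uses under the hood. The sketch below is a hedged illustration of what those two values control; the model name, target modules, and numbers are placeholders, not a recommended configuration.

```python
# Illustration of what lora_r / lora_alpha control, expressed with Hugging Face
# PEFT directly (Axolotl configures this for you from the YAML). Model name,
# target modules, and values are placeholders.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

config = LoraConfig(
    r=16,                                  # lora_r: rank of the adapter matrices
    lora_alpha=32,                         # lora_alpha: scaling applied to the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # which projections receive adapters
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()         # a small fraction of the 7B base weights
```

Training itself is then kicked off with accelerate launch pointed at the YAML config, as noted above.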

Honeycomb Case Study

  • Overview of Honeycomb:

    • Observability platform that wants to make its query language easier for users.
    • Uses an LLM to interpret natural-language requests and convert them into queries in Honeycomb's domain-specific query language.
  • Evaluation Strategy:

    • Level 1 evals: cheap, assertion-style unit tests adapted to validate generated queries (a sketch follows this list).
    • Level 2 evals: more involved evaluations that incorporate human feedback.
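
To make Level 1 concrete, here is a hypothetical sketch of assertion-style checks over a generated query; the schema, allowed operations, and function name are invented for illustration and are not Honeycomb's actual validation code.

```python
# Hypothetical Level-1 ("unit test" style) checks on a generated query:
# cheap assertions that run on every output, with no human in the loop.
import json

ALLOWED_OPS = {"COUNT", "AVG", "MAX", "P95"}  # illustrative, not Honeycomb's real spec

def level1_checks(raw_output: str, valid_columns: set[str]) -> list[str]:
    """Return a list of failure reasons; an empty list means the query passes."""
    failures = []
    try:
        query = json.loads(raw_output)
    except json.JSONDecodeError:
        return ["output is not valid JSON"]
    for calc in query.get("calculations", []):
        if calc.get("op") not in ALLOWED_OPS:
            failures.append(f"unknown op: {calc.get('op')}")
        col = calc.get("column")
        if col is not None and col not in valid_columns:
            failures.append(f"unknown column: {col}")
    return failures

print(level1_checks('{"calculations": [{"op": "COUNT"}]}', {"duration_ms"}))  # []
```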

Modal Overview

  • What is Modal?

    • A serverless cloud platform that keeps code development fast and iterative while running jobs remotely.
    • Supports parallel execution, which is useful for hyperparameter tuning.
  • Training with Modal:

    • Axolotl is loaded through Modal-configured scripts that handle the data flags and required dependencies for you.
  • Key Commands for Modal:

    • modal run: the command that kicks off training with the required configuration (see the sketch after this list).
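
A minimal sketch of the Modal pattern, assuming Modal's Python SDK as of mid-2024 (modal.App, the function decorator, and .map for parallel fan-out); the image contents, GPU choice, and training body are placeholders rather than the course's actual Modal scripts.

```python
# Minimal sketch of the Modal pattern: the environment and GPU are declared in
# code, and the same function can be fanned out over several configs in parallel
# (e.g., a LoRA rank sweep). Names, dependencies, and the body are illustrative;
# this is not the official Axolotl-on-Modal repo.
import modal

app = modal.App("finetune-sketch")
image = modal.Image.debian_slim().pip_install("torch", "transformers", "peft")

@app.function(image=image, gpu="A100", timeout=60 * 60)
def train(overrides: dict) -> str:
    # A real job would invoke Axolotl's trainer with a YAML config plus these
    # overrides; here we only echo them to show the execution pattern.
    return f"trained with {overrides}"

@app.local_entrypoint()
def main():
    # `modal run this_file.py` runs main() locally and train() in the cloud;
    # .map() launches one container per config in parallel.
    sweep = [{"lora_r": 8}, {"lora_r": 16}, {"lora_r": 32}]
    for result in train.map(sweep):
        print(result)
```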

Final Q&A and Takeaways

  • Deterministic Results:

    • Achievable by controlling decoding at inference time (e.g., greedy decoding and fixed seeds) rather than anything done during training (see the decoding sketch at the end of these notes).
  • Custom Evaluations During Training:

    • Can potentially be wired in through Axolotl's evaluation flags and its integrated logging capabilities.
  • Using Smaller Models:

    • 7B models typically considered sufficient for most tasks; smaller sizes may lose reasoning quality.
  • Webinars and Resources:

    • Recorded YouTube presentations and community discussions are available after the session for deeper dives.
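
On the determinism point, a small sketch with Hugging Face transformers, assuming greedy decoding and a fixed seed; the model name and prompt are placeholders. With sampling disabled, repeated runs on the same hardware should produce the same output.

```python
# Sketch: determinism comes from the decoding side. Greedy decoding (no sampling)
# plus a fixed seed makes repeated generations reproducible. Model name and
# prompt are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed

set_seed(42)  # fixes Python / NumPy / torch RNG state
tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", torch_dtype=torch.float16
)

inputs = tok("Translate this request into a query:", return_tensors="pt")
out = model.generate(
    **inputs,
    do_sample=False,   # greedy decoding: no temperature / top-p randomness
    max_new_tokens=64,
)
print(tok.decode(out[0], skip_special_tokens=True))
```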