
Insights from Sophia Young on Fine-Tuning

Jul 31, 2024

Notes on Lecture by Sophia Young on Fine-Tuning Models

Introduction

  • Speaker: Sophia Young, Lead Developer Relations at MOL
  • Overview of the talk:
    • Overview of MOL's models
    • Introduction of Fine-Tune API
    • Open-source fine-tuning codebase
    • Demos

Company Overview

  • MOL is based in Paris with a team of over 50 people.
  • Founded about a year ago.
    • September Last Year: Released first model, MR 7B.
    • December: Released Mixol (8x7B).
    • Commercial Model: MR Medium and API platform for model use.
    • February: Released MR Small and MR Large (flagship model with advanced capabilities).
    • April: Released open-source 8x22B model, the best of its kind at the time.
    • Recent Release: Craw, specialized model trained on 80+ programming languages.

Model Offerings

  • Three open-source models available for personal or commercial use.
  • Two enterprise-grade models: MR Small and MR Large.
    • MR Large:
      • Multilingual; supports function calling.
      • Specialized for retrieval-augmented generation (RAG).
    • Fine-tuning support available for MR Small and MR 7B.
  • Emphasis on customization and user-specific needs.

Fine-Tuning Overview

  • Fine-Tune API: Released to allow customization of models directly.
  • LoRA Fine-Tuning: Efficient and performant; analysis showed comparable performance between LoRA fine-tuning and full fine-tuning on MR 7B and MR Small.
  • Comparison Results:
    • LoRA fine-tuning: 0.9
    • Full fine-tuning: 0.91
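The comparison above rests on the LoRA idea: instead of updating a full weight matrix W, train two small low-rank matrices A and B and add their product to the frozen base path. A minimal sketch (toy shapes, pure Python; not MOL's implementation):

```python
# LoRA forward pass sketch: y = W x + B (A x), with rank r << d.
# Only A and B would be trained during fine-tuning; W stays frozen.
import random

d, r = 8, 2  # hidden size and adapter rank (toy values)
random.seed(0)

W = [[random.gauss(0, 0.1) for _ in range(d)] for _ in range(d)]  # frozen base weights
A = [[random.gauss(0, 0.1) for _ in range(d)] for _ in range(r)]  # trainable down-projection
B = [[0.0] * r for _ in range(d)]                                 # trainable up-projection (init 0)

def matvec(M, x):
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def lora_forward(x):
    base = matvec(W, x)              # frozen path
    delta = matvec(B, matvec(A, x))  # low-rank update path
    return [b + u for b, u in zip(base, delta)]

x = [1.0] * d
# With B initialised to zero, the adapter starts as an exact no-op:
assert lora_forward(x) == matvec(W, x)
```

The adapter adds only 2·r·d trainable parameters versus d² for full fine-tuning, which is why the two approaches can score so closely at a fraction of the training cost.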

Prompting vs Fine-Tuning

  • Prompting:
    • Allows out-of-the-box functionality without data or training.
    • Easily updated for new workflows or prototyping.
  • Fine-Tuning Advantages:
    • Often better performance than prompting.
    • Often runs faster and more economically than lengthy prompts, since fewer tokens are sent per request.
    • Better alignment with specific tasks due to focused training.
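The cost argument above is simple arithmetic: a prompted setup resends a long instruction block on every call, while a fine-tuned model needs only the task input. The token counts and price below are made-up numbers for illustration only:

```python
# Hypothetical per-request token counts and pricing (all values assumed).
prompt_tokens_long = 1200    # lengthy few-shot prompt resent each call
prompt_tokens_short = 50     # short prompt for the fine-tuned model
price_per_1k_tokens = 0.002  # assumed input price, USD

requests = 100_000
cost_prompting = requests * prompt_tokens_long / 1000 * price_per_1k_tokens
cost_finetuned = requests * prompt_tokens_short / 1000 * price_per_1k_tokens
print(f"prompting:  ${cost_prompting:.2f}")   # $240.00
print(f"fine-tuned: ${cost_finetuned:.2f}")   # $10.00
```

With these assumed numbers, the fine-tuned setup is 24x cheaper on input tokens alone, before counting latency savings.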

Demos

Demo Setup

  • Installation: Ensure the latest version (0.4.0) of the MOL API client is installed.
  • Using Fine-Tuned Models:
    • Example of generating abstracts from research paper titles.
    • Chatbot example using a medical dataset.
  • Model Naming Convention:
    • A model's name encodes the base model it was fine-tuned from.
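The exact naming convention wasn't spelled out in the notes. As an assumed example, suppose a fine-tuned model id embeds its base model like "ft:mr-7b:my-org:job-1234"; splitting on ":" would then recover the base model:

```python
# Hypothetical model-id format: "ft:<base-model>:<org>:<job-id>".
# The format is an assumption for illustration, not MOL's documented scheme.
def base_model(model_id: str) -> str:
    prefix, base, *_ = model_id.split(":")
    assert prefix == "ft", "expected a fine-tuned model id"
    return base

print(base_model("ft:mr-7b:my-org:job-1234"))  # mr-7b
```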

Case Studies

  • Showcased various developer examples using the Fine-Tune API.
  • Applications include:
    • Internet retrieval
    • Medical domain applications
    • Financial conversation assistants
    • Legal co-pilots

End-to-End Example

  • Preparation Steps:
    • Install MOL AI and required packages.
    • Prepare and format the dataset for training.
    • Upload dataset to the server and define the model for fine-tuning.
    • Monitor training jobs and retrieve metrics.
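The "prepare and format the dataset" step above commonly means writing chat-style JSONL, one {"messages": [...]} object per line. The field names below follow that common convention and are an assumption, not MOL's documented schema:

```python
# Write a tiny training set in chat JSONL format (one JSON object per line).
import json

examples = [
    ("What is LoRA?", "LoRA trains small low-rank adapter matrices."),
    ("What is RAG?", "Retrieval-augmented generation grounds answers in retrieved documents."),
]

with open("train.jsonl", "w") as f:
    for question, answer in examples:
        record = {"messages": [
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]}
        f.write(json.dumps(record) + "\n")
```

The resulting file is what gets uploaded to the server before defining the fine-tuning job.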

Open Source Codebase

  • Use the open-source codebase for fine-tuning MR 7B and other models.
  • Example of downloading model and preparing data in Google Colab.
  • Important to define configuration files for hyperparameters and paths.
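The notes mention configuration files for hyperparameters and paths but don't show one; the YAML below is a hypothetical illustration of the kinds of fields such a file might hold, with all names and values assumed:

```yaml
# Hypothetical fine-tuning config -- field names are illustrative, not MOL's schema
model: mr-7b
data:
  train_path: data/train.jsonl
  eval_path: data/eval.jsonl
training:
  lora_rank: 16
  learning_rate: 1.0e-4
  batch_size: 8
  max_steps: 500
output_dir: runs/mr-7b-finetune
```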

Conclusion and Event Announcement

  • Exciting news: Hosting a Fine-Tune Hackathon from today to June 30th.
  • Encouragement to participate and showcase builds.

Questions and Answers

  • Data-format validation is available through provided validation scripts.
  • The team offers support for further questions on the fine-tuning process.
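The validation scripts mentioned above weren't shown in the talk; as an assumed minimal version, a validator can check that each JSONL line parses and carries the expected "messages" structure (field names are assumptions matching the common chat format):

```python
# Minimal JSONL validator sketch: returns a list of human-readable errors.
import json

def validate_jsonl(path: str) -> list[str]:
    errors = []
    with open(path) as f:
        for lineno, line in enumerate(f, start=1):
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                errors.append(f"line {lineno}: not valid JSON")
                continue
            messages = record.get("messages")
            if not isinstance(messages, list) or not messages:
                errors.append(f"line {lineno}: missing 'messages' list")
                continue
            for msg in messages:
                if msg.get("role") not in {"system", "user", "assistant"}:
                    errors.append(f"line {lineno}: bad role {msg.get('role')!r}")
    return errors
```

An empty return value means the file passed every check; otherwise each error names the offending line.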