
Mistral Models and Fine-Tuning Overview

Aug 10, 2024

Lecture Notes: Overview of Mistral Models and the Fine-Tuning API

Introduction

  • Speaker: Sophia Yang, Head of Developer Relations at Mistral AI.
  • Topics: Mistral models, the recently released Fine-Tuning API, and the open-source fine-tuning codebase.

Company Overview

  • Mistral AI: Founded in 2023, based in Paris, with a team of over 50.
  • Key Model Releases:
    • Mistral 7B (Sept 2023)
    • Mixtral 8x7B and Mistral Medium (Dec 2023)
    • Mistral Small and Mistral Large (Feb 2024)
      • Mistral Large: Flagship model with advanced reasoning and multilingual capabilities.
    • Mixtral 8x22B (April 2024)
    • Codestral: Specialized code-generation model trained on 80+ programming languages.

Model Offerings

  • Open-Source Models: Three models (Mistral 7B, Mixtral 8x7B, Mixtral 8x22B) available for personal and commercial use.
  • Enterprise Models:
    • Mistral Small: Fine-tuning support.
    • Mistral Large: Advanced reasoning, multilingual, and function-calling capabilities.
  • Emphasis on customization and fine-tuning.

Fine-Tuning Overview

  • Fine-Tuning Codebase: Open-sourced so users can fine-tune the open-source models themselves.
  • Fine-Tuning API: Launched to customize Mistral models directly on the platform.
  • Technology Used: LoRA (Low-Rank Adaptation) fine-tuning, which is efficient while remaining performant.
    • Performance comparison on Mistral 7B:
      • LoRA fine-tuning: 0.9
      • Full fine-tuning: 0.91
Prompting vs. Fine-Tuning

  • Prompting:
    • No data/training required; works out of the box.
    • Good for prototyping and quick updates.
  • Fine-Tuning:
    • Can outperform larger models on specific use cases.
    • Better aligned with the target task; can learn new facts and domain knowledge.
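The prompting route needs no training at all: behavior is steered entirely through the text of the prompt, e.g. by including a few worked examples. A minimal sketch of assembling such a few-shot prompt (the template wording is illustrative, not a Mistral-specific format):

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, worked examples, and a new query into one prompt string."""
    parts = [instruction, ""]
    for inp, out in examples:
        parts.append(f"Input: {inp}")
        parts.append(f"Output: {out}")
        parts.append("")
    parts.append(f"Input: {query}")
    parts.append("Output:")          # the model continues from here
    return "\n".join(parts)

prompt = build_few_shot_prompt(
    "Classify the sentiment of each review as positive or negative.",
    [("Great product, works perfectly.", "positive"),
     ("Broke after two days.", "negative")],
    "Fast shipping and easy to use.",
)
```

Iterating on such a template is cheap, which is why prompting suits prototyping; fine-tuning bakes the behavior into the weights instead.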

Demos

  1. Fine-Tune Model Demo:
    • Example: Input titles to generate research paper abstracts.
    • Showcasing a model fine-tuned on title-abstract pairs.
  2. Medical Chatbot Example:
    • Trained on medical datasets to answer queries.
  3. Data Generation:
    • Generating training data with a larger model such as Mistral Large for fine-tuning smaller models.
  4. Real-World Use Cases:
    • Startups using fine-tuned models for various sectors (medical, finance, legal).

Using Fine-Tuning API

  • Installation: Install the latest mistralai Python client (v0.4.0).
  • Data Preparation:
    • Format data as JSONL files; ensure sizes are within limits (training: <512MB, validation: <1MB).
  • Job Creation:
    • Define the model and hyperparameters, create the fine-tuning job, and monitor progress through the reported metrics.
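The steps above can be sketched end to end: write a small JSONL training file, upload it, and create a job. The client calls below follow the mistralai 0.4.x Python client as documented at the time; treat the exact method names and the `TrainingParameters` fields as assumptions to verify:

```python
import json
import os

# Step 1: prepare a tiny JSONL training file (one chat-format record per line).
records = [
    {"messages": [{"role": "user", "content": "Hello"},
                  {"role": "assistant", "content": "Hi! How can I help?"}]},
]
with open("train.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

# Step 2: upload the file and create a fine-tuning job. This requires a valid
# API key and network access; method names follow the mistralai 0.4.x client
# and should be checked against the current docs.
def create_job():
    from mistralai.client import MistralClient
    from mistralai.models.jobs import TrainingParameters

    client = MistralClient(api_key=os.environ["MISTRAL_API_KEY"])
    with open("train.jsonl", "rb") as f:
        train_file = client.files.create(file=("train.jsonl", f))
    job = client.jobs.create(
        model="open-mistral-7b",
        training_files=[train_file.id],
        hyperparameters=TrainingParameters(training_steps=10, learning_rate=1e-4),
    )
    return client.jobs.retrieve(job.id)  # poll this for status and metrics
```

A real dataset would of course contain far more than one record, and a validation file can be attached the same way to get evaluation metrics during training.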

End-to-End Example

  • Codebase Setup:
    • Clone the mistral-finetune repo and install the required packages.
  • Define configurations, paths, and hyperparameters for fine-tuning.
  • Start training and monitor the checkpoints for inference.
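The configuration step centers on a YAML file pointing at the data, the base model, and the hyperparameters. A hypothetical sketch of what such a config looks like (field names are assumptions from memory of the repo's example configs; check example/7B.yaml in the mistral-finetune repo for the real schema):

```yaml
# Hypothetical mistral-finetune config sketch -- verify field names
# against the repo's example configs before use.
data:
  instruct_data: "data/train.jsonl"       # training set (JSONL)
  eval_instruct_data: "data/eval.jsonl"   # optional validation set
model_id_or_path: "mistral_models/7B"     # downloaded base model weights
lora:
  rank: 16                                # LoRA rank
seq_len: 8192
batch_size: 1
max_steps: 300
learning_rate: 1.0e-4
run_dir: "runs/finetune_7b"               # checkpoints land here
```

Training is then launched by pointing the repo's training entry point at this config (the README uses torchrun), and the checkpoints written to run_dir are what you load for inference.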

Exciting News

  • Hackathon Announcement:
    • Hosting a fine-tuning hackathon from today until June 30th.
    • Participants can submit ideas through a Google form.

Conclusion

  • Encouragement to explore the fine-tuning capabilities and participate in the hackathon.
  • Thank you for attending!