Understanding AI: Foundation Models and Workflow

Sep 17, 2024

Lecture Notes on AI and Foundation Models

Introduction

Deep learning enables the creation of specialized AI models.
Building such a model requires data collection, labeling, training, and deployment.
Examples include customer service chatbots and fraud detection.

Traditional Model Development

Previously, each new specialization required starting from scratch.
Steps included data selection, labeling, model development, training, and validation.

Foundation Models

Foundation models offer a new paradigm in AI development.
Definition: A base model built once through a centralized effort, then adapted to many specialized tasks.
Adaptation: Fine-tuning to develop specialized models, such as programming language translation.
Benefits: Speeds up AI model development through fine-tuning.

Workflow to Create AI Models

Stage 1: Prepare the Data

Training requires extensive data, potentially petabytes, drawn from open-source and proprietary sources.
Data Processing:
- Categorization (e.g., language, programming language).
- Filtering (e.g., hate speech, copyrighted material, duplicate data).
Output: Base data pile, which can be versioned and tagged for governance.
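
The data processing steps above can be sketched in a few lines of Python. This is a toy illustration, not a production pipeline: the blocklist, the code markers, and the function names (categorize, is_allowed, build_data_pile, pile_version) are all hypothetical, and a real system would use trained classifiers for filtering and far more robust deduplication.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical blocklist standing in for a real hate-speech or licensing filter.
BLOCKED_TERMS = {"blockedword"}


@dataclass
class Document:
    text: str
    category: str


def categorize(text: str) -> str:
    """Toy categorizer: label text as source code or natural language."""
    code_markers = ("def ", "import ", "#include", "function ")
    return "code" if any(m in text for m in code_markers) else "natural_language"


def is_allowed(text: str) -> bool:
    """Reject any text containing a blocked term (case-insensitive)."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)


def build_data_pile(raw_texts):
    """Filter, deduplicate, and categorize raw texts into a data pile."""
    seen = set()
    pile = []
    for text in raw_texts:
        # Collapse whitespace only for the duplicate check; keep the text itself.
        key = hashlib.sha256(" ".join(text.split()).encode("utf-8")).hexdigest()
        if key in seen or not is_allowed(text):
            continue
        seen.add(key)
        pile.append(Document(text=text, category=categorize(text)))
    return pile


def pile_version(pile) -> str:
    """Content-derived tag so a specific data pile can be referenced later."""
    h = hashlib.sha256()
    for doc in pile:
        h.update(doc.text.encode("utf-8"))
        h.update(b"\0")
    return h.hexdigest()[:12]


raw = ["Hello world", "Hello   world", "import os", "this has blockedword in it"]
pile = build_data_pile(raw)
print([doc.category for doc in pile], pile_version(pile))
```

Deriving the version tag from the content means that any change to the pile yields a different tag, which is one simple way to support the versioning and governance mentioned above.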

Stage 2: Train the Model

Select a model architecture suited to the use case (e.g., generative, encoder-only).
Tokenization: The data pile is converted to tokens.
Training Process:
- The model is trained on the tokens produced from the data pile.
- Time-consuming, requiring potentially thousands of GPUs.
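
To make the tokenization step concrete, here is a minimal sketch that maps words to integer token IDs. It assumes a simple whitespace tokenizer; real foundation models typically use subword tokenizers, and the build_vocab and encode helpers are illustrative names only.

```python
def build_vocab(corpus):
    """Assign an integer ID to each distinct lowercase word, in order of appearance."""
    vocab = {"<unk>": 0}  # ID 0 is reserved for words not seen in the corpus
    for text in corpus:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def encode(text, vocab):
    """Convert text to a list of token IDs, using <unk> for unknown words."""
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]


corpus = ["foundation models learn from tokens", "tokens come from text"]
vocab = build_vocab(corpus)
print(encode("models learn from text", vocab))  # -> [2, 3, 4, 7]
```

The model never sees raw text during training, only sequences of IDs like these.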

Stage 3: Validate

Benchmark the trained model against performance metrics.
Create a model card to document results and benchmarks.
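
As a rough sketch of this stage, the snippet below scores predictions against reference answers and records the result in a model card. Accuracy is used only as an example metric, and the card fields, model name, and benchmark name are assumptions made for illustration, not a standard schema.

```python
import json


def accuracy(predictions, references):
    """Fraction of predictions that exactly match the reference answers."""
    if not references or len(predictions) != len(references):
        raise ValueError("predictions and references must be non-empty and equal in length")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


def make_model_card(model_name, benchmark_results):
    """Bundle benchmark results into a simple dictionary-based model card."""
    return {"model_name": model_name, "benchmarks": benchmark_results}


# Hypothetical benchmark data: three of the four predictions are correct.
predictions = ["positive", "negative", "positive", "negative"]
references = ["positive", "negative", "negative", "negative"]

card = make_model_card(
    "example-base-model",
    {"toy_sentiment_accuracy": accuracy(predictions, references)},
)
print(json.dumps(card, indent=2))
```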

Stage 4: Tune

This stage is carried out by the application developer.
Tuning: The developer writes prompts to elicit good performance and supplies local data to fine-tune the model.
The process is quick compared to building a model from scratch.
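
One way to picture the prompt side of tuning is a few-shot prompt assembled from local examples. The sketch below only builds the prompt text. It does not update any model weights, and the build_few_shot_prompt helper and its Input/Output layout are assumptions, not a format required by any particular model.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, labeled local examples, and a new query into one prompt."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {label}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")  # left open for the model to complete
    return "\n".join(lines)


# Hypothetical local data supplied by the application developer.
local_examples = [
    ("My card was charged twice.", "billing"),
    ("The app crashes when I log in.", "technical"),
]

prompt = build_few_shot_prompt(
    "Classify each customer message as billing or technical.",
    local_examples,
    "I cannot reset my password.",
)
print(prompt)
```

The resulting string would be sent to the deployed model, whose completion after the final "Output:" is taken as the answer.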

Stage 5: Deploy

Deploy the model as a service, or embed it in an application running close to the network edge.
Allows continuous iteration and improvement.

watsonx Platform by IBM

Components:
- watsonx.data: Modern data lakehouse for data preparation.
- watsonx.governance: Manages data and model cards for governance.
- watsonx.ai: Enables application developers to engage in tuning.
Built on IBM's hybrid cloud platform, Red Hat OpenShift.

Conclusion

Foundation models revolutionize AI model development.
The five-stage workflow enhances model sophistication and development speed.
