Understanding AI: Foundation Models and Workflow

Sep 17, 2024

Lecture Notes on AI and Foundation Models

Introduction

  • Deep learning enables the creation of specialized AI models.
  • Each specialized model requires data collection, labeling, training, and deployment.
  • Examples include customer service chatbots and fraud detection.

Traditional Model Development

  • Each new specialization previously required starting from scratch.
  • Steps included data selection, labeling, model development, training, and validation.

Foundation Models

  • Foundation models offer a new paradigm in AI development.
  • Definition: A base model built once through a centralized effort, then adapted to specialized tasks.
  • Adaptation: Fine-tuning to develop specialized models, such as programming language translation.
  • Benefits: Speeds up AI model development through fine-tuning.
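
To make the adaptation idea concrete, here is a toy sketch in pure Python. It assumes a frozen base that turns text into features and a small task-specific head trained on a handful of labeled examples. The names base_model, fine_tune_head, and LOCAL_EXAMPLES are hypothetical, and the keyword-counting "base" is only a stand-in for a real pretrained network.

```python
from typing import List, Tuple

def base_model(text: str) -> List[float]:
    """Stand-in for a frozen, pretrained base model (hypothetical).

    It maps text to a small feature vector and is never updated below.
    """
    words = text.lower().split()
    return [
        1.0,  # bias feature
        float(sum(w in {"refund", "charge", "invoice"} for w in words)),
        float(sum(w in {"password", "login", "reset"} for w in words)),
    ]

# Small labeled set for the specialized task: 1 = billing, 0 = account access.
LOCAL_EXAMPLES: List[Tuple[str, int]] = [
    ("please refund this charge", 1),
    ("invoice is wrong", 1),
    ("reset my password", 0),
    ("login fails", 0),
]

def fine_tune_head(examples: List[Tuple[str, int]],
                   epochs: int = 20, lr: float = 0.1) -> List[float]:
    """Train only a small linear head (perceptron rule) on top of the base."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for text, label in examples:
            feats = base_model(text)  # base output is reused as-is
            score = sum(w * f for w, f in zip(weights, feats))
            pred = 1 if score > 0 else 0
            err = label - pred  # 0 when the prediction is already correct
            weights = [w + lr * err * f for w, f in zip(weights, feats)]
    return weights

def predict(weights: List[float], text: str) -> int:
    """Classify new text using the frozen base plus the tuned head."""
    score = sum(w * f for w, f in zip(weights, base_model(text)))
    return 1 if score > 0 else 0
```

Only the three head weights change during training, which is why adaptation is cheap compared with building the base. Real fine-tuning may also update some or all of the base model's parameters.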

Workflow to Create AI Models

Stage 1: Prepare the Data

  • Requires extensive data, potentially petabytes, from open-source and proprietary sources.
  • Data Processing:
    • Categorization (e.g., language, programming language).
    • Filtering (e.g., hate speech, copyrighted material, duplicate data).
  • Output: Base data pile, which can be versioned and tagged for governance.
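
The processing steps above can be sketched at toy scale as follows. This is a minimal illustration and not a production pipeline: the blocklist, the code-detection heuristic, and the names Document, build_data_pile, and pile_version are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable, List

# Hypothetical placeholder; a real filter would use trained classifiers.
BLOCKLIST = {"badword"}

@dataclass
class Document:
    text: str
    category: str

def categorize(text: str) -> str:
    """Rough heuristic: text containing code-like markers is tagged as code."""
    code_markers = ("def ", "import ", "{", "};")
    return "code" if any(m in text for m in code_markers) else "natural_language"

def build_data_pile(raw_texts: Iterable[str]) -> List[Document]:
    """Deduplicate, filter, and categorize raw texts into a base data pile."""
    seen = set()
    pile: List[Document] = []
    for text in raw_texts:
        # Normalize whitespace and case so near-identical copies hash the same.
        normalized = " ".join(text.split()).lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # duplicate data
        seen.add(digest)
        if any(term in normalized.split() for term in BLOCKLIST):
            continue  # blocked content
        pile.append(Document(text=text, category=categorize(text)))
    return pile

def pile_version(pile: List[Document]) -> str:
    """Content-derived tag so a given pile can be versioned and referenced."""
    h = hashlib.sha256()
    for doc in pile:
        h.update(doc.text.encode("utf-8"))
        h.update(b"\0")  # separator between documents
    return h.hexdigest()[:12]
```

Because the version tag is derived from the content, the same inputs always produce the same tag, which is one simple way to support the governance point above.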

Stage 2: Train the Model

  • Select a foundation model suited to the use case (e.g., generative, encoder-only).
  • Tokenization: The data pile is converted into tokens.
  • Training Process:
    • Involves ordering tokens.
    • Time-consuming, requiring potentially thousands of GPUs.
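
As a rough illustration of tokenization, the sketch below uses a word-level vocabulary. That is a simplifying assumption, since production models typically use subword tokenizers. The next_token_pairs helper further assumes a generative model trained to predict the next token, which these notes do not state. All function names are hypothetical.

```python
from typing import Dict, List, Tuple

def build_vocab(corpus: List[str]) -> Dict[str, int]:
    """Assign an integer ID to each distinct word; ID 0 is reserved for unknowns."""
    vocab = {"<unk>": 0}
    for text in corpus:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab

def encode(text: str, vocab: Dict[str, int]) -> List[int]:
    """Convert text to token IDs, mapping unseen words to <unk>."""
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]

def next_token_pairs(token_ids: List[int],
                     context_size: int) -> List[Tuple[List[int], int]]:
    """Build (context, next token) training pairs from one token sequence."""
    pairs = []
    for i in range(context_size, len(token_ids)):
        pairs.append((token_ids[i - context_size:i], token_ids[i]))
    return pairs
```

For example, building a vocabulary from "the cat sat" and "the dog sat" and then encoding "the bird sat" maps the unseen word "bird" to the <unk> ID.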

Stage 3: Validate

  • Benchmark the trained model against performance metrics.
  • Create a model card to document results and benchmarks.
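
A minimal sketch of the validation step, assuming an exact-match benchmark and a model card stored as a plain dictionary. The field names and the accuracy and make_model_card helpers are illustrative assumptions, not a standard model card schema.

```python
import json
from typing import Callable, Dict, List, Tuple

def accuracy(model: Callable[[str], str],
             benchmark: List[Tuple[str, str]]) -> float:
    """Fraction of benchmark prompts where the model output matches exactly."""
    if not benchmark:
        raise ValueError("benchmark must not be empty")
    correct = sum(1 for prompt, expected in benchmark if model(prompt) == expected)
    return correct / len(benchmark)

def make_model_card(name: str, version: str,
                    results: Dict[str, float]) -> Dict[str, object]:
    """Collect identifying details and benchmark results in one record."""
    return {
        "model_name": name,    # hypothetical field names
        "version": version,
        "benchmarks": results,
    }

def model_card_json(card: Dict[str, object]) -> str:
    """Serialize the card so it can be stored next to the model."""
    return json.dumps(card, indent=2, sort_keys=True)
```

Keeping the card as structured data rather than free text makes it easier to compare results across model versions.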

Stage 4: Tune

  • Involves the application developer persona.
  • Fine-tuning: The developer writes prompts and supplies local data to adapt the model.
  • Much faster than building a model from scratch.
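
One way an application developer might combine local data with a prompt is a few-shot template, sketched below. The "Input:" and "Output:" layout and the build_prompt helper are assumptions for illustration; the notes do not prescribe a prompt format.

```python
from typing import List, Tuple

def build_prompt(instruction: str,
                 examples: List[Tuple[str, str]],
                 query: str) -> str:
    """Assemble an instruction, local worked examples, and the new query."""
    lines = [instruction, ""]
    for example_input, example_output in examples:
        lines.append(f"Input: {example_input}")
        lines.append(f"Output: {example_output}")
        lines.append("")  # blank line between examples
    lines.append(f"Input: {query}")
    lines.append("Output:")  # left open for the model to complete
    return "\n".join(lines)
```

The same local examples could also serve as training pairs if the developer chooses to update model weights rather than only shape the prompt.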

Stage 5: Deploy

  • Deploy the model as a service, or embed it in applications that run close to the network edge.
  • Allows continuous iteration and improvement.
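
The "model as a service" option can be sketched as a request handler that wraps the model behind a JSON interface. The request and response fields and the make_handler helper are hypothetical. A real deployment would put this behind an HTTP server and add authentication and logging.

```python
import json
from typing import Callable

ERROR_BODY = json.dumps({"error": "expected a JSON object with a string 'prompt' field"})

def make_handler(model: Callable[[str], str],
                 model_version: str) -> Callable[[str], str]:
    """Return a function that maps a JSON request body to a JSON response body."""
    def handle(request_body: str) -> str:
        try:
            payload = json.loads(request_body)
            prompt = payload["prompt"]
        except (json.JSONDecodeError, KeyError, TypeError):
            return ERROR_BODY  # malformed JSON or missing field
        if not isinstance(prompt, str):
            return ERROR_BODY
        # Returning the version lets callers see which iteration answered.
        return json.dumps({"model_version": model_version, "output": model(prompt)})
    return handle
```

Because the handler only depends on a callable, a newly tuned model can be swapped in under a new version string without changing client code, which fits the iteration point above.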

watsonx Platform by IBM

  • Components:
    • watsonx.data: Modern data lakehouse for data preparation.
    • watsonx.governance: Manages data and model cards for governance.
    • watsonx.ai: Enables application developers to engage in tuning.
  • Built on IBM's hybrid cloud platform, Red Hat OpenShift.

Conclusion

  • Foundation models revolutionize AI model development.
  • The five-stage workflow enhances model sophistication and development speed.