Foundation Models and AI Development

Jul 28, 2024

Introduction

  • Deep Learning and AI Models: Deep learning enables highly specialized AI models (e.g., customer service chatbots, fraud detection)
  • Traditional Model Building: Each specialized model requires its own data selection, curation, labeling, model development, training, and validation
  • Foundation Models Paradigm: A centralized effort creates one base model that can be adapted into specialized models through fine-tuning

What is a Foundation Model?

  • Definition: A focused, centralized base model that can be adapted via fine-tuning
  • Example Use Case: Programming language translation, starting from a foundation model and fine-tuning it with task-specific data
  • Advantage: Greatly speeds up AI model development

Workflow to Create an AI Model

Stage 1: Prepare the Data

  • Training Data: Requires large amounts of data, potentially petabytes across dozens of domains
  • Data Types: Combination of open-source and proprietary data
  • Data Processing Tasks:
    • Categorization: Describes the data (e.g., language categorization)
    • Filtering: Removes unwanted content (e.g., hate speech, copyrighted material)
    • Removing Duplicates: Ensures unique data
  • Output: Results in a base data pile, versioned and tagged for governance
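
The three processing tasks and the versioned output can be sketched in a few lines of Python. This is a toy illustration, not the pipeline used for any real foundation model: the blocklist, the ASCII-based language guess, and the names `categorize`, `is_allowed`, and `build_data_pile` are all assumptions made for the example.

```python
import hashlib

# Hypothetical blocklist standing in for a real hate-speech or copyright filter.
BLOCKED_TERMS = {"blockedword"}


def categorize(text):
    """Tag a document with a coarse category (here: a naive language guess)."""
    # Assumption for the sketch: any non-ASCII character means "not English".
    return "en" if text.isascii() else "other"


def is_allowed(text):
    """Return True when the text contains none of the blocked terms."""
    words = set(text.lower().split())
    return not (words & BLOCKED_TERMS)


def build_data_pile(documents, version):
    """Filter, deduplicate, and categorize documents into a versioned pile."""
    seen = set()
    records = []
    for text in documents:
        if not is_allowed(text):
            continue
        # Exact-match deduplication on lowercased, whitespace-normalized text.
        normalized = " ".join(text.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        records.append(
            {"text": text, "language": categorize(text), "sha256": digest}
        )
    # The version tag is what makes the pile traceable for governance.
    return {"version": version, "records": records}


pile = build_data_pile(
    ["Hello world", "hello   world", "Bonjour à tous", "some blockedword here"],
    version="v1",
)
print(len(pile["records"]))
```

In this sketch the second document is dropped as a duplicate of the first and the fourth is dropped by the filter, so two records remain. Production pipelines typically use trained classifiers and near-duplicate detection rather than exact hashing.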

Stage 2: Train the Model

  • Model Selection: Choose among many types (generative, encoder-only, lightweight, high parameter)
  • Tokenization: Converts data pile into tokens (potentially trillions)
  • Training Process: Training based on tokens; extensive computational resources and time required
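
To make the tokenization step concrete, the following is a minimal sketch that maps text to integer token IDs using a whitespace vocabulary. It is a simplification chosen for readability: real foundation models generally use subword tokenizers, and the names `build_vocab` and `encode` are hypothetical.

```python
def build_vocab(corpus):
    """Assign an integer ID to every distinct lowercase word in the corpus."""
    vocab = {"<unk>": 0}  # ID 0 is reserved for words never seen before
    for text in corpus:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def encode(text, vocab):
    """Convert text into a list of token IDs, using <unk> for unknown words."""
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]


corpus = ["the model reads tokens", "the tokens are integers"]
vocab = build_vocab(corpus)
ids = encode("the model reads integers quickly", vocab)
print(ids)  # [1, 2, 3, 6, 0]
```

The word "quickly" is not in the vocabulary, so it maps to the `<unk>` ID. Training then operates on sequences of IDs like these rather than on raw text.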

Stage 3: Validate

  • Benchmarking: Assess model performance against benchmarks
  • Model Card Creation: Document training process and benchmark scores, primarily for data scientists
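
As a rough illustration of validation, the sketch below scores a stand-in model against a tiny benchmark and records the result in a model card. The model, the benchmark, and the model card fields are all invented for the example; real model cards follow whatever schema the organization has adopted.

```python
import json


def accuracy(predict, examples):
    """Fraction of benchmark examples where the prediction matches exactly."""
    correct = sum(1 for prompt, expected in examples if predict(prompt) == expected)
    return correct / len(examples)


def toy_model(prompt):
    """Hypothetical stand-in for a trained model: uppercases its input."""
    return prompt.upper()


# Tiny made-up benchmark of (input, expected output) pairs.
benchmark = [("a", "A"), ("b", "B"), ("c", "x"), ("d", "D")]
score = accuracy(toy_model, benchmark)

# Illustrative model card: links the model to its data version and scores.
model_card = {
    "model_name": "toy-base-model",
    "data_pile_version": "v1",
    "benchmarks": {"toy_uppercase": score},
}
print(json.dumps(model_card, indent=2))
```

Here the model gets three of four examples right, so the recorded score is 0.75. Storing the data pile version next to the scores ties the validation result back to the governed data from Stage 1.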

Stage 4: Tune

  • Persona: Application developers (not necessarily AI experts)
  • Engagement: Write prompts that elicit better performance and provide additional local data to tune the model
  • Duration: Hours or days, quicker than building from scratch
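
One way an application developer might combine prompts with local data is a few-shot prompt, sketched below. This illustrates the prompting side of the tuning stage only and does not update any model weights; the function name `build_few_shot_prompt`, the example messages, and the labels are assumptions made for the example.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, labeled local examples, and a new query."""
    lines = [instruction, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The final entry is left open for the model to complete.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Hypothetical local data from a customer service application.
local_examples = [
    ("Where is my order?", "order_status"),
    ("I was charged twice", "billing"),
]
prompt = build_few_shot_prompt(
    "Classify the customer message.",
    local_examples,
    "Cancel my subscription",
)
print(prompt)
```

The resulting text would be sent to the base model as-is. If prompting alone is not enough, the same local examples could serve as training pairs for fine-tuning.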

Stage 5: Deploy

  • Deployment Options:
    • Service Offering: Public cloud deployment
    • Embedded Application: Deployment closer to the network edge
  • Iteration: Continue to iterate and improve the model

IBM watsonx Platform

  • Overview: Platform that supports all five workflow stages
  • Components:
    • watsonx.data: Modern data lakehouse, connects data repositories
    • watsonx.governance: Manages data and model cards, ensures AI process governance
    • watsonx.ai: Allows application developers to engage with and fine-tune models
  • Foundation: Built on IBM’s Red Hat OpenShift hybrid cloud platform

Conclusion

  • Impact of Foundation Models: Changing the way specialized AI models are built
  • Advantages: Increased sophistication and rapid development of AI and AI-derived applications