Machine Learning Models

Jul 21, 2024

Lecture on Machine Learning Models

Introduction

  • Prior videos laid the foundation: what machine learning is and why it is useful
  • Key term: model

What is a Model?

  • A model is a representation of reality
  • Data set: a large collection of data representing reality
    • More data = closer to reality
    • Sometimes a data set alone is not enough

Splitting Data Sets

  • A data set represents reality by splitting it into sections (individual records)
  • Example: Predicting diabetes
    • Data set includes health information, age, family history, etc.
    • Each piece of data represents an instance of reality

Simplified Data Representation

  • Data simplified here for ease of understanding
    • E.g., binary data (yes/no, male/female)
  • Creating a table with this simplified data
    • Columns: Attributes like sex, age, family history
    • Extra column: whether the person had diabetes
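
The table described above can be sketched in Python as a list of rows, one per person. This is a minimal illustration only: the column names (`sex`, `age_under_50`, `family_history`, `diabetes`) and all values are assumptions invented for the example, not data from the lecture.

```python
# Hypothetical records; every value here is invented for illustration.
attribute_columns = ["sex", "age_under_50", "family_history"]
outcome_column = "diabetes"  # the extra column: did the person have diabetes?

records = [
    {"sex": "male", "age_under_50": True, "family_history": True, "diabetes": False},
    {"sex": "female", "age_under_50": False, "family_history": True, "diabetes": True},
    {"sex": "male", "age_under_50": False, "family_history": False, "diabetes": False},
]


def print_table(rows):
    """Print the attribute columns followed by the outcome column."""
    header = attribute_columns + [outcome_column]
    print(" | ".join(header))
    for row in rows:
        print(" | ".join(str(row[col]) for col in header))


print_table(records)
```

Each dictionary is one instance of reality; the attribute columns are what is known about the person, and the outcome column is what a model would try to reproduce.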

Using Historical Data

  • Historical data helps in making accurate models
  • Modeling reality using if statements
    • Example: If sex is male, age < 50, history = true, then diabetes = false
  • Easy to understand, but not practical for large data sets
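
The if-statement example above could be written as a small Python function. This is a sketch under assumptions: the function name `predict_diabetes` and its parameters are hypothetical, and it encodes only the single rule given in the notes, returning `None` when no rule applies.

```python
def predict_diabetes(sex, age, family_history):
    """Hand-written model: one if statement per known historical case."""
    # The single rule from the notes: male, under 50, with family history.
    if sex == "male" and age < 50 and family_history:
        return False
    # No rule has been written for any other combination.
    return None


print(predict_diabetes("male", 42, True))    # False
print(predict_diabetes("female", 63, True))  # None
```

Covering every combination this way needs one rule per case, which is why the approach stops being practical as the data set grows.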

Predictive Data Analytics vs. Machine Learning

  • For data analytics: a simple model built from if statements is fine
  • For machine learning: the data represents reality only incompletely
    • The algorithm predicts the best model for the sections with no data
    • The model is the choice the algorithm makes

Possibilities and Model Selection

  • How many models are possible?
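    • Example situation: four possible models
    • Count only the rows whose outcome is unknown
    • Each combination of outcomes for those rows is one model (e.g., both true, both false)
    • Two unknown rows, each true or false, give 2 × 2 = 4 combinations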
  • The machine learning algorithm chooses the best possible model
    • Based on the rest of the data
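
The counting and selection steps above can be sketched in Python. Everything here is an assumption made for illustration: the rows and outcomes are invented, and the selection rule (agree with the most similar known rows, a simple nearest-neighbour vote) is only one possible way to choose "based on other data", not the method from the lecture.

```python
from itertools import product

# Hypothetical rows: (is_male, under_50, family_history) -> had diabetes?
known_rows = {
    (True, True, True): False,
    (True, False, True): True,
    (False, True, False): False,
    (False, False, True): True,
}
# Rows for which no outcome was recorded.
unknown_rows = [(True, True, False), (True, False, False)]

# Each way of filling in the unknown outcomes is one candidate model.
candidate_models = [
    dict(zip(unknown_rows, outcomes))
    for outcomes in product([True, False], repeat=len(unknown_rows))
]
print(len(candidate_models))  # 4


def hamming(a, b):
    """Number of attributes on which two rows differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_known_outcome(row):
    """Majority outcome among the closest known rows (a tie counts as False)."""
    best = min(hamming(row, known) for known in known_rows)
    votes = [out for known, out in known_rows.items() if hamming(row, known) == best]
    return sum(votes) > len(votes) / 2


def score(model):
    """Count the unknown rows where the model agrees with the nearby known data."""
    return sum(model[row] == nearest_known_outcome(row) for row in unknown_rows)


best_model = max(candidate_models, key=score)
print(best_model)
```

With two unknown rows the search space is tiny, but it doubles with every additional unknown row, so real algorithms do not enumerate candidates like this.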

Conclusion

  • Machine learning models: most likely representations of reality from incomplete data
  • Video aim: Better understanding of what a model is in machine learning

End Note: Understanding what a model is, and how machine learning selects the best one, is the key takeaway.