Machine Learning Course - Lecture 1

Jul 11, 2024

Course Overview

  • Lecturer: Yaser Abu-Mostafa
  • Topics: The course covers a mix of mathematical theory and practical aspects of machine learning. Topics are color-coded to indicate this balance.
  • Structure: Course follows a storyline:
    • What is learning?
    • Can we learn?
    • How to learn?
    • How to learn well?
    • Take-home lessons
  • Exception: The third lecture covers a practical topic placed early, so that students have tools to test the theory as it develops.

Today's Lecture: The Learning Problem

  • Logo: The course's logo is a technical figure to be discussed later.
  • Outline:
    • Introduction to Machine Learning
    • Example: Movie ratings prediction
    • Mathematical formalization of the learning problem
    • First machine learning algorithm
    • Survey of types of learning
    • Puzzle to understand learning intricacies

Example: Movie Rating Prediction

  • Problem: Predict how a viewer would rate a movie (e.g., Netflix).
  • Components of learning problem:
    • Pattern Exists: A viewer's rating is not random; it depends on the movie's content and the viewer's tastes.
    • Cannot Pin Down Mathematically: Need to learn from data as we cannot manually define a predictive function.
    • Data Availability: Essential for learning.
  • Solution Approach:
    • Describe viewer/movie as vectors of factors (e.g., comedy, action, etc.).
    • Machine Learning Approach: Start with random factors and adjust based on data (ratings) until patterns emerge.
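
The factor idea above can be sketched in a few lines. The factor names and numbers here are illustrative, not from the lecture; the point is that the predicted rating is the match (dot product) between a viewer's preference vector and a movie's content vector:

```python
# Sketch of the viewer/movie factor model (hypothetical factors and values).
# Predicted rating = how well the viewer's preferences match the movie's content.

def predict_rating(viewer_factors, movie_factors):
    """Dot product of per-factor viewer preference and movie content."""
    return sum(v * m for v, m in zip(viewer_factors, movie_factors))

# Hypothetical factors: [comedy, action, blockbuster-appeal]
viewer = [0.9, 0.2, 0.5]   # likes comedy, mostly indifferent to action
movie  = [0.8, 0.1, 0.6]   # mostly a comedy
rating = predict_rating(viewer, movie)
```

In the machine learning approach, neither vector is hand-designed: both start random and are nudged until the predicted ratings agree with the observed ones.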

Components of Learning

  • Applicant Information (Example: Credit Approval):
    • Input (x): Customer application data (e.g., age, salary)
    • Output (y): Approval decision (+1 or -1)
    • Target Function (f): Ideal unknown formula for approval.
    • Data: Examples from historical records.
    • Hypothesis (g): Approximation of f, derived from data.
  • Learning Algorithm: Processes data to produce g.
  • Hypothesis Set (H): Set of possible hypotheses from which g is chosen.
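
In the lecture's notation, these components fit together as:

```latex
% Unknown target function
f:\; \mathcal{X} \to \mathcal{Y}
% Training examples generated by the target
(x_1, y_1), \ldots, (x_N, y_N), \qquad y_n = f(x_n)
% The learning algorithm uses the data to select a final hypothesis
g \in \mathcal{H}, \qquad g \approx f
```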

Perceptron Model: Example Hypothesis Set and Algorithm

  • Input: Vector of customer attributes
  • Hypothesis Set: Linear threshold functions of the attributes, h(x) = sign(w · x)
  • Learning Algorithm (Perceptron):
    • Start with random weights
    • Adjust weights to reduce misclassification
    • Guaranteed to converge if data is linearly separable
  • Linear Inseparability: Techniques to handle this will be discussed.
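
The three algorithm steps above can be sketched directly. This is a minimal perceptron learning algorithm on toy data (the data and starting weights are illustrative; x[0] = 1 absorbs the threshold into the weight vector):

```python
# Minimal perceptron learning algorithm (PLA).
# Hypothesis: h(x) = sign(w . x), with x[0] = 1 playing the role of the bias.

def sign(v):
    return 1 if v >= 0 else -1

def pla(data, max_iters=1000):
    """data: list of (x, y) with x a list (x[0] == 1) and y in {+1, -1}.
    Returns weights w; terminates if the data is linearly separable."""
    d = len(data[0][0])
    w = [0.0] * d                          # initial weights (could be random)
    for _ in range(max_iters):
        misclassified = [(x, y) for x, y in data
                         if sign(sum(wi * xi for wi, xi in zip(w, x))) != y]
        if not misclassified:
            break                          # every example classified correctly
        x, y = misclassified[0]            # pick any misclassified example
        w = [wi + y * xi for wi, xi in zip(w, x)]   # PLA update: w <- w + y*x
    return w

# Toy separable "credit" data: approve (+1) when the attribute is positive
data = [([1, 2.0], 1), ([1, -1.5], -1), ([1, 0.5], 1), ([1, -3.0], -1)]
w = pla(data)
```

Each update rotates w toward a misclassified +1 point (or away from a misclassified -1 point), which is why convergence is guaranteed on linearly separable data.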

Types of Learning

  • Supervised Learning: Data includes input-output pairs. Focus of the course.
    • Example: Coin recognition
  • Unsupervised Learning: Only inputs are provided; cluster finding.
    • Example: Data clustering without labels
  • Reinforcement Learning: The data grades a chosen output instead of supplying the correct one.
    • Example: Game playing (backgammon)
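
To make the unsupervised case concrete, here is a bare-bones clustering sketch in the spirit of the examples above: a tiny two-cluster k-means on unlabeled 1-D points (the routine and data are illustrative, not from the lecture):

```python
# Unsupervised learning sketch: group unlabeled 1-D points into two clusters
# by alternating assignment and center-update steps (a minimal k-means, k=2).

def kmeans_1d(points, iters=10):
    """Cluster 1-D points into two groups; returns the two cluster centers."""
    c0, c1 = min(points), max(points)               # crude initial centers
    for _ in range(iters):
        g0 = [p for p in points if abs(p - c0) <= abs(p - c1)]
        g1 = [p for p in points if abs(p - c0) > abs(p - c1)]
        c0 = sum(g0) / len(g0)                      # recenter each group
        c1 = sum(g1) / len(g1)
    return c0, c1

# Two "coin sizes" with no labels attached; the structure is found, not given
points = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8]
centers = kmeans_1d(points)
```

No labels are used anywhere: the grouping emerges from the inputs alone, which is exactly what distinguishes this setting from supervised learning.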

Learning Puzzle

  • Puzzle: Given known examples, predict the output of an unknown function.
  • Illustrative Point: Highlights the challenge of generalizing from finite data to unknown future data.

Q&A Notes

  • Linearly Inseparable Data: Techniques such as nonlinear mappings and algorithm modifications (e.g., the pocket algorithm) will be covered.
  • High-dimensional Data: Computational challenges increase with dimensions.
  • Pattern Detection: Learning feasible when a pattern exists, determined via theory.
  • Bias and Sampling: Address sampling bias and model generalization.
  • Types of Hypotheses: Hypothesis sets can be finite or continuous; generalization depends on their size and complexity.
  • Feedback Mechanisms: Validation and reinforcement learning as feedback methods.