
Module 4 - Supplemental Lecture 1 - YouTube - Neural Networks - Introduction

Jul 8, 2025

Overview

This lecture introduces neural networks as key technologies driving recent advances in artificial intelligence (AI), explains their relationship to machine learning, and demonstrates how predictive models evolve from simple linear regression to deep neural networks.

Foundations of AI and Neural Networks

  • Artificial intelligence (AI) enables computers to perform tasks that appear intelligent.
  • Machine learning is a subset of AI where computers learn patterns from historical data to make predictions.
  • Neural networks and deep learning are subfields of machine learning; deep learning uses larger, more complex neural networks.

Predictive Modeling Basics

  • Predicting house prices can be done with linear regression, combining variables (e.g., square feet, bedrooms) using weighted sums.
  • Regression produces parameter estimates (weights) for each input variable and an intercept.
  • The output formula for linear regression is: predicted value = intercept + (weight1 × variable1) + (weight2 × variable2) (a short Python sketch follows this list).
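
A minimal sketch of the house-price example in Python; the lecture does not give actual data or name a library, so the numbers below and the use of scikit-learn are assumptions for illustration:

  import numpy as np
  from sklearn.linear_model import LinearRegression

  # Hypothetical training data: [square feet, bedrooms] -> sale price
  X = np.array([[1400, 3], [1600, 3], [1700, 4], [2100, 4], [2500, 5]])
  y = np.array([245_000, 279_000, 299_000, 345_000, 399_000])

  model = LinearRegression().fit(X, y)
  print("intercept:", model.intercept_)
  print("weights:", model.coef_)  # one weight per input variable

  # predicted value = intercept + (weight1 * sqft) + (weight2 * bedrooms)
  print("predicted price:", model.predict(np.array([[1800, 3]]))[0])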

Logistic Regression for Probability Prediction

  • Logistic regression predicts the probability of an event (e.g., customer churn) using input variables.
  • The model computes a weighted sum and applies a non-linear function (logistic function) to produce a probability between 0 and 1.
  • Without this step, the weighted sum alone could fall outside the 0-to-1 range; applying the logistic function to the linear combination is what makes the output interpretable as a probability (see the sketch after this list).
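
A minimal sketch of the churn-probability idea; the weights, intercept, and input values below are made up for illustration, and the point is only that the logistic function maps the weighted sum into the 0-to-1 range:

  import numpy as np

  def logistic(z):
      return 1.0 / (1.0 + np.exp(-z))

  # Hypothetical inputs: months as customer, support calls last quarter
  x = np.array([14, 3])
  weights = np.array([-0.08, 0.9])   # assumed values for illustration
  intercept = -1.2

  linear_combination = intercept + weights @ x
  churn_probability = logistic(linear_combination)
  print(f"weighted sum = {linear_combination:.2f}, P(churn) = {churn_probability:.2f}")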

Transition to Neural Networks

  • Neural networks generalize these models by stacking multiple weighted sums and non-linear functions in layers.
  • Each layer receives inputs, combines them using weights, and applies a function before passing output to the next layer.
  • "Hidden layers" between input and output increase model complexity and accuracy.

Neural Network Structure and Training

  • A neural network consists of input, hidden, and output layers connected by weights.
  • Training involves adjusting all weights to minimize prediction errors, which becomes challenging as the network grows (a toy training loop is sketched after this list).
  • Efficient training became possible with large datasets, improved algorithms (backpropagation), and specialized hardware (GPUs).
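
A toy illustration of training by gradient descent with backpropagation on a one-hidden-layer network; the synthetic data, network size, and learning rate are assumptions for the sketch, not details from the lecture:

  import numpy as np

  rng = np.random.default_rng(1)
  X = rng.uniform(-1, 1, size=(100, 2))
  y = (X[:, 0] - 2 * X[:, 1]).reshape(-1, 1)   # hypothetical target values

  W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
  W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
  lr = 0.05

  for step in range(500):
      # forward pass: weighted sums + tanh activation, then output layer
      h = np.tanh(X @ W1 + b1)
      pred = h @ W2 + b2
      err = pred - y
      # backward pass: propagate the error back to every weight
      grad_W2 = h.T @ err / len(X)
      grad_b2 = err.mean(axis=0)
      dh = (err @ W2.T) * (1 - h ** 2)         # derivative of tanh
      grad_W1 = X.T @ dh / len(X)
      grad_b1 = dh.mean(axis=0)
      # nudge all weights in the direction that reduces the squared error
      W2 -= lr * grad_W2; b2 -= lr * grad_b2
      W1 -= lr * grad_W1; b1 -= lr * grad_b1

  print("final mean squared error:", float((err ** 2).mean()))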

Model Design Choices

  • Key neural network design choices include number of hidden layers, number of units per layer, and choice of activation functions (e.g., hyperbolic tangent).
  • More layers or units do not always improve accuracy; trial and error is often used to find effective configurations (an example configuration follows this list).
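
One way these design choices can be expressed in code, using scikit-learn's MLPRegressor as an example; the lecture mentions SAS JMP and Python but does not prescribe a library, and the data here is synthetic:

  import numpy as np
  from sklearn.neural_network import MLPRegressor

  rng = np.random.default_rng(2)
  X = rng.uniform(0, 1, size=(200, 3))
  y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=200)

  model = MLPRegressor(
      hidden_layer_sizes=(16, 8),   # two hidden layers with 16 and 8 units
      activation="tanh",            # hyperbolic tangent activation
      max_iter=2000,
      random_state=0,
  )
  model.fit(X, y)
  print("training R^2:", model.score(X, y))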

Convolutional Neural Networks (CNNs)

  • CNNs are specialized neural networks mainly used for image recognition.
  • They automatically learn which connections (weighted sums) to use by detecting patterns in small, localized regions of the image (see the sketch after this list).
  • CNNs are inspired by the structure of the human visual cortex.
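
A sketch of a small CNN for image classification, assuming the Keras API (not specified in the lecture) and a hypothetical 28x28 grayscale input; the convolutional layers learn weighted sums over small, localized patches of the image:

  from tensorflow import keras
  from tensorflow.keras import layers

  model = keras.Sequential([
      layers.Input(shape=(28, 28, 1)),                       # 28x28 grayscale images
      layers.Conv2D(16, kernel_size=3, activation="relu"),   # local 3x3 patterns
      layers.MaxPooling2D(pool_size=2),
      layers.Conv2D(32, kernel_size=3, activation="relu"),
      layers.MaxPooling2D(pool_size=2),
      layers.Flatten(),
      layers.Dense(10, activation="softmax"),                # e.g. 10 output classes
  ])
  model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
  model.summary()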

Key Terms & Definitions

  • Artificial Intelligence (AI) – Computers performing tasks that appear intelligent.
  • Machine Learning – Methods for learning patterns from data for predictions.
  • Neural Network – Interconnected layers of nodes using weighted sums and functions to model data.
  • Deep Learning – Use of neural networks with many hidden layers.
  • Linear Regression – Predictive model using weighted sums of input variables.
  • Logistic Regression – Predictive model outputting probabilities by applying a function to a weighted sum.
  • Hidden Layer – Intermediate layer in a neural network between input and output.
  • Activation Function – Non-linear function applied to a layer's output (e.g., hyperbolic tangent).
  • Backpropagation – Training algorithm for adjusting neural network weights.
  • GPU (Graphics Processing Unit) – Specialized processor that speeds up neural network training.
  • Convolutional Neural Network (CNN) – Neural network architecture for pattern recognition, especially in images.

Action Items / Next Steps

  • Review how to construct and interpret linear and logistic regression models.
  • Experiment with building neural networks using available software tools (e.g., SAS JMP, Python).
  • Prepare for the next lecture on handwriting recognition using neural networks.