MLPs and Non-Linear Problems

Jun 11, 2025

Overview

This lecture explains the limitations of single-layer perceptrons and introduces the concept of multi-layered perceptrons (MLPs), emphasizing why hidden layers are essential for solving complex, non-linearly separable problems.

Limitations of Single-Layer Perceptrons

  • Single-layer perceptrons process inputs by computing a weighted sum and applying an activation function (see the sketch after this list).
  • They can only solve linearly separable problems, where the classes can be divided by a straight line (or, in higher dimensions, a plane).
  • Example: Classifying points on a 2D plane split by a line demonstrates linear separability.
  • A single perceptron cannot handle problems where no straight line separates the classes (non-linearly separable problems).
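A minimal sketch of that computation in JavaScript (the language used later in this series), assuming a step activation; the perceptron function and its hand-picked weights are illustrative, not code from the lecture:

```javascript
// A single perceptron: weighted sum of inputs plus a bias,
// passed through a step activation.
function perceptron(inputs, weights, bias) {
  let sum = bias;
  for (let i = 0; i < inputs.length; i++) {
    sum += inputs[i] * weights[i];
  }
  return sum > 0 ? 1 : 0; // step activation: "fire" if the sum is positive
}

// Classifying 2D points against the line y = x:
// weights [1, -1] with bias 0 compute x - y.
console.log(perceptron([3, 1], [1, -1], 0)); // 1: point is below the line
console.log(perceptron([1, 3], [1, -1], 0)); // 0: point is above the line
```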

Linearly Separable vs. Non-Linearly Separable Problems

  • Boolean "AND" and "OR" operations are linearly separable and solvable by perceptrons.
  • The "XOR" (exclusive OR) operation is not linearly separable; a single perceptron cannot solve it (a quick check appears after this list).
  • Minsky and Papert's 1969 book "Perceptrons" showed that single-layer perceptrons cannot solve non-linearly separable problems like XOR.
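Reusing the perceptron sketch above, hand-picked weights solve AND and OR, while no choice of weights can reproduce XOR (the specific weights below are illustrative assumptions, not from the lecture):

```javascript
const cases = [[0, 0], [0, 1], [1, 0], [1, 1]];

// AND: fires only when both inputs are 1 (threshold of 1.5).
cases.forEach(([a, b]) =>
  console.log(`AND(${a},${b}) = ${perceptron([a, b], [1, 1], -1.5)}`));

// OR: fires when at least one input is 1 (threshold of 0.5).
cases.forEach(([a, b]) =>
  console.log(`OR(${a},${b}) = ${perceptron([a, b], [1, 1], -0.5)}`));

// XOR has no solution: firing on (0,1) and (1,0) requires
// w2 + bias > 0 and w1 + bias > 0, but staying silent on (1,1)
// requires w1 + w2 + bias <= 0, which is incompatible once the
// (0,0) case forces bias <= 0.
```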

Need for Multi-Layered Perceptrons

  • Complex problems (e.g., digit classification) require distinguishing patterns that are not linearly separable.
  • By combining several perceptrons in layers, a network can solve problems a single perceptron cannot (a hand-wired XOR example follows this list).
  • An MLP has input, hidden, and output layers; the hidden layer enables solving non-linear problems.
  • The hidden layer is "hidden" because it's not directly visible to the user—it processes and transforms data internally.
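One way to see why the hidden layer matters: XOR can be hand-wired from three perceptrons, since XOR(a, b) = AND(OR(a, b), NAND(a, b)). A sketch reusing the perceptron function above, with hand-chosen rather than learned weights:

```javascript
// Two hidden perceptrons (OR and NAND) feed one output perceptron (AND).
function xor(a, b) {
  const h1 = perceptron([a, b], [1, 1], -0.5);  // hidden neuron: OR
  const h2 = perceptron([a, b], [-1, -1], 1.5); // hidden neuron: NAND
  return perceptron([h1, h2], [1, 1], -1.5);    // output neuron: AND
}

console.log(xor(0, 0), xor(0, 1), xor(1, 0), xor(1, 1)); // 0 1 1 0
```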

Structure of a Multi-Layered Perceptron

  • Inputs are connected to hidden neurons, which are further connected to output neurons.
  • Each connection has a weight that can be adjusted (learned).
  • More hidden layers and neurons increase a network’s ability to model complex patterns (a minimal feedforward pass is sketched after this list).
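A minimal feedforward pass for a 2-2-1 network, previewing the matrix math mentioned in the next steps; the sigmoid activation and all weight values here are placeholder assumptions (in a real network the weights would be learned):

```javascript
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// One layer: each row of `weights` feeds one neuron in the next layer.
function layer(inputs, weights, biases) {
  return weights.map((row, j) => {
    let sum = biases[j];
    for (let i = 0; i < inputs.length; i++) {
      sum += row[i] * inputs[i];
    }
    return sigmoid(sum);
  });
}

// Input -> hidden -> output.
function feedforward(inputs, w1, b1, w2, b2) {
  const hidden = layer(inputs, w1, b1);
  return layer(hidden, w2, b2);
}

// Placeholder weights for 2 inputs, 2 hidden neurons, 1 output.
const out = feedforward(
  [1, 0],
  [[0.5, -0.4], [0.3, 0.8]], [0.1, -0.2], // input -> hidden
  [[0.7, -0.6]], [0.05]                   // hidden -> output
);
console.log(out); // ~[0.55]; the value depends entirely on the weights
```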

Neural Network Variations

  • Networks can have multiple hidden layers for greater complexity.
  • Other architectures include recurrent networks (feedback from output to input) and convolutional networks (useful for image processing).

Key Terms & Definitions

  • Perceptron — A basic neuron-like unit in neural networks that computes a weighted sum of its inputs and applies an activation function.
  • Linearly Separable — Classes can be divided by a straight line or plane.
  • XOR (Exclusive OR) — Boolean operation true only if one input is true and the other is false; not linearly separable.
  • Multi-Layered Perceptron (MLP) — A neural network with multiple layers (input, hidden, output).
  • Hidden Layer — A layer between the input and output layers that enables learning non-linear relationships.

Action Items / Next Steps

  • Watch the next video to see the diagram and construction of a multi-layered perceptron.
  • Prepare for upcoming content on matrix math and building a neural network library in JavaScript.