Building a Neural Network from Scratch

Jul 20, 2024

Introduction

  • Building a neural network from scratch using only numpy and Python (no machine learning libraries).
  • Moving from individual neurons to complex networks.
  • Ultimate goal: Test network on number recognition and fashion datasets.

Single Neuron

Concept

  • A single neuron has three inputs and one output.
  • Each input connection has its own weight.
  • The output is the weighted sum of the inputs plus a bias.
  • Formula: output = sum(input * weight) + bias (see the sketch below).
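
A minimal sketch of that formula in numpy; the input and weight values here are made up for illustration, not taken from the post:

    import numpy as np

    inputs = np.array([1.0, 2.0, 3.0])      # three inputs to the neuron
    weights = np.array([0.2, 0.8, -0.5])    # one weight per connection
    bias = 2.0

    # output = sum(input * weight) + bias
    output = np.dot(inputs, weights) + bias
    print(output)  # 2.3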

Learning Mechanism

  • Neural networks learn by tweaking weights and biases to get the desired output.

Complex Networks

Structure

  • Multiple neurons interconnected in layers.
  • Output of one layer becomes input to the next.
  • Linear algebra (dot product) simplifies computation.

Example

  • Convert the many neuron connections in a layer into one Python line using the dot product (see the sketch below).
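
For instance, a whole layer collapses into one dot product; the sizes below (3 inputs feeding 4 neurons) are chosen purely for illustration:

    import numpy as np

    inputs = np.array([1.0, 2.0, 3.0])         # one sample with 3 features
    weights = np.random.randn(4, 3) * 0.01     # 4 neurons, 3 weights each
    biases = np.zeros(4)

    # one line computes every neuron in the layer at once
    layer_output = np.dot(weights, inputs) + biases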

Introducing Non-Linearity

  • A purely linear network is no more expressive than linear regression.
  • Use ReLU (Rectified Linear Unit) to introduce non-linearity.
  • ReLU passes positive values through unchanged and zeroes out negative ones, which lets the network fit non-linear data (see the sketch below).
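
A one-line numpy version of ReLU (a sketch; the post's exact code may differ):

    import numpy as np

    def relu(x):
        # negative values become 0, positive values pass through unchanged
        return np.maximum(0, x)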

Softmax Activation Function

  • Converts the network's raw outputs into a probability distribution.
  • Shows the likelihood of each class being the correct one (see the sketch below).
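
A sketch of softmax over a batch of outputs, with the usual max-subtraction trick for numerical stability (an assumption, not necessarily the post's exact code):

    import numpy as np

    def softmax(logits):
        # subtract the row max for numerical stability, then normalize each row to sum to 1
        exp_values = np.exp(logits - np.max(logits, axis=1, keepdims=True))
        return exp_values / np.sum(exp_values, axis=1, keepdims=True)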

Implementation

  • Implemented the forward pass: from input through the layers to the output (a sketch follows this list).
  • Tested with pre-trained weights: achieved 97.53% accuracy.
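
One way the forward pass might be wired together from these pieces; the two-layer structure and the variable names are assumptions for illustration:

    import numpy as np

    def forward(x, w1, b1, w2, b2):
        # hidden layer: weighted sum, then ReLU
        hidden = np.maximum(0, np.dot(x, w1) + b1)
        # output layer: weighted sum, then softmax into class probabilities
        logits = np.dot(hidden, w2) + b2
        exp = np.exp(logits - np.max(logits, axis=1, keepdims=True))
        return exp / np.sum(exp, axis=1, keepdims=True)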

Loss Calculation

  • Use categorical cross-entropy loss to measure how far the predictions are from the true labels (see the sketch below).
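
A sketch of categorical cross-entropy over a batch, assuming integer class labels and clipping to avoid log(0):

    import numpy as np

    def cross_entropy_loss(probs, y_true):
        # probs: (batch, classes) softmax outputs; y_true: (batch,) integer labels
        probs = np.clip(probs, 1e-7, 1 - 1e-7)
        correct_class_probs = probs[np.arange(len(y_true)), y_true]
        return -np.mean(np.log(correct_class_probs))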

Backpropagation

Concept

  • Adjust weights and biases to reduce the prediction error.
  • Partial derivatives measure how much each weight contributed to the loss.
  • Analogy: taste the dish (calculate the loss), then adjust the ingredients (the weights) to improve it.

Steps

  1. Forward pass: Calculate output.
  2. Calculate loss: Measure error using cross-entropy loss.
  3. Backward pass: update weights using partial derivatives (see the sketch after this list).
  4. Repeat until the network improves.
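
A sketch of step 3 for a single dense layer: the chain rule gives the partial derivatives with respect to the weights, biases, and inputs (names and shapes are illustrative):

    import numpy as np

    def layer_backward(d_out, inputs, weights):
        # d_out: gradient of the loss w.r.t. this layer's output, shape (batch, neurons)
        d_weights = np.dot(inputs.T, d_out)    # how much each weight contributed to the loss
        d_biases = np.sum(d_out, axis=0)       # gradient for each bias
        d_inputs = np.dot(d_out, weights.T)    # gradient passed back to the previous layer
        return d_weights, d_biases, d_inputs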

Learning Rate

  • Controls how large each update to the weights is during training.
  • Higher rates make bigger, faster changes but risk overshooting; lower rates make smaller, steadier adjustments.
  • Optimizers such as SGD with learning-rate decay reduce the rate over time for better convergence (see the sketch below).
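
A minimal SGD-style update with optional learning-rate decay (a common scheme; the post's exact optimizer settings aren't reproduced here):

    def sgd_update(weights, d_weights, learning_rate=0.1, decay=0.001, step=0):
        # decay shrinks the effective learning rate as training progresses
        current_lr = learning_rate / (1 + decay * step)
        return weights - current_lr * d_weights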

Debugging and Optimization

  • Involved fixing bugs and refining the code.
  • Small coding errors caused big problems (e.g., an incorrectly implemented backpropagation step).
  • Debugging and adjustments improved accuracy.

Testing with MNIST Dataset

  • MNIST: a dataset of 60,000 hand-written digit images, each 28x28 pixels.
  • Accuracy was initially poor (~50%); after tweaks it reached 97.42%.
  • The network could then recognize and predict digits reliably (an accuracy check is sketched below).
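
Measuring accuracy comes down to comparing the most probable class with the true label; this sketch uses random stand-in arrays in place of the real network outputs and labels:

    import numpy as np

    rng = np.random.default_rng(0)
    probs = rng.random((5, 10))                  # stand-in for softmax outputs, one row per image
    probs /= probs.sum(axis=1, keepdims=True)
    y_test = rng.integers(0, 10, size=5)         # stand-in for the true digit labels

    predictions = np.argmax(probs, axis=1)       # most likely digit per image
    accuracy = np.mean(predictions == y_test)    # fraction of images predicted correctly
    print(f"accuracy: {accuracy:.2%}")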

Fashion MNIST Dataset

  • Similar to MNIST but with images of clothing items (e.g., pants, sneakers, bags).
  • Trained network achieved 87% accuracy.
  • Performance can be improved with more tweaking.

Conclusion

  • From single neurons to complex networks capable of recognizing numbers and fashion items.
  • Highlighted challenges and successes in building a neural network from scratch.

Final Notes

  • Planning more content on neural network projects.