Insights on Deep Networks and Spline Theory

Aug 21, 2024

Deep Networks and Approximation Theory

Introduction

  • Speaker shares insights on deep networks learned over the years.
  • Focus on approximation theory and splines to study deep nets.

Background on Deep Learning

  • Deep networks solve function approximation problems (e.g., object classification).
  • Example: Training data includes pictures of various objects to learn a function that classifies new images.
  • Regression problems: e.g., inferring seismic impedance from seismic imagery.

Structure of Deep Networks

  • Deep networks operate hierarchically by applying a sequence of functions/operations (layers).
  • Layers are operators mapping inputs to outputs.
  • Understanding deep networks requires learning some new terminology (e.g., "ReLU", the rectified linear unit, a thresholding nonlinearity).
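The layer-as-operator view can be sketched in a few lines of NumPy. This is a minimal illustration, not a trained model: the weights below are random placeholders.

```python
import numpy as np

def relu(x):
    # "ReLU": rectified linear unit, elementwise thresholding at zero
    return np.maximum(x, 0.0)

def layer(x, W, b):
    # One layer = an affine map followed by thresholding
    return relu(W @ x + b)

# A tiny network as a composition of layer operators (random weights)
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((4, 3)), rng.standard_normal(4)
W2, b2 = rng.standard_normal((4, 4)), rng.standard_normal(4)
W3, b3 = rng.standard_normal((2, 4)), rng.standard_normal(2)

x = rng.standard_normal(3)
h = layer(layer(x, W1, b1), W2, b2)
y = W3 @ h + b3        # final layer: affine only
```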

Local vs Global Behavior of Deep Nets

  • Layers are easy to describe locally but complex when composed globally.
  • Deep networks have many parameters and are often treated as black boxes.

Spline Approximation

  • Deep networks can be studied using spline theory, particularly continuous piecewise affine (CPA) splines.
  • Free-knot splines: jointly optimize both the input-space partition and the per-region affine mappings; expressive but computationally difficult.
  • Fixed-knot splines: fix the partition in advance and fit only the mappings; simpler, and commonly used in practice.
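As an illustration of the fixed-knot case, here is a minimal NumPy sketch: with the knots held fixed, fitting a 1-D piecewise-linear spline reduces to ordinary linear least squares over a ReLU basis. The target function |x| and the knot locations are arbitrary choices for the demo.

```python
import numpy as np

# Fixed-knot piecewise-linear spline fit in 1-D: with knots t_k fixed,
# the fit is linear least squares over the basis {1, x, (x - t_k)_+}.
x = np.linspace(-1.0, 1.0, 200)
y = np.abs(x)                      # demo target function

knots = np.linspace(-0.8, 0.8, 9)  # fixed partition of the input
B = np.column_stack([np.ones_like(x), x] +
                    [np.maximum(x - t, 0.0) for t in knots])
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
fit = B @ coef
err = np.max(np.abs(fit - y))      # tiny: a knot sits at the kink x = 0
```

With a free-knot spline, the knot locations themselves would also be optimized, which turns this convex least-squares problem into a hard nonconvex one.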

Key Insights into Deep Networks

  • Modern deep networks consist of continuous piecewise affine operators.
  • Composition of multiple layers retains the piecewise affine nature.
  • The mapping from input to output is a continuous piecewise affine spline operator.
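A small sketch makes the CPA claim concrete: for a two-layer ReLU network (hypothetical random weights), the on/off activation pattern at an input determines a region on which the network is exactly an affine map.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(1)
W1, b1 = rng.standard_normal((5, 3)), rng.standard_normal(5)
W2, b2 = rng.standard_normal((2, 5)), rng.standard_normal(2)

def net(x):
    return W2 @ relu(W1 @ x + b1) + b2

x = rng.standard_normal(3)
# Activation pattern: which ReLUs are "on" at x
D = np.diag((W1 @ x + b1 > 0).astype(float))
# Local affine parameters on the partition region containing x
A = W2 @ D @ W1
c = W2 @ D @ b1 + b2
# On that region the network IS the affine map z -> A z + c
assert np.allclose(net(x), A @ x + c)
```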

Visualization of Deep Network Behavior

  • Each layer's transformation can be visualized as a hyperplane cutting the input space.
  • Input space is divided into partitions based on these hyperplane cuts.
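A rough way to see the partition (random weights, illustration only): sample the input plane and record each point's on/off ReLU pattern; distinct patterns correspond to distinct regions carved out by the first layer's hyperplane cuts.

```python
import numpy as np

# Each first-layer unit cuts the input plane along the hyperplane
# w_k . x + b_k = 0; the cuts jointly partition the plane.
rng = np.random.default_rng(2)
W, b = rng.standard_normal((6, 2)), rng.standard_normal(6)

# Sample a grid and record each point's sign pattern (on/off per unit)
g = np.linspace(-3, 3, 400)
X, Y = np.meshgrid(g, g)
pts = np.stack([X.ravel(), Y.ravel()], axis=1)   # (160000, 2) grid points
patterns = (pts @ W.T + b > 0)                    # boolean activation codes
n_regions = len(np.unique(patterns, axis=0))      # distinct regions observed

# 6 lines can cut the plane into at most 1 + 6 + C(6,2) = 22 regions
```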

Partition Formation

  • When layers are composed, later layers' hyperplane cuts are bent (folded) along the boundaries of the earlier layers' partition, which preserves continuity of the overall mapping.
  • Resulting decision boundaries in classification tasks can exhibit varying smoothness affecting generalization.

Applications and Further Analysis

  • Local affine mapping: each output of the local affine map is an inner product of the input with a signal-dependent template, leading to a matched-filter interpretation.
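The matched-filter reading can be sketched directly: if the local affine map at an input is y = A x + c, then class k's score is the inner product of x with row a_k of A, so the prediction is the best-matching template. For a clean demo we use hypothetical orthonormal templates and zero bias.

```python
import numpy as np

# Hypothetical local affine matrix: 10 orthonormal templates in R^16
rng = np.random.default_rng(4)
Q, _ = np.linalg.qr(rng.standard_normal((16, 10)))
A = Q.T                         # rows of A act as matched-filter templates

x = A[7]                        # input equal to template 7
scores = A @ x                  # inner product of x with each template
pred = int(np.argmax(scores))   # the best-matching template wins: 7
```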

Architectural Impact on Learning

  • Different architectures (e.g., ConvNets, ResNets, DenseNets) exhibit different optimization properties.
  • ResNets are empirically observed to converge faster and more stably than ConvNets.
  • Theory indicates that skip connections give ResNets better-conditioned loss surfaces, which aids optimization.
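One intuition behind the stability claim, as a hedged sketch: a residual block only adds a correction to the identity, so with small weights its output stays near its input, whereas a plain block replaces the input outright (random weights, illustration only).

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def plain_block(x, W, b):
    # ConvNet-style block: the output replaces the input entirely
    return relu(W @ x + b)

def residual_block(x, W, b):
    # ResNet-style block: a skip connection adds the input back,
    # so the block only learns a correction to the identity
    return x + relu(W @ x + b)

rng = np.random.default_rng(5)
W, b = 0.1 * rng.standard_normal((8, 8)), np.zeros(8)
x = rng.standard_normal(8)

# With small weights the residual block stays close to the identity,
# one intuition for its better-behaved loss surface
drift_plain = np.linalg.norm(plain_block(x, W, b) - x)
drift_res = np.linalg.norm(residual_block(x, W, b) - x)
```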

Generative Networks

  • Discusses generative adversarial networks (GANs) and variational autoencoders (VAEs).
  • Introduction of spline theory provides a framework to analyze generative networks.
  • Potential for using closed-form expressions and EM algorithms for learning in generative models.

Conclusion

  • Spline theory, specifically CPA splines, offers a solid foundation for analyzing deep learning.
  • The speaker is excited about ongoing research and invites questions.