Probabilistic Modeling in Neural Networks

Jul 19, 2024

Notes from a lecture on probabilistic modeling in neural network models

Introduction

  • Discussion of key terminology and its importance in the scientific literature
  • Two main problems identified: uncertainty estimation during learning and out-of-distribution (OOD) detection
  • Goal: explain the approach to learning and the different types of uncertainty in neural network models

Types of Uncertainties

Aleatoric Uncertainty

  • Arises from inherent noise in the data
  • Cannot be reduced by collecting more data, since it stems from the data collection process itself
  • Example: classes may overlap in feature space, so even the optimal classifier makes errors (see the sketch below)
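
To make the overlap example concrete, here is a tiny worked illustration (my own, not from the lecture): two one-dimensional Gaussian classes with equal priors. Even the Bayes-optimal classifier, which thresholds at zero, errs at a fixed rate; that residual error is aleatoric uncertainty.

```python
# Illustration (not from the lecture): two overlapping 1-D Gaussian classes.
# The Bayes-optimal rule thresholds at x = 0, yet still misclassifies any
# sample that crosses the threshold -- an irreducible, aleatoric error.
from scipy.stats import norm

mu0, mu1, sigma = -1.0, 1.0, 1.0  # class-conditional means, shared std

# By symmetry, P(error) = P(x < 0 | class 1) = P(x > 0 | class 0)
bayes_error = norm.cdf(0.0, loc=mu1, scale=sigma)
print(f"Bayes error: {bayes_error:.3f}")  # ~0.159; no model can do better
```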

Epistemic Uncertainty

  • Arises from limited data and can be reduced by collecting more of it
  • Reflects our knowledge about the underlying data distribution
  • Example: the model knows little about the data distribution in regions where training samples are sparse

Handling Uncertainties in Neural Network Models

Variance Estimation in Regression

  • Classical methods: Gaussian processes, linear regression, etc.
  • Uncertainty is reported through confidence intervals around predictions (see the sketch below)
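
A minimal sketch of this idea, assuming scikit-learn's Gaussian process regressor (the data here is synthetic, chosen only for illustration):

```python
# Gaussian-process regression returns a predictive std alongside the mean,
# from which a confidence interval follows directly.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(20, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(20)

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=0.1**2)
gp.fit(X, y)

X_test = np.linspace(-5, 5, 200).reshape(-1, 1)
mean, std = gp.predict(X_test, return_std=True)
lower, upper = mean - 1.96 * std, mean + 1.96 * std  # ~95% interval
# std grows far from the training data, reflecting epistemic uncertainty
```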

Estimating Uncertainties in Classification

  • Main measures: the maximum class probability and the entropy of the predictive distribution
  • Problem: models are often poorly calibrated, so special calibration techniques such as temperature scaling are needed (see the sketch below)
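
Both measures are one-liners given the logits; temperature scaling is sketched as well. This is my own minimal example, not code from the lecture:

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, 0.5, -1.0]])       # one sample, 3 classes
probs = F.softmax(logits, dim=-1)

max_prob = probs.max(dim=-1).values             # high => confident
entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)  # high => uncertain

# Temperature scaling: a post-hoc calibration fix. T is fit on a
# held-out validation set; T > 1 softens overconfident predictions.
T = 1.5
calibrated = F.softmax(logits / T, dim=-1)
```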

Methods to Improve Uncertainty Estimation

Ensemble Models

  • Train multiple models with different random initializations
  • The spread (disagreement) among their predictions serves as the uncertainty estimate (see the sketch below)
  • Advantage: high-quality uncertainty estimates
  • Disadvantage: high computational cost
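
A minimal deep-ensemble sketch (the model list and training loop are hypothetical placeholders):

```python
import torch

def ensemble_predict(models, x):
    """Average softmax outputs over ensemble members; the variance across
    members is the spread used as an uncertainty estimate."""
    with torch.no_grad():
        probs = torch.stack([m(x).softmax(dim=-1) for m in models])  # (M, B, C)
    return probs.mean(dim=0), probs.var(dim=0)

# Hypothetical usage: each member shares the architecture but starts
# from its own random initialization.
# models = [train(MyNet()) for _ in range(5)]
# mean, spread = ensemble_predict(models, x_batch)
```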

Monte Carlo Dropout

  • Keep dropout active at the prediction stage and draw multiple stochastic forward passes
  • Estimate uncertainty from the variance of these predictions (see the sketch below)
  • Simple to implement, but often inferior to ensembles in quality
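
A sketch of the procedure, assuming a PyTorch model that contains nn.Dropout layers:

```python
import torch

def mc_dropout_predict(model, x, n_samples=30):
    # model.train() keeps dropout sampling at inference time; note it also
    # switches BatchNorm to training mode, so in practice one often enables
    # only the dropout modules.
    model.train()
    with torch.no_grad():
        probs = torch.stack([model(x).softmax(dim=-1) for _ in range(n_samples)])
    model.eval()
    return probs.mean(dim=0), probs.var(dim=0)  # prediction, uncertainty
```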

Methods for Handling Out-of-Distribution (OOD) Data

Mixture Models and Dirichlet Distribution

  • The neural network predicts the parameters of a Dirichlet distribution
  • The loss function minimizes the Kullback-Leibler divergence between the predicted and the empirical distribution
  • Additional (auxiliary) samples are used to represent OOD data during training (see the sketch below)
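
A sketch in the spirit of Dirichlet-based networks; this is an illustration of the general idea, not the lecturer's exact formulation:

```python
import torch.nn as nn
import torch.nn.functional as F

class DirichletHead(nn.Module):
    """Maps features to Dirichlet concentration parameters alpha > 0."""
    def __init__(self, in_dim, n_classes):
        super().__init__()
        self.fc = nn.Linear(in_dim, n_classes)

    def forward(self, features):
        alpha = F.softplus(self.fc(features)) + 1.0  # positive concentrations
        alpha0 = alpha.sum(dim=-1, keepdim=True)     # total evidence
        probs = alpha / alpha0                       # expected class probabilities
        return alpha, probs

# Training minimizes a KL divergence between the predicted Dirichlet and a
# target one: sharp on in-distribution labels, flat on auxiliary OOD samples,
# so low total evidence (small alpha0) flags OOD inputs.
```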

The Speaker's Own Method for Uncertainty Estimation

Main Concept

  • Estimate the Bayes risk and the deviation from it (excess risk) via kernel density estimation
  • The marginal data density is taken into account (an illustration follows below)
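
As a loose illustration of the density component only (this is not the speaker's actual implementation): kernel density estimation over feature embeddings yields a marginal-density score, and low density marks regions where risk estimates become unreliable.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

train_feats = np.random.randn(1000, 16)      # stand-in for network embeddings
kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(train_feats)

test_feats = np.random.randn(5, 16)
log_density = kde.score_samples(test_feats)  # low => likely OOD, high uncertainty
```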

Practical Implementation

  • Spectrally normalized neural networks are used for training stability and to reduce overfitting (see the sketch below)
  • Uncertainty is defined through a combination of the Bayes risk and the expected variance
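
Spectral normalization itself is a standard PyTorch utility; the tiny architecture below is only a hypothetical example of wrapping layers with it:

```python
import torch.nn as nn
from torch.nn.utils import spectral_norm

# Each wrapped layer has its spectral norm constrained at every forward
# pass, bounding the network's Lipschitz constant and stabilizing training.
model = nn.Sequential(
    spectral_norm(nn.Linear(32, 64)),
    nn.ReLU(),
    spectral_norm(nn.Linear(64, 10)),
)
```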

Experiments and Results

  • Testing on CIFAR-10, MNIST, and ImageNet datasets
  • Excellent results compared to classical methods, especially on OOD data

Conclusions

  • Estimating uncertainty in neural networks is important and requires a comprehensive approach
  • Combining probabilistic methods with modern neural network techniques achieves strong results