Probabilistic Modeling in Neural Networks
Jul 19, 2024
Lecture on Probabilistic Modeling in Neural Network Models
Introduction
Overview of the key terms and their importance in the research literature
Two main problems identified: uncertainty estimation during learning and out-of-distribution (OOD) detection
Goal: explain the approach to learning and the different types of uncertainty in neural network models
Types of Uncertainties
Aleatoric Uncertainty
Arises from inherent noise in the data
Cannot be reduced by collecting more data, since it stems from the data collection process itself
Example: classes may overlap in the data space, so even the optimal classifier remains uncertain there
Epistemic Uncertainty
Arises due to limited data and can be reduced by increasing data volume
Reflects our limited knowledge of the data distribution
Example: the model knows little about the data distribution in regions where samples are sparse
Handling Uncertainties in Neural Network Models
Variance Estimation in Regression
Known methods: Gaussian processes, linear regression, etc.
Estimation of uncertainties through confidence intervals
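As a concrete illustration (a standard heteroscedastic-regression sketch, not a method from the lecture; the class name, layer sizes, and helper function are illustrative), a network can predict both a mean and a log-variance and be trained with the Gaussian negative log-likelihood:

```python
import torch
import torch.nn as nn

class MeanVarianceNet(nn.Module):
    """Predicts the mean and log-variance of a Gaussian p(y | x)."""
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.mean_head = nn.Linear(hidden, 1)
        self.logvar_head = nn.Linear(hidden, 1)  # log-variance for numerical stability

    def forward(self, x):
        h = self.body(x)
        return self.mean_head(h), self.logvar_head(h)

def gaussian_nll(mean, logvar, y):
    # Negative log-likelihood of y under N(mean, exp(logvar)), up to a constant
    return 0.5 * (logvar + (y - mean) ** 2 / logvar.exp()).mean()
```

A 95% confidence interval for a prediction then follows as mean ± 1.96 · exp(logvar / 2).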
Estimating Uncertainties in Classification
Main measures: the maximum class probability and the entropy of the predictive distribution
Problem: models are often poorly calibrated, so special calibration techniques are needed
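Both measures, together with temperature scaling as one widely used calibration technique, fit in a few lines (a sketch; the temperature value below is a placeholder and would normally be fitted on held-out validation data):

```python
import torch
import torch.nn.functional as F

def uncertainty_measures(logits, temperature=1.0):
    """Max probability and entropy of the (optionally temperature-scaled) softmax."""
    probs = F.softmax(logits / temperature, dim=-1)
    max_prob = probs.max(dim=-1).values                         # high -> confident
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1)   # high -> uncertain
    return max_prob, entropy

logits = torch.randn(4, 10)  # e.g. a batch of 4 examples, 10 classes
conf, ent = uncertainty_measures(logits, temperature=1.5)
```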
Methods to Improve Uncertainty Estimation
Ensemble Models
Training multiple models with different initializations
The models produce different predictions, and the spread across them serves as an uncertainty estimate (see the sketch after this list)
Advantage: high quality of uncertainty estimation
Disadvantage: high computational complexity
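A minimal sketch of ensemble-based uncertainty (the models and inputs are placeholders; the entropy-based decomposition is one common choice, not necessarily the one used in the lecture):

```python
import torch

def ensemble_predict(models, x):
    """Average softmax outputs over ensemble members; their spread is the uncertainty."""
    with torch.no_grad():
        probs = torch.stack([m(x).softmax(dim=-1) for m in models])  # (M, B, C)
    mean_probs = probs.mean(dim=0)  # ensemble prediction
    # One common decomposition: epistemic part = total entropy - expected entropy
    total_ent = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(-1)
    expected_ent = -(probs * probs.clamp_min(1e-12).log()).sum(-1).mean(0)
    epistemic = total_ent - expected_ent  # disagreement between members
    return mean_probs, epistemic
```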
Monte Carlo Dropout
Keep dropout active at prediction time and collect multiple stochastic predictions
Uncertainty is estimated through the variance of these predictions (sketched below)
Simple to implement, but often inferior to ensembles in quality
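A sketch of Monte Carlo dropout at inference time (assumes a PyTorch model containing nn.Dropout layers):

```python
import torch
import torch.nn as nn

def mc_dropout_predict(model, x, n_samples=30):
    """Monte Carlo dropout: keep dropout active at inference, average many passes."""
    model.eval()
    for m in model.modules():          # re-enable only the dropout layers,
        if isinstance(m, nn.Dropout):  # leaving e.g. batch-norm in eval mode
            m.train()
    with torch.no_grad():
        probs = torch.stack([model(x).softmax(dim=-1) for _ in range(n_samples)])
    return probs.mean(dim=0), probs.var(dim=0).sum(dim=-1)  # prediction, spread
```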
Methods for Handling Out-of-Distribution (OOD) Data
Mixture Models and Dirichlet Distribution
Neural network predicts parameters of the Dirichlet distribution
Loss function minimizes the Kullback-Leibler divergence between the predicted and target distributions
Additional auxiliary samples are used to represent OOD data during training (sketched below)
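A sketch in the spirit of such Dirichlet-based losses; the exp parameterization, the target construction, and beta are illustrative assumptions, since the notes do not specify them:

```python
import torch
import torch.nn.functional as F
import torch.distributions as D

def dirichlet_loss(logits, y, is_ood, beta=10.0):
    """The network outputs Dirichlet concentration parameters over class probabilities."""
    alpha = logits.exp() + 1.0  # predicted concentrations (kept positive)
    n_classes = logits.shape[-1]
    # Target: sharp Dirichlet around the label in-distribution, flat Dirichlet for OOD
    target = torch.ones_like(alpha)
    target[~is_ood] += beta * F.one_hot(y[~is_ood], n_classes).float()
    return D.kl_divergence(D.Dirichlet(alpha), D.Dirichlet(target)).mean()
```

The flat (all-ones) target pushes the model toward maximal uncertainty on the auxiliary OOD samples.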
Own Method for Uncertainty Estimation
Main Concept
Estimating the Bayes risk and the deviation from it (excess risk) via kernel density estimation
Takes the marginal density of the data into account (sketched below)
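A minimal way to estimate the marginal density with a KDE (scikit-learn here; the features, bandwidth, and kernel are illustrative, since the lecture's exact setup is not given in the notes):

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# Fit a KDE on training embeddings (random placeholders stand in for real features)
train_feats = np.random.randn(1000, 32)
kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(train_feats)

# Low log-density at test time flags a sparsely covered (possibly OOD) region
test_feats = np.random.randn(5, 32)
log_density = kde.score_samples(test_feats)
```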
Practical Implementation
Builds spectrally normalized neural networks for training stability and to reduce overfitting
Defines uncertainty as a combination of the Bayes risk and the expected variance
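Spectral normalization itself is available out of the box in PyTorch; a minimal sketch with a placeholder architecture:

```python
import torch.nn as nn
from torch.nn.utils import spectral_norm

# Spectral normalization caps each layer's largest singular value (its Lipschitz
# constant), which stabilizes training and keeps feature-space distances meaningful
model = nn.Sequential(
    spectral_norm(nn.Linear(32, 64)),
    nn.ReLU(),
    spectral_norm(nn.Linear(64, 10)),
)
```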
Experiments and Results
Testing on CIFAR-10, MNIST, and ImageNet datasets
Excellent results compared to classical methods, especially on OOD data
Conclusions
Estimation of uncertainties in neural networks is important and requires a comprehensive approach
Combining probabilistic methods with modern neural network techniques yields strong results