Probabilistic Modeling in Neural Networks
Jul 19, 2024
Lecture on Probabilistic Modeling in Neural Network Models
Introduction
Overview of the key terms and their importance in the research literature
Two main problems identified: uncertainty estimation during learning and out-of-distribution (OOD) detection
Goal: explain the approach to learning and the different types of uncertainty in neural network models
Types of Uncertainties
Aleatoric Uncertainty
Arises from inherent noise in the data
Cannot be reduced by collecting more data, since it stems from the data collection process itself
Example: classes may overlap in the data space, so even the optimal classifier remains uncertain there
Epistemic Uncertainty
Arises due to limited data and can be reduced by increasing data volume
Reflects our limited knowledge of the data distribution
Example: the model knows little about the data distribution in regions where samples are sparse
Handling Uncertainties in Neural Network Models
Variance Estimation in Regression
Known methods: Gaussian processes, linear regression, etc.
Estimation of uncertainties through confidence intervals
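As a concrete illustration (a standard heteroscedastic-regression sketch, not a method from the lecture; the class name, layer sizes, and helper function are illustrative), a network can predict both a mean and a log-variance and be trained with the Gaussian negative log-likelihood:

```python
import torch
import torch.nn as nn

class MeanVarianceNet(nn.Module):
    """Predicts the mean and log-variance of a Gaussian p(y | x)."""
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.mean_head = nn.Linear(hidden, 1)
        self.logvar_head = nn.Linear(hidden, 1)  # log-variance for numerical stability

    def forward(self, x):
        h = self.body(x)
        return self.mean_head(h), self.logvar_head(h)

def gaussian_nll(mean, logvar, y):
    # Negative log-likelihood of y under N(mean, exp(logvar)), up to a constant
    return 0.5 * (logvar + (y - mean) ** 2 / logvar.exp()).mean()
```

A 95% confidence interval for a prediction then follows as mean ± 1.96 · exp(logvar / 2).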
Estimating Uncertainties in Classification
Main measures: the maximum class probability and the entropy of the predictive distribution
Problem: models are often poorly calibrated, so special calibration techniques are needed
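Both measures, together with temperature scaling as one widely used calibration technique, fit in a few lines (a sketch; the temperature value below is a placeholder and would normally be fitted on held-out validation data):

```python
import torch
import torch.nn.functional as F

def uncertainty_measures(logits, temperature=1.0):
    """Max probability and entropy of the (optionally temperature-scaled) softmax."""
    probs = F.softmax(logits / temperature, dim=-1)
    max_prob = probs.max(dim=-1).values                         # high -> confident
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1)   # high -> uncertain
    return max_prob, entropy

logits = torch.randn(4, 10)  # e.g. a batch of 4 examples, 10 classes
conf, ent = uncertainty_measures(logits, temperature=1.5)
```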
Methods to Improve Uncertainty Estimation
Ensemble Models
Training multiple models with different initializations
The models produce different predictions, and the spread across them serves as an uncertainty estimate (see the sketch after this list)
Advantage: high quality of uncertainty estimation
Disadvantage: high computational complexity
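A minimal sketch of ensemble-based uncertainty (the models and inputs are placeholders; the entropy-based decomposition is one common choice, not necessarily the one used in the lecture):

```python
import torch

def ensemble_predict(models, x):
    """Average softmax outputs over ensemble members; their spread is the uncertainty."""
    with torch.no_grad():
        probs = torch.stack([m(x).softmax(dim=-1) for m in models])  # (M, B, C)
    mean_probs = probs.mean(dim=0)  # ensemble prediction
    # One common decomposition: epistemic part = total entropy - expected entropy
    total_ent = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(-1)
    expected_ent = -(probs * probs.clamp_min(1e-12).log()).sum(-1).mean(0)
    epistemic = total_ent - expected_ent  # disagreement between members
    return mean_probs, epistemic
```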
Monte Carlo Dropout
Keep dropout active at prediction time and collect multiple stochastic predictions
Uncertainty is estimated through the variance of these predictions (sketched below)
Simple to implement, but often inferior to ensembles in quality
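A sketch of Monte Carlo dropout at inference time (assumes a PyTorch model containing nn.Dropout layers):

```python
import torch
import torch.nn as nn

def mc_dropout_predict(model, x, n_samples=30):
    """Monte Carlo dropout: keep dropout active at inference, average many passes."""
    model.eval()
    for m in model.modules():          # re-enable only the dropout layers,
        if isinstance(m, nn.Dropout):  # leaving e.g. batch-norm in eval mode
            m.train()
    with torch.no_grad():
        probs = torch.stack([model(x).softmax(dim=-1) for _ in range(n_samples)])
    return probs.mean(dim=0), probs.var(dim=0).sum(dim=-1)  # prediction, spread
```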
Methods for Handling Out-of-Distribution (OOD) Data
Mixture Models and Dirichlet Distribution
Neural network predicts parameters of the Dirichlet distribution
Loss function minimizes the Kullback-Leibler divergence between the predicted and target distributions
Additional auxiliary samples are used to represent OOD data during training (sketched below)
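A sketch in the spirit of such Dirichlet-based losses; the exp parameterization, the target construction, and beta are illustrative assumptions, since the notes do not specify them:

```python
import torch
import torch.nn.functional as F
import torch.distributions as D

def dirichlet_loss(logits, y, is_ood, beta=10.0):
    """The network outputs Dirichlet concentration parameters over class probabilities."""
    alpha = logits.exp() + 1.0  # predicted concentrations (kept positive)
    n_classes = logits.shape[-1]
    # Target: sharp Dirichlet around the label in-distribution, flat Dirichlet for OOD
    target = torch.ones_like(alpha)
    target[~is_ood] += beta * F.one_hot(y[~is_ood], n_classes).float()
    return D.kl_divergence(D.Dirichlet(alpha), D.Dirichlet(target)).mean()
```

The flat (all-ones) target pushes the model toward maximal uncertainty on the auxiliary OOD samples.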
Own Method for Uncertainty Estimation
Main Concept
Estimating the Bayes risk and the deviation from it (excess risk) via kernel density estimation
Takes the marginal density of the data into account (sketched below)
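A minimal way to estimate the marginal density with a KDE (scikit-learn here; the features, bandwidth, and kernel are illustrative, since the lecture's exact setup is not given in the notes):

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# Fit a KDE on training embeddings (random placeholders stand in for real features)
train_feats = np.random.randn(1000, 32)
kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(train_feats)

# Low log-density at test time flags a sparsely covered (possibly OOD) region
test_feats = np.random.randn(5, 32)
log_density = kde.score_samples(test_feats)
```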
Practical Implementation
Builds spectrally normalized neural networks for training stability and to reduce overfitting
Defines uncertainty as a combination of the Bayes risk and the expected variance
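Spectral normalization itself is available out of the box in PyTorch; a minimal sketch with a placeholder architecture:

```python
import torch.nn as nn
from torch.nn.utils import spectral_norm

# Spectral normalization caps each layer's largest singular value (its Lipschitz
# constant), which stabilizes training and keeps feature-space distances meaningful
model = nn.Sequential(
    spectral_norm(nn.Linear(32, 64)),
    nn.ReLU(),
    spectral_norm(nn.Linear(64, 10)),
)
```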
Experiments and Results
Testing on CIFAR-10, MNIST, and ImageNet datasets
Excellent results compared to classical methods, especially on OOD data
Conclusions
Estimation of uncertainties in neural networks is important and requires a comprehensive approach
Combining probabilistic methods with modern neural network techniques yields strong results