Lecture on Machine Learning in Image Compression

Jun 27, 2024

Introduction

  • Excitement around machine learning (ML) is justified.
  • ML has revolutionized computer vision and is making inroads in image compression.
  • Examples include learned intra prediction methods, in-loop filters, and upscaling filters.
  • Focus: Methods learned end-to-end for image compression.
  • End-to-end learned methods discussed in a special session (morning).
  • Potentially high complexity, but promising results.

Traditional Methods and Their Limitations

  • Example: Compression artifacts at low bitrates.
  • HEVC Intra: Visible block partitioning (staircasing), ringing around edges, structural distortions in objects, and flattened textures.
  • These artifacts arise from independently optimized coding tools being manually joined.

End-to-End Trained Compression Methods

  • No staircasing, ringing, or unnatural distortions.
  • Automatically optimize rate allocation.
  • Produce smoother, more natural-looking artifacts.
  • Examples demonstrate qualitative improvements over traditional methods.

Key Concepts in End-to-End Trained Methods

  1. Nonlinear Transform Coding: A Fundamental Approach
    • Replace linear transforms with neural networks (NNs).
    • Uniform quantizer and simple arithmetic coder assumed.
  2. Optimization Using ML Tools
    • Loss function: rate + distortion trade-off.
    • Stochastic gradient descent for parameter optimization.
  3. Training Techniques
    • During training: replace quantization with additive uniform noise so the loss stays differentiable (see the sketch after this list).
    • Adjust the probability distributions used for entropy coding accordingly.
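
A minimal sketch of this recipe in PyTorch, for illustration only: nonlinear analysis and synthesis transforms, quantization relaxed to additive uniform noise during training, and a rate plus lambda-weighted distortion loss minimized with stochastic gradient descent. The network sizes, the simple Gaussian prior standing in for a learned entropy model, and the trade-off weight are assumptions, not the lecture's actual setup.

```python
import torch
import torch.nn as nn

# Illustrative analysis (g_a) and synthesis (g_s) transforms for a toy 2-D source.
g_a = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))
g_s = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))
log_scale = nn.Parameter(torch.zeros(2))         # crude factorized Gaussian prior
opt = torch.optim.Adam([*g_a.parameters(), *g_s.parameters(), log_scale], lr=1e-3)
lam = 0.1                                        # rate-distortion trade-off weight

for step in range(1000):
    x = torch.randn(256, 2)                      # stand-in training batch
    y = g_a(x)                                   # latent representation
    y_tilde = y + torch.rand_like(y) - 0.5       # uniform noise in place of quantization
    x_hat = g_s(y_tilde)                         # reconstruction

    # Rate proxy: negative log-likelihood of the noisy latents under the prior
    # (real codecs learn a more flexible entropy model here).
    rate = (0.5 * (y_tilde / log_scale.exp()) ** 2 + log_scale).mean()
    distortion = ((x - x_hat) ** 2).mean()       # MSE distortion
    loss = rate + lam * distortion               # R + lambda * D trade-off

    opt.zero_grad()
    loss.backward()
    opt.step()
```

At test time, rounding replaces the noise and the learned probability model drives the arithmetic coder.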

Models and Real-World Applications

  • Scalar Compression Example: Demonstrates learning a nonlinear transform and an optimal representation for a one-dimensional source.
  • 2D Case Study (Banana Distribution): Constrained vs. non-constrained transforms show improved performance and flexibility.
  • Image Compression: Larger convolutional NNs used; three-layer transforms in the experiments (a sketch follows this list).
  • Comparison Between Methods: JPEG, JPEG 2000, and nonlinear transform coder highlight improvements in naturalness and detail retention.
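
As a rough idea of the three-layer convolutional transforms referred to above, the following hypothetical PyTorch sketch pairs a strided-convolution analysis transform with a transposed-convolution synthesis transform. Channel counts, kernel sizes, strides, and the use of ReLU (published models of this kind typically use GDN nonlinearities) are assumptions for illustration.

```python
import torch.nn as nn

# Analysis transform: image -> latent y, downsampling by 2 at each layer.
analysis = nn.Sequential(
    nn.Conv2d(3, 128, 5, stride=2, padding=2), nn.ReLU(),
    nn.Conv2d(128, 128, 5, stride=2, padding=2), nn.ReLU(),
    nn.Conv2d(128, 192, 5, stride=2, padding=2),
)

# Synthesis transform: (quantized) latent y -> reconstructed image.
synthesis = nn.Sequential(
    nn.ConvTranspose2d(192, 128, 5, stride=2, padding=2, output_padding=1), nn.ReLU(),
    nn.ConvTranspose2d(128, 128, 5, stride=2, padding=2, output_padding=1), nn.ReLU(),
    nn.ConvTranspose2d(128, 3, 5, stride=2, padding=2, output_padding=1),
)
```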

Representation Learning and Autoencoders

  • Definition: Representations from which useful information is easier to extract for classifiers or other predictors.
  • Autoencoders: Nonlinear transform + dimensionality reduction, without entropy coding (minimal sketch after this list).
  • Variational Autoencoders: Combine autoencoders with variational Bayesian inference for generative modeling.
  • Comparisons: ICA given as an example (a special case of variational autoencoders).
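
For contrast with the compression models above, a minimal autoencoder sketch: a nonlinear transform into a low-dimensional bottleneck trained purely for reconstruction, with no quantization or entropy coding. The dimensions (flattened 28x28 images mapped to a 32-dimensional code) are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Encoder/decoder pair with a 32-dimensional bottleneck.
encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32))
decoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784))
opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()], lr=1e-3)

x = torch.rand(64, 784)                          # stand-in batch of flattened images
loss = ((decoder(encoder(x)) - x) ** 2).mean()   # reconstruction error only, no rate term
opt.zero_grad()
loss.backward()
opt.step()
```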

Benchmarking and Performance

  • Metrics Evaluated: PSNR, MS-SSIM, and mean squared error (MSE); a PSNR computation is sketched after this list.
  • Comparison with HEVC Intra: Significant improvements using hyperprior models and direct metric optimization.
  • Subjective Tests: Demonstrated practical visual improvements.
  • Challenges: Computational cost, the need for more optimized hardware, and faster execution.
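
For reference, PSNR is computed directly from the mean squared error. A small sketch for 8-bit images follows; the reference and degraded arrays here are placeholders.

```python
import numpy as np

def psnr(reference, test, max_val=255.0):
    # Mean squared error between the two images, computed in float64.
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    # PSNR in dB: 10 * log10(peak^2 / MSE).
    return 10.0 * np.log10(max_val ** 2 / mse)

ref = np.random.randint(0, 256, (64, 64), dtype=np.uint8)          # placeholder image
deg = np.clip(ref + np.random.normal(0, 5, ref.shape), 0, 255).astype(np.uint8)
print(psnr(ref, deg))
```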

Practical Considerations

  • GANs: Used for generating enhanced images but challenging to stabilize during training.
  • Practical Coding Efforts: TensorFlow Compression Library by Google released for public use.
  • Image Retrieval Application: Image classification via compressed representations.
  • Remaining Challenges and Research Needs: Reducing model storage costs, improving scalability, extending to other domains (audio, video), and developing better evaluation methods.

Conclusion

  • Exciting potential for further exploration in ML-based compression.
  • Necessity for integrating better quality metrics and practical hardware implementations.
  • Current acceleration techniques show promise, but more work is needed before they are practically useful.