Lecture on Machine Learning in Image Compression

Jun 27, 2024

Introduction

  • Excitement around machine learning (ML) is justified.
  • ML has revolutionized computer vision and is making inroads in image compression.
  • Examples include learned intra prediction methods, in-loop filters, and upscaling filters.
  • Focus: Methods learned end-to-end for image compression.
  • End-to-end learned methods discussed in a special session (morning).
  • Potentially high complexity, but promising results.

Traditional Methods and Their Limitations

  • Example: Compression artifacts at low bitrates.
  • HEVC Intra: Visible block partitioning (staircasing), ringing around edges, structural distortions in objects, and flattened textures.
  • These artifacts arise from independently optimized coding tools being manually joined.

End-to-End Trained Compression Methods

  • No staircasing, ringing, or unnatural distortions.
  • Automatically optimize rate allocation.
  • Produce smoother, more natural-looking artifacts.
  • Examples demonstrate qualitative improvements over traditional methods.

Key Concepts in End-to-End Trained Methods

  1. Nonlinear Transform Coding: A Fundamental Approach
    • Replace linear transforms with neural networks (NNs).
    • Uniform quantizer and simple arithmetic coder assumed.
  2. Optimization Using ML Tools
    • Loss function: rate + distortion trade-off.
    • Stochastic gradient descent for parameter optimization.
  3. Training Techniques
    • During training: replace quantization with additive uniform noise so the loss stays differentiable (see the sketch after this list).
    • Adjust the probability distributions used for entropy coding accordingly.
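
A minimal sketch of this recipe in PyTorch, for illustration only: nonlinear analysis and synthesis transforms, quantization relaxed to additive uniform noise during training, and a rate plus lambda-weighted distortion loss minimized with stochastic gradient descent. The network sizes, the simple Gaussian prior standing in for a learned entropy model, and the trade-off weight are assumptions, not the lecture's actual setup.

```python
import torch
import torch.nn as nn

# Illustrative analysis (g_a) and synthesis (g_s) transforms for a toy 2-D source.
g_a = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))
g_s = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))
log_scale = nn.Parameter(torch.zeros(2))         # crude factorized Gaussian prior
opt = torch.optim.Adam([*g_a.parameters(), *g_s.parameters(), log_scale], lr=1e-3)
lam = 0.1                                        # rate-distortion trade-off weight

for step in range(1000):
    x = torch.randn(256, 2)                      # stand-in training batch
    y = g_a(x)                                   # latent representation
    y_tilde = y + torch.rand_like(y) - 0.5       # uniform noise in place of quantization
    x_hat = g_s(y_tilde)                         # reconstruction

    # Rate proxy: negative log-likelihood of the noisy latents under the prior
    # (real codecs learn a more flexible entropy model here).
    rate = (0.5 * (y_tilde / log_scale.exp()) ** 2 + log_scale).mean()
    distortion = ((x - x_hat) ** 2).mean()       # MSE distortion
    loss = rate + lam * distortion               # R + lambda * D trade-off

    opt.zero_grad()
    loss.backward()
    opt.step()
```

At test time, rounding replaces the noise and the learned probability model drives the arithmetic coder.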

Models and Real-World Applications

  • Scalar Compression Example: Demonstrates learning a nonlinear transform and an optimal representation for a one-dimensional source.
  • 2D Case Study (Banana Distribution): Constrained vs. non-constrained transforms show improved performance and flexibility.
  • Image Compression: Larger convolutional NNs used; three-layer transforms in the experiments (a sketch follows this list).
  • Comparison Between Methods: JPEG, JPEG 2000, and nonlinear transform coder highlight improvements in naturalness and detail retention.
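
As a rough idea of the three-layer convolutional transforms referred to above, the following hypothetical PyTorch sketch pairs a strided-convolution analysis transform with a transposed-convolution synthesis transform. Channel counts, kernel sizes, strides, and the use of ReLU (published models of this kind typically use GDN nonlinearities) are assumptions for illustration.

```python
import torch.nn as nn

# Analysis transform: image -> latent y, downsampling by 2 at each layer.
analysis = nn.Sequential(
    nn.Conv2d(3, 128, 5, stride=2, padding=2), nn.ReLU(),
    nn.Conv2d(128, 128, 5, stride=2, padding=2), nn.ReLU(),
    nn.Conv2d(128, 192, 5, stride=2, padding=2),
)

# Synthesis transform: (quantized) latent y -> reconstructed image.
synthesis = nn.Sequential(
    nn.ConvTranspose2d(192, 128, 5, stride=2, padding=2, output_padding=1), nn.ReLU(),
    nn.ConvTranspose2d(128, 128, 5, stride=2, padding=2, output_padding=1), nn.ReLU(),
    nn.ConvTranspose2d(128, 3, 5, stride=2, padding=2, output_padding=1),
)
```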

Representation Learning and Autoencoders

  • Definition: Representations from which useful information is easier to extract for classifiers or other predictors.
  • Autoencoders: Nonlinear transform + dimensionality reduction, without entropy coding (minimal sketch after this list).
  • Variational Autoencoders: Combine autoencoders with variational Bayesian inference for generative modeling.
  • Comparisons: ICA given as an example (a special case of variational autoencoders).
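
For contrast with the compression models above, a minimal autoencoder sketch: a nonlinear transform into a low-dimensional bottleneck trained purely for reconstruction, with no quantization or entropy coding. The dimensions (flattened 28x28 images mapped to a 32-dimensional code) are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Encoder/decoder pair with a 32-dimensional bottleneck.
encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32))
decoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784))
opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()], lr=1e-3)

x = torch.rand(64, 784)                          # stand-in batch of flattened images
loss = ((decoder(encoder(x)) - x) ** 2).mean()   # reconstruction error only, no rate term
opt.zero_grad()
loss.backward()
opt.step()
```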

Benchmarking and Performance

  • Metrics Evaluated: PSNR, MS-SSIM, and mean squared error (MSE); a PSNR computation is sketched after this list.
  • Comparison with HEVC Intra: Significant improvements using hyperprior models and direct metric optimization.
  • Subjective Tests: Demonstrated practical visual improvements.
  • Challenges: Computational cost, the need for more optimized hardware, and faster execution.
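
For reference, PSNR is computed directly from the mean squared error. A small sketch for 8-bit images follows; the reference and degraded arrays here are placeholders.

```python
import numpy as np

def psnr(reference, test, max_val=255.0):
    # Mean squared error between the two images, computed in float64.
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    # PSNR in dB: 10 * log10(peak^2 / MSE).
    return 10.0 * np.log10(max_val ** 2 / mse)

ref = np.random.randint(0, 256, (64, 64), dtype=np.uint8)          # placeholder image
deg = np.clip(ref + np.random.normal(0, 5, ref.shape), 0, 255).astype(np.uint8)
print(psnr(ref, deg))
```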

Practical Considerations

  • GANs: Used for generating enhanced images but challenging to stabilize during training.
  • Practical Coding Efforts: TensorFlow Compression Library by Google released for public use.
  • Image Retrieval Application: Image classification via compressed representations.
  • Remaining Challenges and Research Needs: Reducing model storage costs, improving scalability, extending to other domains (audio, video), and developing better evaluation methods.

Conclusion

  • Exciting potential for further exploration in ML-based compression.
  • Necessity for integrating better quality metrics and practical hardware implementations.
  • Current acceleration techniques show promise, but more work is needed before they are practically useful.