CSE - IIT Kanpur

CS 698D: Topics in Data Compression

Course Contents

This course covers the introductory basics and some recent topics in the area of lossless data compression. We will cover the following topics. First, we introduce the notion of a lossless compressor. We will cover the fundamental techniques like Shannon-Fano coding, Huffman coding and Arithmetic codes. We will cover Kolmogorov complexity as the "limit notion" of a lossless code, where the compressor can be any arbitrary program.

Special emphasis will be given to the class of "universal codes", including the Lempel-Ziv algorithm and the Burrows-Wheeler transform. We then study the optimality and rate of convergence results of the Lempel-Ziv algorithm, in the weak sense, and in the strong sense.

We then cover some current topics in the topic of universal codes.
Specifically,the class of universal codes has been shown to be "non-rob ust" with respect to small perturbations. We will study the tools used to establish these results, and investigate some open problems in this area.

Finally,we finish with some applications of the Lempel-Ziv a lgorithm to areas like Statistical inference, and algorithms.
Mathematical techniques used in these results include recurrence theorems from ergodic theory, and "Rokhlin-Kakutani tower constructions" from Ergodic theory (also known as "cutting and stacking".

If time permits, we will consider the recent class of optimal compressors called "Asymmetric Numeral Systems", which were introduced in 2014.

Books and References

"Elements of Information Theory" by Cover and Thomas [5]
"Information Theory" by Csizar and Korner [6]
"Text Compression" by Bell, Cleary and W itten[7]

CS 698D: Topics in Data Compression

CS 698D: Topics in Data Compression

Course Contents

Books and References

People

Resources

Programs

Admissions

Department

Research