MLIC
[ACMMM 2023 / NCW ICML 2023] Multi-Reference Entropy Models for Learned Image Compression
MLIC Series [ACMMM 2023 / NCW ICML 2023]
This repo contains the official implementation of MLIC <sup> ++ </sup>.
We highlight MLIC <sup> ++ </sup>, which solves the quadratic complexity of global context capturing!
MLIC: Multi-Reference Entropy Model for Learned Image Compression [Arxiv] [ACMDL] has been accepted at ACMMM 2023!
MLIC <sup> ++ </sup>: Linear Complexity Multi-Reference Entropy Modeling for Learned Image Compression [Arxiv] [OpenReview] has been accepted at the ICML 2023 Neural Compression Workshop!
- Compared with the version presented at the ICML 2023 Neural Compression Workshop (available on OpenReview), the latest arXiv version adds the details of our prior work presented at ACMMM 2023, new comparisons on complexity, and more ablation studies [Arxiv].
Of the available paper versions, we recommend the latest arXiv version.
We also release the MLIC-Train-100K dataset on HuggingFace.
<a href="https://star-history.com/#JiangWeibeta/MLIC&Date"> <img src="https://api.star-history.com/svg?repos=JiangWeibeta/MLIC&type=Date" width="70%" alt="Star History Chart"> </a>

Architectures

Performance
Benchmark
| | Kodak | Tecnick | CLIC Pro Valid |
|:--------:|:--------:|:------:|:--------:|
| VTM-17.0 Intra | 0.00 | 0.00 | 0.00 |
| STF (CVPR'22) | -2.48 | -2.75 | +0.42 |
| WACNN (CVPR'22) | -2.95 | - | +0.04 |
| ELIC (CVPR'22) | -5.95 | - | - |
| LIC-TCM Large (CVPR'23) | -10.14 | -11.47 | -8.04 |
| MLIC (ACMMM'23) | -8.05 | -12.73 | -8.79 |
| MLIC+ (ACMMM'23) | -11.39 | -16.38 | -12.56 |
| MLIC++ (NCW ICML'23) | -13.39 | -17.59 | -13.08 |
Pretrained Models
Update 2024-04-08
I have uploaded the training log for lambda = 0.0250. That model was trained on 4 GPUs with DDP support.
I fixed LatentResidualPrediction and SynthesisTransform. With the updated checkpoint, use LatentResidualPrediction and SynthesisTransform instead of LatentResidualPredictionOld and SynthesisTransformOld. The parameter count of MLIC <sup> ++ </sup> becomes 83.5M. The modification causes no performance drop.
Update checkpoint: https://disk.pku.edu.cn/link/AABED8912D2502477EB37C18FC7F2B2612
code: ujrv
Google Drive: https://drive.google.com/file/d/1FWPezuHLTQhDmEhShViI3XOSXA5u_Bya/view?usp=sharing
Old Weights (2023-09)
To use the old weights, you should use LatentResidualPredictionOld and SynthesisTransformOld. Note that the checkpoints below are named with 'new' because I fixed a bug in September 2023.
<div class="center">

| Lambda | Metric | Link | Lambda | Metric | Link |
|:--------:|:--------:|:------:|:--------:|:--------:|:------:|
| 0.0018 | MSE | PKUDisk, GoogleDrive | 2.4 | MS-SSIM | PKUDisk, GoogleDrive |
| 0.0035 | MSE | PKUDisk, GoogleDrive | 4.58 | MS-SSIM | PKUDisk, GoogleDrive |
| 0.0067 | MSE | PKUDisk, GoogleDrive | 8.73 | MS-SSIM | PKUDisk, GoogleDrive |
| 0.0130 | MSE | PKUDisk, GoogleDrive | 16.64 | MS-SSIM | PKUDisk, GoogleDrive |
| 0.0250 | MSE | PKUDisk, GoogleDrive | 31.73 | MS-SSIM | PKUDisk, GoogleDrive |
| 0.0483 | MSE | PKUDisk, GoogleDrive | 60.5 | MS-SSIM | PKUDisk, GoogleDrive |

</div>

The structure of the provided weights is:
{
"epoch": epoch + 1,
"state_dict": net.state_dict(),
"loss": loss,
"optimizer": optimizer.state_dict(),
"aux_optimizer": aux_optimizer.state_dict(),
"lr_scheduler": lr_scheduler.state_dict(),
}
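A minimal sketch of how a checkpoint with this layout might be loaded for inference. The helper names `extract_state_dict` and `load_weights` are hypothetical and not part of this repo. Stripping a `module.` prefix is an assumption about checkpoints saved from a DDP-wrapped model; the provided files may not need it.

```python
from collections import OrderedDict


def extract_state_dict(checkpoint):
    """Return the model weights from a checkpoint dict shaped like the one above."""
    state_dict = checkpoint["state_dict"]
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        # Assumption: weights saved from a DistributedDataParallel wrapper carry
        # a "module." prefix on every key; strip it so a bare model can load them.
        if key.startswith("module."):
            key = key[len("module."):]
        cleaned[key] = value
    return cleaned


def load_weights(net, checkpoint_path, device="cpu"):
    """Load only the model weights; optimizer and scheduler states are ignored."""
    import torch  # imported lazily so extract_state_dict works without PyTorch

    checkpoint = torch.load(checkpoint_path, map_location=device)
    net.load_state_dict(extract_state_dict(checkpoint))
    # "epoch" was stored as epoch + 1, i.e. the epoch training would resume from.
    return checkpoint["epoch"]
```

For evaluation only the `state_dict` entry is needed; the `optimizer`, `aux_optimizer`, and `lr_scheduler` entries matter only when resuming training.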
Training
Settings
We train each model on a single NVIDIA A100 GPU with a batch size of $32$. The initial patch size is $256\times 256$; after $1.2$ M steps, we increase it to $512\times 512$.
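The patch-size schedule described above can be sketched as a small helper. The name `patch_size_for_step` is hypothetical, and whether the switch happens exactly at step 1,200,000 or one step later is an assumption; check the training script for the precise boundary.

```python
SWITCH_STEP = 1_200_000  # "1.2 M steps" from the settings above
INITIAL_PATCH = 256      # initial crop side length in pixels
LATE_PATCH = 512         # crop side length after the switch


def patch_size_for_step(step: int) -> int:
    """Return the square crop side length to use at a given training step."""
    # Assumption: the larger patch is used from step SWITCH_STEP onward.
    return INITIAL_PATCH if step < SWITCH_STEP else LATE_PATCH
```

A training loop could call this once per step (or per epoch) and rebuild its random-crop transform whenever the returned value changes.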
Training Set
A training list is provided. The images come from DIV2K, Flickr2K, CLIC Train, COCO, and ImageNet. Most JPG images are downsampled, and the downsampled images are stored in PNG format. We use the following function from PIL to downsample images:
img.resize((new_width, new_height), Image.ANTIALIAS)
The training set is available on HuggingFace.
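A sketch of the downsample-and-store-as-PNG step, for readers preparing their own data. `Image.ANTIALIAS` was removed in Pillow 10; `Image.LANCZOS` is the same filter under its current name. The sizing rule in `target_size` (cap the shorter side at `max_short_side`) and both function names are hypothetical, since the README does not state how `new_width` and `new_height` were chosen.

```python
def target_size(width, height, max_short_side=1024):
    """Hypothetical rule: shrink so the shorter side is at most max_short_side."""
    short = min(width, height)
    if short <= max_short_side:
        return width, height  # already small enough, keep the original size
    scale = max_short_side / short
    return max(1, round(width * scale)), max(1, round(height * scale))


def downsample_to_png(src_path, dst_path, max_short_side=1024):
    """Resize an image with the Lanczos filter and save it as PNG."""
    from PIL import Image  # Pillow, imported lazily

    with Image.open(src_path) as img:
        new_width, new_height = target_size(img.width, img.height, max_short_side)
        # Convert to RGB first so modes PNG cannot store (e.g. CMYK) do not fail.
        rgb = img.convert("RGB")
        # Image.LANCZOS is the filter that Image.ANTIALIAS used to alias.
        resized = rgb.resize((new_width, new_height), Image.LANCZOS)
        resized.save(dst_path, format="PNG")
```

Storing the result as PNG avoids adding a second round of JPEG artifacts to the downsampled image.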
Command
An example command is provided here.
Testing
An example command is provided here.
Environment
CompressAI 1.2.0b3
PyTorch 2.0.1
Contact
If you have any questions about MLIC, please contact Wei Jiang (wei.jiang1999@outlook.com or jiangwei@stu.pku.edu.cn).
Citation
If you find our papers and this repo useful, kindly cite:
MLIC
@inproceedings{jiang2023mlic,
title={MLIC: Multi-Reference Entropy Model for Learned Image Compression},
author={Jiang, Wei and Yang, Jiayu and Zhai, Yongqi and Ning, Peirong and Gao, Feng and Wang, Ronggang},
doi = {10.1145/3581783.3611694},
booktitle={Proceedings of the 31st ACM International Conference on Multimedia},
pages={7618--7627},
year={2023}
}
MLIC <sup> ++ </sup>
@article{jiang2025mlicpp,
title={MLIC++: Linear complexity multi-reference entropy modeling for learned image compression},
author={Jiang, Wei and Yang, Jiayu and Zhai, Yongqi and Gao, Feng and Wang, Ronggang},
journal={ACM Transactions on Multimedia Computing, Communications and Applications},
volume={21},
number={5},
pages={1--25},
year={2025}
}
MLICv2
@article{jiang2025mlicv2,
author={Jiang, Wei and Zhai, Yongqi and Yang, Jiayu and Gao, Feng and Wang, Ronggang},
title={MLICv2: Enhanced Multi-Reference Entropy Modeling for Learned Image Compression},
year={2025},
doi={10.1145/3785671},
journal={ACM Transactions on Multimedia Computing, Communications and Applications}
}
