This repository is also the official open-source implementation of our work, OneVAE.
📄 Paper: OneVAE: Joint Discrete and Continuous Optimization Helps Discrete VAE Train Better
Key Contributions:
- Multiple Structural Improvements — Introduces several architecture-level enhancements that help discrete VAEs preserve reconstruction quality under high compression.
- Progressive Training with Pretrained Continuous VAE — Initializes from a high-quality pretrained continuous VAE and gradually transitions to a discrete VAE, effectively leveraging strong continuous priors (see the sketch after this list).
- Unified Model — Achieves superior performance on both continuous and discrete representations within a single model.
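As a rough illustration of the progressive continuous-to-discrete transition, the sketch below anneals between continuous latents and their quantized counterparts with a mixing coefficient. All names, the quantizer, and the schedule here are hypothetical; the paper's actual recipe may differ.

```python
import torch
import torch.nn as nn

class ProgressiveQuantizer(nn.Module):
    """Hypothetical sketch: blend continuous and quantized latents."""

    def __init__(self, codebook_size=8192, dim=16):
        super().__init__()
        self.codebook = nn.Embedding(codebook_size, dim)

    def forward(self, z, alpha):
        # z: (..., dim) continuous latents from the pretrained VAE encoder.
        flat = z.reshape(-1, z.shape[-1])
        dists = torch.cdist(flat, self.codebook.weight)  # (N, codebook_size)
        codes = self.codebook(dists.argmin(dim=-1)).view_as(z)
        z_q = z + (codes - z).detach()  # straight-through estimator
        # alpha = 0 -> purely continuous; alpha = 1 -> fully discrete.
        return (1 - alpha) * z + alpha * z_q

quant = ProgressiveQuantizer()
z = torch.randn(2, 4, 4, 16)
step, total = 1000, 10000
alpha = min(1.0, step / total)  # e.g. a linear warm-up schedule
z_mixed = quant(z, alpha)
```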
In addition to releasing the code for this work, we aim to provide a unified repository that supports training and fine-tuning multiple pretrained VAE models, enabling the community to adapt VAEs to their specific needs.
We are actively organizing and refining the codebase, and ⚡ most features and resources will be released within two weeks!
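As a taste of what such unified fine-tuning could look like, here is a minimal, illustrative sketch using diffusers' SD-VAE (one of the models listed below) as a stand-in. The repo's own training entry points are not yet released, so none of this reflects the final API.

```python
import torch
from diffusers import AutoencoderKL

# Load a pretrained continuous VAE as a stand-in for any supported model.
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")
optimizer = torch.optim.AdamW(vae.parameters(), lr=1e-5)

images = torch.randn(4, 3, 256, 256)  # stand-in batch, scaled to [-1, 1]

# One reconstruction step: encode, sample the posterior, decode, backprop.
optimizer.zero_grad()
posterior = vae.encode(images).latent_dist
recon = vae.decode(posterior.sample()).sample
loss = torch.nn.functional.mse_loss(recon, images) + 1e-6 * posterior.kl().mean()
loss.backward()
optimizer.step()
```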
| Model Name | Encoding Method | Compression Ratio | Download Link |
|---|---|---|---|
| OneVAE | Discrete, Multi-Token Quant = 2 | 8 x 16 x 16 | Link |
| OneVAE | Discrete, Multi-Token Quant = 2 | 16 x 16 x 16 | Link |
| OneVAE | Discrete, Multi-Token Quant = 2 | 8 x 8 x 8 | Link |
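Assuming the compression ratio is ordered time × height × width and that "Multi-Token Quant = 2" means two discrete codes per latent position (both are our reading of the table, not confirmed here), the token counts work out as follows:

```python
# Hypothetical token-count arithmetic for the table above.
def num_tokens(frames, height, width, ratio=(8, 16, 16), codes_per_position=2):
    t, h, w = frames // ratio[0], height // ratio[1], width // ratio[2]
    return t * h * w * codes_per_position

# Example: a 32-frame 256x256 clip under 8 x 16 x 16 compression.
print(num_tokens(32, 256, 256))  # 4 * 16 * 16 * 2 = 2048 tokens
```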
| Video1 | Video2 |
|---|---|
| Continue_Video_Result.mp4 | Discrete_comparison.mp4 |
| Video1 | Video2 | Video3 |
|---|---|---|
| sample_0.mp4 | sample_1.mp4 | sample_2.mp4 |
- FluxVAE
- LlamaGen
- SD-VAE
- OneVAE (ours)
- WanVAE (Alibaba)
- HunyuanVideo VAE (Tencent)
- Release model code (to be completed within two weeks)
- Provide download links for pretrained weights
- Support additional types of VAE models
The code is licensed under the Apache License 2.0. When using our repository to fine-tune other models, you must comply with the licenses of the respective pretrained models.