OneVAE: Unified Repository for Continuous and Discrete VAE Training

This repository is also the official open-source implementation of our work, OneVAE.

📄 Paper: OneVAE: Joint Discrete and Continuous Optimization Helps Discrete VAE Train Better

Key Contributions:

  1. Multiple Structural Improvements — Introduces several architecture-level enhancements to discrete VAEs that boost reconstruction quality under high compression.
  2. Progressive Training with a Pretrained Continuous VAE — Initializes from a high-quality pretrained continuous VAE and gradually transitions to a discrete VAE, effectively leveraging strong priors (see the sketch after this list).
  3. Unified Model — Achieves superior performance on both continuous and discrete representations within a single model.
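One way to picture contribution 2 is a quantization bottleneck whose output is annealed from the continuous latent to its discrete counterpart. The sketch below is our own minimal illustration under that reading, not the released OneVAE code; `ProgressiveQuantizer`, the blending weight `alpha`, and all hyperparameters are hypothetical.

```python
# Minimal sketch of progressive continuous-to-discrete training.
# Our own illustration; not the OneVAE implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProgressiveQuantizer(nn.Module):
    """VQ bottleneck whose output is blended between the continuous latent
    and its quantized version, annealed by `alpha` in [0, 1]."""

    def __init__(self, codebook_size: int = 8192, dim: int = 16):
        super().__init__()
        self.codebook = nn.Embedding(codebook_size, dim)

    def forward(self, z: torch.Tensor, alpha: float):
        # z: (B, N, dim) latents from a pretrained continuous VAE encoder.
        # Squared L2 distance of each latent to every codebook entry.
        dists = (z.pow(2).sum(-1, keepdim=True)
                 - 2 * z @ self.codebook.weight.t()
                 + self.codebook.weight.pow(2).sum(-1))
        indices = dists.argmin(dim=-1)            # (B, N) discrete codes
        z_q = self.codebook(indices)              # quantized latents
        # Standard VQ codebook + commitment losses.
        vq_loss = (F.mse_loss(z_q, z.detach())
                   + 0.25 * F.mse_loss(z, z_q.detach()))
        # Straight-through estimator: copy gradients to the continuous path.
        z_q_ste = z + (z_q - z).detach()
        # Progressive blend: alpha = 0 reproduces the pretrained continuous
        # VAE; alpha = 1 is a fully discrete VQ bottleneck.
        z_out = (1.0 - alpha) * z + alpha * z_q_ste
        return z_out, indices, vq_loss
```

During training, `alpha` could follow a simple schedule such as `alpha = min(1.0, step / warmup_steps)`, so the decoder never loses the continuous prior abruptly.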

Development Status

In addition to releasing the code of this work, we aim to provide a unified repository that supports fine-tuning and training of multiple pretrained VAE models, enabling the community to better adapt VAEs to their specific needs.
We are actively organizing and refining the codebase, and ⚡ most features and resources will be released within two weeks!

Open-Source Models

| Model Name | Encoding Method | Compression Ratio | Download Link |
| --- | --- | --- | --- |
| OneVAE | Discrete, Multi-Token Quant = 2 | 8 × 16 × 16 | Link |
| OneVAE | Discrete, Multi-Token Quant = 2 | 16 × 16 × 16 | Link |
| OneVAE | Discrete, Multi-Token Quant = 2 | 8 × 8 × 8 | Link |
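To make the compression ratios concrete: a ratio of T × H × W shrinks each axis by that factor, and, under our reading of "Multi-Token Quant = 2" (which we assume means two codebook tokens per latent position), the total token count doubles. A hypothetical helper:

```python
# Illustrative arithmetic only; our own helper, not part of the released code.
# Assumes the compression ratio is (time, height, width) and that
# "Multi-Token Quant = 2" means two codebook tokens per latent position.
def latent_grid(frames: int, height: int, width: int,
                ratio=(8, 16, 16), tokens_per_latent: int = 2):
    rt, rh, rw = ratio
    t, h, w = frames // rt, height // rh, width // rw
    return (t, h, w), t * h * w * tokens_per_latent

# Example: a 32-frame 512x512 clip with the 8 x 16 x 16 model gives a
# 4 x 32 x 32 latent grid and 4 * 32 * 32 * 2 = 8192 discrete tokens.
print(latent_grid(32, 512, 512))  # ((4, 32, 32), 8192)
```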

Visual Results

Video Gallery

  • Continue_Video_Result.mp4
  • Discrete_comparison.mp4

More Discrete Video Results on a High-Compression VAE (4×16×16)

  • sample_0.mp4
  • sample_1.mp4
  • sample_2.mp4

Planned Fine-Tuning Support

Image VAE

  • FluxVAE
  • LlamaGen
  • SD-VAE

Video VAE

  • OneVAE (ours)
  • WanVAE (Alibaba)
  • HunyuanVideo VAE (Tencent)

TODO

  • Release model code (to be completed within two weeks)
  • Provide pretrained weights download links
  • Support additional types of VAE models

LICENSE

The code is licensed under the Apache License 2.0. When using our repository to fine-tune other models, you must comply with the licenses of the respective pretrained models.
