Popular repositories
GPTQModel (Python, forked from ModelCloud/GPTQModel)
Production-ready LLM model compression/quantization toolkit with hardware-accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.
llm-compressor (Python, forked from vllm-project/llm-compressor)
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM.
AQLM (Python, forked from Vahe1994/AQLM)
Official PyTorch repository for "Extreme Compression of Large Language Models via Additive Quantization" (https://arxiv.org/pdf/2401.06118.pdf) and PV-Tuning: Beyond Straight-Through Estimation for Ext…
compressed-tensors (Python, forked from vllm-project/compressed-tensors)
A safetensors extension to efficiently store sparse quantized tensors on disk.
TensorRT-Model-Optimizer (Python, forked from NVIDIA/TensorRT-Model-Optimizer)
A unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, and speculative decoding. It compresses deep learning models for downstream deployment…