[NeurIPS 2024] BLAST: Block Level Adaptive Structured Matrix for Efficient Deep Neural Network Inference
On-device LLM Inference Powered by X-Bit Quantization
Official Code for "SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression"
Vocabulary Trimming (VT) is a model compression technique that reduces a multilingual LM's vocabulary to a target language by deleting irrelevant tokens. This repository contains a Python library, vocabtrimmer, that removes tokens irrelevant to the target language from a multilingual LM's vocabulary.
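The idea behind vocabulary trimming can be sketched in a few lines: keep only the tokens observed in a target-language corpus (plus special tokens) and slice the embedding matrix down to match. This is a minimal illustration, not vocabtrimmer's actual API; the function name and data layout here are assumptions for the example.

```python
def trim_vocabulary(vocab, embeddings, corpus, specials=("<unk>",)):
    """Toy vocabulary trimming: keep tokens seen in the target-language
    corpus (plus special tokens) and drop the rest, shrinking the
    embedding table accordingly. `vocab` maps token -> row index;
    `embeddings` is a list of embedding rows."""
    seen = {tok for text in corpus for tok in text.split()}
    seen.update(specials)
    # preserve the original ordering of the surviving tokens
    kept = sorted((t for t in vocab if t in seen), key=vocab.get)
    new_vocab = {tok: i for i, tok in enumerate(kept)}
    new_embeddings = [embeddings[vocab[t]] for t in kept]
    return new_vocab, new_embeddings

# A 5-token "multilingual" vocabulary trimmed against an English corpus:
vocab = {"<unk>": 0, "hello": 1, "bonjour": 2, "world": 3, "monde": 4}
embeddings = [[float(i)] * 4 for i in range(5)]
new_vocab, new_embeddings = trim_vocabulary(vocab, embeddings, ["hello world"])
# only <unk>, hello, world survive; the embedding table shrinks from 5 to 3 rows
```

Real implementations decide relevance from large monolingual corpora or tokenizer statistics rather than whitespace splitting, but the compression mechanism is the same: the embedding and output-projection matrices dominate small multilingual models, so pruning unused rows shrinks the model without touching its transformer layers.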
😎 A curated list of tensor decomposition resources for model compression.
TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
[CVPR 2023] Towards Any Structural Pruning; LLMs / SAM / Diffusion / Transformers / YOLOv8 / CNNs
Gathers research papers, corresponding code (if available), reading notes, and other related materials for hot 🔥 fields in deep-learning-based computer vision.
The official implementation of the paper "MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression"
Simpler Distil-Whisper
[NeurIPS 2024] SlimSAM: 0.1% Data Makes Segment Anything Slim
A curated list of awesome NLP, Computer Vision, Model Compression, XAI, Reinforcement Learning, Security etc Paper
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
A beginner's tutorial on model compression.
Communication-Efficient Federated Learning via Transferring Codebooks
Model compression for ONNX
List of papers related to neural network quantization in recent AI conferences and journals.
Awesome machine learning model compression research papers, quantization, tools, and learning material.
Robustness-Reinforced Knowledge Distillation with Correlation Distance and Network Pruning, IEEE Transactions on Knowledge and Data Engineering 2024
This is a collection of our research on efficient AI, covering hardware-aware NAS and model compression.