casper-hansen / AutoAWQ Public

Fork 176
Star 1.5k

Issues: casper-hansen/AutoAWQ

104 Open 250 Closed

Issues list

Slowed Down After Quantizing Fine-Tuned gemme-2b-it Model

#572 opened Aug 5, 2024 by Yimjaehyun93

error when quantizing my finetuned 405b model using autoawq

#571 opened Aug 5, 2024 by Atomheart-Father

request: update prereq list to show supported python versions

#569 opened Aug 3, 2024 by AartBluestoke

about the shape of qzeros in awq quantization model

#566 opened Aug 1, 2024 by MuYu-zhi

ImportError: cannot import name 'initialize_tasks' from 'lm_eval.tasks'

#565 opened Aug 1, 2024 by kunzeng-ch

Support 3-bit and 2-bit quantization with the FLUTE kernel.

#564 opened Aug 1, 2024 by radi-cho

What's the difference between llm-awq and this?

#563 opened Aug 1, 2024 by LiMa-cas

Memory-efficient quantization: Load and quantize layer by layer

#561 opened Jul 30, 2024 by casper-hansen

cant import awq

#559 opened Jul 30, 2024 by Dujianhua1008

Quantitative model report wrong, RuntimeError: Expected all tensors to be on the same device

#558 opened Jul 28, 2024 by ShelterWFF

CUDA error: no kernel image is available for execution on the device

#557 opened Jul 25, 2024 by AragornHorse

awq quantization is not fully optimized yet. The speed can be slower than non-quantized models

#545 opened Jul 22, 2024 by jackNhat

Discussion on the selection of the calibration data set

#541 opened Jul 18, 2024 by beep-bebop

baichuan2-7B-Chat awq fuse_layer=True error

#539 opened Jul 12, 2024 by feipengheart

support of fake backend

#538 opened Jul 12, 2024 by yufenglee

GGUF export example hangs on AWQ step

#537 opened Jul 10, 2024 by thesyntaxinator

LookupError: unknown encoding: unicode

#536 opened Jul 10, 2024 by pankajshakya627

raise Exception (the loss increases to NAN ) when quantilizing DeepSeek-V2-chat using the new version of AutoAWQ in the sub-iteration (18/60)

#535 opened Jul 9, 2024 by BinFuPKU

Gemma2 Support

#529 opened Jul 1, 2024 by yc-wang00

Lora Adapters Support

#527 opened Jun 28, 2024 by vladrad

I encountered the following problem during the KL assessment

#526 opened Jun 28, 2024 by xieziyi881

Calibration Dataset: how to avoid computing loss on instructions?

#525 opened Jun 28, 2024 by RanchiZhao

Version on PyPi doesn't support Python 3.12

#521 opened Jun 26, 2024 by horsten

Same AWQ model behaves differently on two similar machines

#518 opened Jun 22, 2024 by Alf-Z-SymphoMe

Is it possible to quantize MoE models?

#512 opened Jun 20, 2024 by jackjiao12
