
Docs / Quantization: refactor quantization documentation #30942

Merged: 20 commits into main from refactor-quantizationd-docs on May 23, 2024

Conversation

younesbelkada (Contributor) commented on May 21, 2024

What does this PR do?

As per the title, this PR refactors the quantization documentation to make it clearer, less overwhelming for users, and simpler to understand, mainly regarding which quantization method to use and when. Still WIP.

cc @SunMarc @stevhliu @Titus-von-Koeller

younesbelkada marked this pull request as ready for review on May 22, 2024, 08:21.
SunMarc (Member) left a comment

Awesome work @younesbelkada. Thanks for refactoring the docs so that users can better choose which quantization method to use!

Review thread (resolved) on docs/source/en/quantization/overview.md
Comment on lines +19 to +20:
Quantization techniques focus on representing data with less information while also trying to not lose too much accuracy. This often means converting a data type to represent the same information with fewer bits. For example, if your model weights are stored as 32-bit floating points and they're quantized to 16-bit floating points, this halves the model size which makes it easier to store and reduces memory-usage. Lower precision can also speedup inference because it takes less time to perform calculations with fewer bits.
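Below is a minimal PyTorch sketch of the size claim in the quoted passage; the tensor shape is an arbitrary stand-in for real model weights, not anything from the PR.

```python
import torch

# A toy "weight matrix" stored as 32-bit floats (4 bytes per element).
weights_fp32 = torch.randn(1024, 1024, dtype=torch.float32)

# Cast to 16-bit floats (2 bytes per element); real quantization schemes
# are more involved, but the storage arithmetic is the same.
weights_fp16 = weights_fp32.to(torch.float16)

print(weights_fp32.nelement() * weights_fp32.element_size())  # 4194304 bytes
print(weights_fp16.nelement() * weights_fp16.element_size())  # 2097152 bytes
```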

For those who are interested in learning more about quantization, do you think we could add links to the DLAI course?

stevhliu (Member) left a comment

Very nice! 🔥

Makes a lot of sense to create separate pages for each method, especially if the community keeps adding new quantization methods!

Review threads (resolved) on:
docs/source/en/_toctree.yml
docs/source/en/quantization/overview.md (×4)
docs/source/en/quantization/bitsandbytes.md
docs/source/en/quantization/awq.md (×2)
docs/source/en/quantization/aqlm.md (×2)
younesbelkada merged commit 87a3518 into main on May 23, 2024; 8 checks passed.
younesbelkada deleted the refactor-quantizationd-docs branch on May 23, 2024, 12:31.
itazap pushed a commit that referenced this pull request on May 24, 2024:
* refactor quant docs
* delete file
* rename to overview
* fix
* fix table
* fix
* add content
* fix library versions
* fix table
* fix table
* fix table
* fix table
* Apply suggestions from code review (Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>)
* replace to quantization_config
* fix aqlm snippet
* add DLAI courses
* fix
* fix table
* fix bullet points

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
LysandreJik (Member):

In this PR you deleted the quantization.md file; doing so means that all existing links will point to a non-existent file (or to the previous file as long as it's cached).

You should update the following file to ensure that users coming from other places don't get redirected to an empty/outdated file: https://github.com/huggingface/transformers/blob/main/docs/source/en/_redirects.yml

cc @younesbelkada
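
For illustration, such a redirect entry could look like the following in _redirects.yml; the exact key and target below are an assumption (the actual fix landed in #31063), following the overview.md suggestion in the next comment.

```yaml
# docs/source/en/_redirects.yml — hypothetical entry, not the exact one merged.
# Old doc path (left) redirects to its replacement (right).
quantization: quantization/overview
```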

LysandreJik (Member):

We likely want to redirect to the overview.md file.

younesbelkada (Contributor, Author):

Thanks for the heads-up! Done in #31063.

zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request on Jun 11, 2024, with the same commit message as above.