Commit

update docs

calad0i committed Apr 26, 2024
1 parent a5f3a3f commit c5b7d39
Showing 9 changed files with 54 additions and 28 deletions.
26 changes: 20 additions & 6 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,14 +8,28 @@
[![PyPI version](https://badge.fury.io/py/hgq.svg)](https://badge.fury.io/py/hgq)


HGQ is a framework for quantization aware training of neural networks to be deployed on FPGAs, which allows for per-weight and per-activation bitwidth optimization.
HGQ is a gradient-based automatic bitwidth optimization and quantization-aware training algorithm for neural networks to be deployed on FPGAs. By leveraging gradients, it allows for bitwidth optimization at arbitrary granularity, down to the per-weight and per-activation level.

Depending on the specific [application](https://arxiv.org/abs/2006.10159), HGQ could achieve up to 10x resource reduction compared to the traditional `AutoQkeras` approach, while maintaining the same accuracy. For some more challenging [tasks](https://arxiv.org/abs/2202.04976), where the model is already under-fitted, HGQ could still improve the performance under the same on-board resource consumption. For more details, please refer to our paper (link coming not too soon).
<img src="docs/_static/overview.svg" alt="HGQ-overview" width="600"/>

This repository implements HGQ for `tensorflow.keras` models. It is independent of the [QKeras project](https://github.com/google/qkeras).
Compared to other heterogeneous quantization approaches, such as the QKeras counterpart, HGQ provides the following advantages:

## Warning:
- **High Granularity**: HGQ supports per-weight and per-activation bitwidth optimization, or any coarser granularity.
- **Automatic Quantization**: By setting a resource regularization term, HGQ can automatically optimize the bitwidth of all parameters during training; pruning happens naturally when a bitwidth is reduced to 0. (See the sketch after this list.)
- **Bit-accurate conversion** to `hls4ml`: What you get from the `Keras` model is exactly what you get from the `hls4ml` model. HGQ provides proxy models as a bit-accurate conversion interface to `hls4ml`.
  - Still subject to machine float precision limitations.
- **Accurate Resource Estimation**: The BOPs metric estimated by HGQ corresponds roughly to #LUTs + 55·#DSPs of the actual (post place & route) FPGA resource consumption. This metric is available during training, so the resource usage of the final model can be estimated at a very early stage.
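
For orientation, here is a minimal sketch of the training workflow behind these points, modeled on the quick-start example in the HGQ documentation. The layer names `HQuantize`/`HDense`, the `beta` regularization factor, and the `ResetMinMax`/`FreeBOPs` callbacks follow that documentation; exact signatures and the toy data are assumptions, not a definitive recipe.

```python
import numpy as np
from tensorflow import keras

from HGQ import FreeBOPs, ResetMinMax
from HGQ.layers import HDense, HQuantize

# Toy data standing in for a real dataset.
x_train = np.random.rand(1024, 16).astype('float32')
y_train = np.random.randint(0, 10, size=1024)

# beta weights the resource (BOPs) regularization term: larger values
# trade accuracy for smaller bitwidths and more aggressive pruning.
beta = 3e-5

model = keras.Sequential([
    HQuantize(beta=beta, input_shape=(16,)),   # quantize the model input
    HDense(64, activation='relu', beta=beta),  # bitwidths learned per weight
    HDense(10, beta=beta),
])

model.compile(
    optimizer='adam',
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'],
)

# ResetMinMax re-derives activation ranges each epoch; FreeBOPs logs the
# estimated BOPs (roughly #LUTs + 55*#DSPs post place & route) during training.
model.fit(x_train, y_train, epochs=5, batch_size=64,
          callbacks=[ResetMinMax(), FreeBOPs()])
```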

This framework requires an **unmerged** [PR](https://github.com/fastmachinelearning/hls4ml/pull/914) of hls4ml. Please install it by running `pip install "git+https://github.com/calad0i/hls4ml@HGQ-integration"`. Otherwise, conversion will fail with an unsupported-layer error.
Depending on the specific [application](https://arxiv.org/abs/2006.10159), HGQ could achieve up to 20x resource reduction compared to the `AutoQkeras` approach, while maintaining the same accuracy. For some more challenging [tasks](https://arxiv.org/abs/2202.04976), where the model is already under-fitted, HGQ could still improve the performance under the same on-board resource consumption. For more details, please refer to our paper (link coming soon).

## This package is still under development. Any API might change without notice at any time!
## Installation

You will need `python>=3.10` and `tensorflow>=2.13` to run this framework. You can install it via pip:

```bash
pip install hgq
```

## Usage

Please refer to the [documentation](https://calad0i.github.io/HGQ/) for more details.
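
For the conversion path described in the bullets above, here is a hedged sketch continuing from the trained `model` in the earlier example. The `trace_minmax` and `to_proxy_model` names come from the HGQ documentation, while the backend, FPGA part, and output directory below are placeholder choices:

```python
from HGQ import trace_minmax, to_proxy_model
from hls4ml.converters import convert_from_keras_model

# Trace value ranges over representative data to fix the integer bits
# of every activation before conversion.
trace_minmax(model, x_train, cover_factor=1.0)

# The proxy model carries explicit fixed-point annotations, so the
# generated firmware matches the Keras-side proxy bit-for-bit
# (up to machine float precision, as noted above).
proxy = to_proxy_model(model)

hls_model = convert_from_keras_model(
    proxy,
    backend='vivado',              # placeholder choices
    output_dir='hgq_prj',
    part='xcvu9p-flga2104-2L-e',
)
hls_model.compile()                # C-simulation before synthesis
```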
3 changes: 3 additions & 0 deletions docs/_static/custom.css
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
img.light {
color-scheme: light;
}
1 change: 1 addition & 0 deletions docs/_static/overview.svg
(SVG image added; preview not available in the diff view.)
4 changes: 4 additions & 0 deletions docs/conf.py
Original file line number Diff line number Diff line change
Expand Up @@ -66,3 +66,7 @@
html_theme = "sphinx_rtd_theme"
html_static_path = ['_static']
html_favicon = '_static/icon.svg'

html_css_files = [
'custom.css',
]
2 changes: 1 addition & 1 deletion docs/faq.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@ HGQ is a method for quantization aware training of neural networks to be deployed on FPGAs

## Why is it useful?

Depending on the specific [application](https://arxiv.org/abs/2006.10159), HGQ could achieve up to 10x resource reduction compared to the traditional `AutoQkeras` approach, while maintaining the same accuracy. For some more challenging [tasks](https://arxiv.org/abs/2202.04976), where the model is already under-fitted, HGQ could still improve the performance under the same on-board resource consumption. For more details, please refer to our paper (link coming not too soon).
Depending on the specific [application](https://arxiv.org/abs/2006.10159), HGQ could achieve up to 20x resource reduction compared to the traditional `AutoQkeras` approach, while maintaining the same accuracy. For some more challenging [tasks](https://arxiv.org/abs/2202.04976), where the model is already under-fitted, HGQ could still improve the performance under the same on-board resource consumption. For more details, please refer to our paper (link coming soon).

## Can I use it?

Expand Down
4 changes: 2 additions & 2 deletions docs/getting_started.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
# Quick Start

```{warning}
This guide is only for models with fully heterogeneous quantized weights. For models with partially-heterogeneous quantized weights, please refer to the [Full Usage](#Full Usage) guide.
```{note}
This guide is only for models with fully heterogeneous quantized weights (per-weight bitwidth).
```

## Model definition & training
Expand Down
30 changes: 21 additions & 9 deletions docs/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -3,21 +3,33 @@
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
===========================
High Granularity Quantization
===============================================================
===========================

HGQ is a framework for quantization aware training of neural networks to be deployed on FPGAs, which allows for per-weight and per-activation bitwidth optimization.
.. image:: https://img.shields.io/badge/license-Apache%202.0-green.svg
:target: LICENSE
.. image:: https://github.com/calad0i/HGQ/actions/workflows/sphinx-build.yml/badge.svg
:target: https://calad0i.github.io/HGQ/
.. image:: https://badge.fury.io/py/hgq.svg
:target: https://badge.fury.io/py/hgq

Depending on the specific application_, HGQ could achieve up to 10x resource reduction compared to the traditional AutoQkeras_ approach, while maintaining the same accuracy. For some more `challenging tasks`_, where the model is already under-fitted, HGQ could still improve the performance under the same on-board resource consumption. For more details, please refer to our paper (link coming not too soon).
HGQ is a gradient-based automatic bitwidth optimization and quantization-aware training algorithm for neural networks to be deployed on FPGAs. By leveraging gradients, it allows for bitwidth optimization at arbitrary granularity, down to the per-weight and per-activation level.

This repository implements HGQ for `tensorflow.keras` models. It is independent of the `QKeras project`_.
.. rst-class:: light
.. image:: _static/overview.svg
:alt: HGQ-overview
:width: 600

Notice: this repository is still under development, and the API might change in the future.
Compared to other heterogeneous quantization approaches, such as the QKeras counterpart, HGQ provides the following advantages:

.. _application: https://arxiv.org/abs/2006.10159
.. _AutoQkeras: https://arxiv.org/abs/2006.10159
.. _challenging tasks: https://arxiv.org/abs/2202.04976
.. _QKeras project: https://github.com/google/qkeras
- **High Granularity**: HGQ supports per-weight and per-activation bitwidth optimization, or any coarser granularity.
- **Automatic Quantization**: By setting a resource regularization term, HGQ can automatically optimize the bitwidth of all parameters during training; pruning happens naturally when a bitwidth is reduced to 0.
- **Bit-accurate conversion** to `hls4ml`: What you get from the `Keras` model is exactly what you get from the `hls4ml` model. HGQ provides proxy models as a bit-accurate conversion interface to `hls4ml`.
  - Still subject to machine float precision limitations.
- **Accurate Resource Estimation**: The BOPs metric estimated by HGQ corresponds roughly to #LUTs + 55·#DSPs of the actual (post place & route) FPGA resource consumption. This metric is available during training, so the resource usage of the final model can be estimated at a very early stage.

Depending on the specific `application <https://arxiv.org/abs/2006.10159>`_, HGQ could achieve up to 20x resource reduction compared to the `AutoQkeras` approach, while maintaining the same accuracy. For some more challenging `tasks <https://arxiv.org/abs/2202.04976>`_, where the model is already under-fitted, HGQ could still improve the performance under the same on-board resource consumption. For more details, please refer to our paper (link coming soon).

Index
=========================================================
Expand Down
10 changes: 1 addition & 9 deletions docs/install.md
Original file line number Diff line number Diff line change
@@ -1,19 +1,11 @@
# Installation

Use `pip install --pre HGQ` to install the latest version from PyPI. You will need an environment with `python>=3.10` installed. Currently, only `python3.10` and `3.11` are tested.
Use `pip install HGQ` to install the latest version from PyPI. You will need an environment with `python>=3.10` installed. Currently, only `python3.10` and `3.11` are tested.

```{warning}
This framework requires an **unmerged** [PR](https://github.com/fastmachinelearning/hls4ml/pull/914) of hls4ml. Please install it by running `pip install "git+https://github.com/calad0i/hls4ml@HGQ-integration"`. Otherwise, conversion will fail with an unsupported-layer error.
```

```{note}
The current version requires an **unmerged** version of hls4ml. Please install it by running `pip install git+https://github.com/calad0i/hls4ml`.
```

```{warning}
HGQ v0.2 requires `tensorflow>=2.13,<2.16` (tested on 2.13 and 2.15; 2.16 untested but may work) and `python>=3.10`. Please make sure that you have the correct version of python and tensorflow installed.
```

```{warning}
Due to a broken dependency declaration, you will need to specify the version of tensorflow manually. Otherwise, there will likely be version conflicts.
```
2 changes: 1 addition & 1 deletion docs/reference.md
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@ Heterogenerous layers (`H-` prefix):
- (New in 0.2) `HActivation` with **arbitrary unary function**. (See note below.)

```{note}
`HActivation` will be converted to a general `unary LUT` in `to_proxy_model` when
`HActivation` will be converted to a general `unaryLUT` in `to_proxy_model` when
- the required table size is smaller or equal to `unary_lut_max_table_size`.
- the corresponding function is not `relu`.
Expand Down
