[Cherry Picks] Analyze Bug Fixes (Updated) (#465)
* `RegistryMixin` improved alias management (#404)

* initial commit

* add docstrings

* simplify

* hardening

* refactor

* format registry lookup strings to be lowercase

* standardise aliases

* Move evaluator registry (#411)

* More control over external data size (#412)

* When splitting external data, avoid renaming `model.data` to `model.data.1` if only one external data file gets eventually saved (#414)

* [model.download] fix function returning nothing (#420)

* [BugFix] Path not expanded (#418)

* [Fix] Allow for processing Path in the sparsezoo analysis (#417)

* Raise TypeError instead of ValueError (#426)

* Fix misleading docstring (#416)

Add test

* add support for benchmark.yaml (#415)

* add support for benchmark.yaml

Recent zoo models use `benchmark.yaml` instead of `benchmarks.yaml`. Adding this additional pathway so `benchmark.yaml` is downloaded in the bulk model download.

* update files filter

* fix tests

---------

Co-authored-by: dbogunowicz <damian@neuralmagic.com>
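
The `benchmark.yaml` fallback above can be sketched as a filename filter. This is a minimal illustration, not sparsezoo's actual download API; `filter_download_files` and the `.onnx` rule are assumptions:

```python
# Hypothetical sketch: accept either spelling of the benchmark manifest
# during a bulk model download. The helper name is illustrative only.
BENCHMARK_NAMES = {"benchmark.yaml", "benchmarks.yaml"}


def filter_download_files(file_names):
    """Keep model files plus whichever benchmark manifest the model ships."""
    return [
        name
        for name in file_names
        if name in BENCHMARK_NAMES or name.endswith(".onnx")
    ]


selected = filter_download_files(["model.onnx", "benchmark.yaml", "notes.txt"])
```

Either spelling passes the filter, so models published with the newer `benchmark.yaml` name are no longer skipped.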

* [BugFix] Add analyze to init (#421)

* Add analyze to init

* Move onnxruntime to deps

* Print model analysis (#423)

* [model.download] fix function returning nothing (#420)

* [BugFix] Path not expanded (#418)

* print model-analysis

* [Fix] Allow for processing Path in the sparsezoo analysis (#417)

* add print statement at the end of cli run

---------

Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
Co-authored-by: Rahul Tuli <rahul@neuralmagic.com>
Co-authored-by: dbogunowicz <97082108+dbogunowicz@users.noreply.github.com>

* Omit scalar weight (#424)

* omit scalar weights

* remove unwanted files

* comment

* Update src/sparsezoo/utils/onnx/analysis.py

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>

---------

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>

---------

Co-authored-by: George <george@neuralmagic.com>
Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
Co-authored-by: dbogunowicz <97082108+dbogunowicz@users.noreply.github.com>
Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>

* update analyze help message for correctness (#432)

* initial commit (#430)

* [sparsezoo.analyze] Fix pathway such that it works for larger models (#437)

* fix analyze to work with larger models

* update for failing tests; add comments

* Update src/sparsezoo/utils/onnx/external_data.py

Co-authored-by: dbogunowicz <97082108+dbogunowicz@users.noreply.github.com>

---------

Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
Co-authored-by: dbogunowicz <97082108+dbogunowicz@users.noreply.github.com>

* Delete hehe.py (#439)

* Download deployment dir for llms (#435)

* Download deployment dir for llms

* Use path instead of download

* only set save_as_external_data to true if the model originally had external data (#442)

* Add Channel Wise Quantization Support (#441)

* Chunk download (#429)

* chunk download, break down into 10

* lint

* threads download

* draft

* chunk download draft

* job-based download and combining/deleting chunks

* delete old code

* lint

* fix num jobs if file_size is less than the chunk size

* doc string and return types

* test

* lint
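
The chunked download above, including the fix for files smaller than the chunk size, comes down to simple byte-range math. These helpers are a sketch of the idea, not the PR's actual code:

```python
import math


def num_download_jobs(file_size: int, chunk_size: int) -> int:
    """Number of chunk jobs; at least one even if the file fits in a chunk."""
    return max(1, math.ceil(file_size / chunk_size))


def chunk_ranges(file_size: int, chunk_size: int):
    """Inclusive (start, end) byte ranges, e.g. for HTTP Range requests."""
    return [
        (start, min(start + chunk_size, file_size) - 1)
        for start in range(0, file_size, chunk_size)
    ]
```

Each range can be fetched by a worker thread and the chunks concatenated in order, then deleted, as the commits above describe.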

* fix type hints (#445)

* fix bug if the value is a dict (#447)

* [deepsparse.analyze] Fix v1 functionality to work with llms (#451)

* fix equivalent changes made to analyze_v2 such that the inference session works for llms; update warnings to be debug printouts

* typo

* overwrite file (#450)

Co-authored-by: 21 <a21@21s-MacBook-Pro.local>

* Adds a `numpy_array_representer` to yaml (#454)

on runtime, to avoid serialization issues

* Avoid division by zero (#457)

Avoid log of zero

* op analysis total counts had double sparse counts (#461)

* Rename legacy analyze to analyze_v1 (#459)

* Fixing Quant % Calculation (#462)

* initial fix

* style

* Include Sparsity in Size Calculation (#463)

* initial fix

* style

* incorporate sparsity into size calculation

* quality

* op analysis total counts had double sparse counts (#461)

* Fixing Quant % Calculation (#462)

* initial fix

* style

* Include Sparsity in Size Calculation (#463)

* initial fix

* style

* incorporate sparsity into size calculation

* quality

* Revert "Merge branch 'main' into analyze_cherry_picks"

This reverts commit 509fa1a, reversing
changes made to 08f94c4.

---------

Co-authored-by: dbogunowicz <97082108+dbogunowicz@users.noreply.github.com>
Co-authored-by: Rahul Tuli <rahul@neuralmagic.com>
Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>
Co-authored-by: dbogunowicz <damian@neuralmagic.com>
Co-authored-by: George <george@neuralmagic.com>
Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
Co-authored-by: 21 <a21@21s-MacBook-Pro.local>
9 people committed Feb 22, 2024
1 parent 44b7972 commit edf177e
Showing 6 changed files with 58 additions and 39 deletions.
13 changes: 8 additions & 5 deletions src/sparsezoo/analyze_v2/memory_access_analysis.py

@@ -73,7 +73,7 @@ def get_quantization(self) -> List["QuantizationAnalysisSchema"]:
         :returns: List of quantization analysis pydantic models for each grouping
             if the node has weights
         """
-        data = get_memeory_access_bits(self.model_graph, self.node, self.node_shape)
+        data = get_memory_access_bits(self.model_graph, self.node, self.node_shape)
         if data is not None:
             quantization_analysis_model = []
             for grouping, counts_dict in data.items():
@@ -152,7 +152,7 @@ def get_memory_access_counts(
     }
 
 
-def get_memeory_access_bits(
+def get_memory_access_bits(
     model_graph: ONNXGraph,
     node: NodeProto,
     node_shape: Dict,
@@ -164,12 +164,15 @@ def get_memeory_access_bits(
     )
     node_weight = get_node_weight(model_graph, node)
     precision = get_numpy_quantization_level(node_weight)
-    bits = memory_access_counts["single"]["counts"] * precision
-    bits_quant = bits * is_quantized_layer(model_graph, node)
+    counts = memory_access_counts["single"]["counts"]
+    bits = counts * precision
+    is_quantized = is_quantized_layer(model_graph, node)
 
     return {
         "tensor": {
             "bits": bits,
-            "bits_quant": bits_quant,
+            "bits_quant": bits * is_quantized,
+            "counts": counts,
+            "counts_quant": counts * is_quantized,
         }
     }
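
The fixed return value of `get_memory_access_bits` can be reproduced with plain numbers. A minimal sketch, taking the element precision in bits and the quantized flag as inputs instead of the ONNX graph:

```python
def memory_access_summary(counts: int, precision: int, is_quantized: bool) -> dict:
    """Mirror of the dict shape in the diff: bits plus the newly added counts."""
    bits = counts * precision
    return {
        "tensor": {
            "bits": bits,
            "bits_quant": bits * is_quantized,   # 0 when the layer is unquantized
            "counts": counts,
            "counts_quant": counts * is_quantized,
        }
    }
```

Tracking `counts`/`counts_quant` alongside `bits`/`bits_quant` is what lets the later schema change compute quantization percentages from element counts rather than bit sizes.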
14 changes: 7 additions & 7 deletions src/sparsezoo/analyze_v2/model_analysis.py

@@ -78,10 +78,10 @@ def calculate_sparsity_percentage(self, category: Dict):
         counts = category["counts"]
         return (counts_sparse / counts) * 100 if counts != 0 else 0
 
-    def calculate_quantized_percentage(self, tensor: Dict):
-        bits_quant = tensor["bits_quant"]
-        bits = tensor["bits"]
-        return (bits_quant / bits) * 100 if bits != 0 else 0
+    def calculate_quantized_percentage(self, tensor: Dict, counts_prefix: str):
+        counts_quant = tensor[f"{counts_prefix}_quant"]
+        counts = tensor[counts_prefix]
+        return (counts_quant / counts) * 100 if counts != 0 else 0
 
     def __repr__(self):
         data = self.to_dict()
@@ -93,7 +93,7 @@ def __repr__(self):
         )
         param_size = summaries["params"]["quantization"]["tensor"]["bits"]
         param_quantized = self.calculate_quantized_percentage(
-            summaries["params"]["quantization"]["tensor"]
+            summaries["params"]["quantization"]["tensor"], "counts"
         )
 
         ops_total = summaries["ops"]["sparsity"]["single"]["counts"]
@@ -102,7 +102,7 @@ def __repr__(self):
         )
         ops_size = summaries["ops"]["quantization"]["tensor"]["bits"]
        ops_quantized = self.calculate_quantized_percentage(
-            summaries["ops"]["quantization"]["tensor"]
+            summaries["ops"]["quantization"]["tensor"], "counts"
         )
 
         mem_access_total = summaries["mem_access"]["sparsity"]["single"]["counts"]
@@ -111,7 +111,7 @@ def __repr__(self):
         )
         mem_access_size = summaries["mem_access"]["quantization"]["tensor"]["bits"]
         mem_access_quantized = self.calculate_quantized_percentage(
-            summaries["mem_access"]["quantization"]["tensor"]
+            summaries["mem_access"]["quantization"]["tensor"], "counts"
         )
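
Standalone, the counts-based percentage behaves like this. The function body matches the diff; the sample summary dict is made up for illustration:

```python
def calculate_quantized_percentage(tensor: dict, counts_prefix: str) -> float:
    """Percentage of quantized elements, keyed by a configurable prefix."""
    counts_quant = tensor[f"{counts_prefix}_quant"]
    counts = tensor[counts_prefix]
    return (counts_quant / counts) * 100 if counts != 0 else 0


summary = {"counts": 200, "counts_quant": 50}
pct = calculate_quantized_percentage(summary, "counts")  # 25.0
```

Switching the numerator and denominator from bits to counts is the actual "Fixing Quant % Calculation" change: a layer that is quantized to fewer bits no longer deflates the percentage.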
27 changes: 14 additions & 13 deletions src/sparsezoo/analyze_v2/operation_analysis.py

@@ -166,22 +166,23 @@ def get_operation_bits(
         precision = get_numpy_quantization_level(node_weight)
         is_quantized_op = "32" not in str(precision)
 
-        bits = (
-            ops["single"]["counts"] + ops["single"]["counts_sparse"]
-        ) * precision
-
-        bits_block4 = (
-            ops["block4"]["counts"] + ops["block4"]["counts_sparse"]
-        ) * precision
-
-        bits_quant = is_quantized_op * bits
+        single_counts = ops["single"]["counts"]
+        single_counts_sparse = ops["single"]["counts_sparse"]
+        single_bits = (single_counts - single_counts_sparse) * precision
+        block4_counts = ops["block4"]["counts"]
+        block4_counts_sparse = ops["block4"]["counts_sparse"]
+        block4_bits = (block4_counts - block4_counts_sparse) * precision
         return {
             "tensor": {
-                "bits": bits,
-                "bits_quant": bits_quant,
+                "counts": single_counts,
+                "counts_quant": is_quantized_op * single_counts,
+                "bits": single_bits,
+                "bits_quant": is_quantized_op * single_bits,
             },
             "block4": {
-                "bits": bits_block4,
-                "bits_quant": bits_quant,
+                "counts": block4_counts,
+                "counts_quant": is_quantized_op * block4_counts,
+                "bits": block4_bits,
+                "bits_quant": is_quantized_op * block4_bits,
            },
         }
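
The key change above is subtracting sparse counts instead of adding them, which had double-counted sparsity (the "op analysis total counts had double sparse counts" fix, #461). A minimal numeric sketch of one grouping:

```python
def operation_bits(counts: int, counts_sparse: int, precision: int,
                   is_quantized_op: bool) -> dict:
    """Only dense ops contribute bits; sparse ops are skipped at runtime."""
    bits = (counts - counts_sparse) * precision
    return {
        "counts": counts,
        "counts_quant": is_quantized_op * counts,
        "bits": bits,
        "bits_quant": is_quantized_op * bits,
    }
```

With the old `counts + counts_sparse` formula, a 40%-sparse layer reported more bits than a dense one, which is exactly backwards.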
17 changes: 10 additions & 7 deletions src/sparsezoo/analyze_v2/parameter_analysis.py

@@ -29,7 +29,7 @@
     get_node_num_four_block_zeros_and_size,
     get_node_param_counts,
     get_node_weight,
-    get_node_weight_bits,
+    get_node_weight_precision,
     get_numpy_distribution_statistics,
     get_numpy_entropy,
     get_numpy_modes,
@@ -153,14 +153,17 @@ def get_parameter_bits(
     If the layer is quantized, assume all its elements in the ndarray
     are quantized
     """
-    node_weight = get_node_weight(model_graph, node)
-    if node_weight is not None and node_weight.size > 0:
-        bits = get_node_weight_bits(model_graph, node)
-
+    num_weights, num_bias, num_sparse_weights = get_node_param_counts(node, model_graph)
+    if num_weights > 0:
+        precision = get_node_weight_precision(model_graph, node)
+        is_quantized = is_quantized_layer(model_graph, node)
+        num_non_sparse_weights = num_weights - num_sparse_weights + num_bias
         return {
             "tensor": {
-                "bits": bits,
-                "bits_quant": bits * is_quantized_layer(model_graph, node),
+                "counts": num_weights,
+                "counts_quant": num_weights * is_quantized,
+                "bits": num_non_sparse_weights * precision,
+                "bits_quant": num_non_sparse_weights * precision * is_quantized,
             },
         }
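
The new parameter accounting can be checked in isolation: sparsity now shrinks the stored size (the "Include Sparsity in Size Calculation" change, #463), while `counts` stays at the full weight count. The helper below is a hypothetical stand-in for the diffed code, taking plain numbers instead of the ONNX graph:

```python
def parameter_bits(num_weights: int, num_bias: int, num_sparse_weights: int,
                   precision: int, is_quantized: bool) -> dict:
    """Sparse weights need no storage; biases are counted at the same precision."""
    num_non_sparse = num_weights - num_sparse_weights + num_bias
    return {
        "counts": num_weights,
        "counts_quant": num_weights * is_quantized,
        "bits": num_non_sparse * precision,
        "bits_quant": num_non_sparse * precision * is_quantized,
    }
```

For a 90%-sparse int8 layer with 1000 weights and 10 biases, only the 110 non-sparse parameters contribute to `bits`, but all 1000 weights still count toward the quantization percentage.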
18 changes: 15 additions & 3 deletions src/sparsezoo/analyze_v2/schemas/quantization_analysis.py

@@ -20,6 +20,14 @@
 
 
 class QuantizationSummaryAnalysisSchema(BaseModel):
+    counts: float = Field(..., description="Total number of weights")
+    counts_quant: int = Field(
+        ...,
+        description=(
+            "Total number of quantized weights."
+            "Here we assume if the layer is quantized, the entire array is quantized"
+        ),
+    )
     bits: float = Field(..., description="Total bits required to store the weights")
     bits_quant: int = Field(
         ...,
@@ -39,9 +47,9 @@ def validate_types(cls, value):
     @validator("percent", pre=True, always=True)
     def calculate_percent_if_none(cls, value, values):
         if value is None:
-            bits = values.get("bits", 0)
-            bits_quant = values.get("bits_quant", 0)
-            return bits_quant / bits if bits > 0 else 0.0
+            counts = values.get("counts", 0)
+            counts_quant = values.get("counts_quant", 0)
+            return counts_quant / counts if counts > 0 else 0.0
         return value
 
     def __add__(self, model: BaseModel):
@@ -51,7 +59,9 @@ def __add__(self, model: BaseModel):
 
         if validator_model is not None:
             return validator_model(
+                counts=self.counts + model.counts,
                 bits=self.bits + model.bits,
+                counts_quant=self.counts_quant + model.counts_quant,
                 bits_quant=self.bits_quant + model.bits_quant,
             )
@@ -67,6 +77,8 @@ def __add__(self, model: BaseModel):
         if validator_model is not None and self.grouping == model.grouping:
             return validator_model(
                 grouping=self.grouping,
+                counts=self.counts + model.counts,
                 bits=self.bits + model.bits,
+                counts_quant=self.counts_quant + model.counts_quant,
                 bits_quant=self.bits_quant + model.bits_quant,
             )
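
The validator change means `percent` is now derived from element counts rather than bit totals. A plain-Python sketch of that fallback, without the pydantic machinery:

```python
def percent_quantized(counts: float, counts_quant: float, percent=None) -> float:
    """Fill in `percent` from counts when it was not supplied explicitly."""
    if percent is None:
        return counts_quant / counts if counts > 0 else 0.0
    return percent
```

An explicitly supplied `percent` is passed through unchanged, matching the `pre=True, always=True` validator's behavior of only computing the value when it is `None`.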
8 changes: 4 additions & 4 deletions src/sparsezoo/utils/onnx/analysis.py

@@ -48,7 +48,7 @@
     "get_numpy_distribution_statistics",
     "get_numpy_quantization_level",
     "get_numpy_bits",
-    "get_node_weight_bits",
+    "get_node_weight_precision",
     "get_node_param_counts",
     "get_node_kernel_shape",
 ]
@@ -485,13 +485,13 @@ def get_node_param_counts(
     return params, bias, sparse_params
 
 
-def get_node_weight_bits(
+def get_node_weight_precision(
     model_graph: ONNXGraph,
     node: NodeProto,
 ) -> int:
-    """Get the bits needed to store the node weights"""
+    """Get the precision of the node in number of bits"""
     node_weight = get_node_weight(model_graph, node)
-    return get_numpy_bits(node_weight)
+    return get_numpy_quantization_level(node_weight)
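
The rename captures the distinction the bug hinged on: precision is bits per element, while storage is bits for the whole array. A numpy sketch of the two quantities; these helper bodies are my assumption of the semantics, not sparsezoo's implementation:

```python
import numpy


def element_precision_bits(arr: numpy.ndarray) -> int:
    """Precision: bits of a single element (8 for int8, 32 for float32)."""
    return arr.dtype.itemsize * 8


def total_storage_bits(arr: numpy.ndarray) -> int:
    """Storage: bits needed for the full array."""
    return arr.size * element_precision_bits(arr)


weights = numpy.zeros((2, 3), dtype=numpy.int8)
```

Returning total storage where a per-element precision was expected inflated downstream bit calculations by the weight count, which is why the old `get_node_weight_bits` was replaced.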
