[MLOps 1.5] Autogenerate data logging configs from the registry #867

Merged
merged 94 commits into from
Feb 3, 2023
4dfa663
initial commit
Jan 9, 2023
7a2ac8f
clean up the code
Jan 9, 2023
32dfa2d
ready for review
Jan 9, 2023
43eeb1b
add docstring
Jan 9, 2023
0268024
add end_to_end test
Jan 9, 2023
f3de235
add comment
Jan 9, 2023
22d8bde
Delete preview.py
dbogunowicz Jan 10, 2023
6bd58c9
ready to re-review
Jan 10, 2023
6af80eb
Merge branch 'main' into feature/damian/logging_in_pipeline
dbogunowicz Jan 10, 2023
fc18d58
Merge branch 'main' into feature/damian/logging_in_pipeline
dbogunowicz Jan 11, 2023
625ecc1
initial commit
Jan 11, 2023
592c2e4
Merge branch 'main' into feature/damian/refactor_function_logger
dbogunowicz Jan 11, 2023
dfb9dce
logic solid fixing tests now
Jan 12, 2023
1b0494a
Delete __init__.py
dbogunowicz Jan 12, 2023
051ac36
ready for next review
Jan 13, 2023
2d57d9a
Merge branch 'feature/damian/logging_in_pipeline' of https://github.c…
Jan 13, 2023
515db9e
remove old test
Jan 13, 2023
875e01c
Merge branch 'main' into feature/damian/logging_in_pipeline
dbogunowicz Jan 13, 2023
ab9e46c
hit the wall, need to merge with the pipeline loggers code
Jan 13, 2023
cf39db3
Merge remote-tracking branch 'origin/feature/damian/logging_in_pipeli…
Jan 13, 2023
a43559c
Merge remote-tracking branch 'origin/main' into feature/damian/refact…
Jan 13, 2023
5b14cb1
Merge branch 'main' into feature/damian/logging_in_pipeline
dbogunowicz Jan 13, 2023
12a7e7d
proposal
Jan 13, 2023
399e1c4
Merge branch 'feature/damian/logging_in_pipeline' into feature/damian…
dbogunowicz Jan 13, 2023
316b450
proposal finished
Jan 13, 2023
8465f06
proposal clean
Jan 13, 2023
456f42c
remove files
Jan 13, 2023
23d685c
Update src/deepsparse/server/config.py
dbogunowicz Jan 13, 2023
b46d01c
ready to land
Jan 13, 2023
ff8b4ba
Merge branch 'main' into feature/damian/logging_in_pipeline
dbogunowicz Jan 13, 2023
4f069a0
Merge remote-tracking branch 'origin/main' into feature/damian/refact…
Jan 16, 2023
46734e0
testing registry
Jan 16, 2023
d05c581
Merge remote-tracking branch 'origin/feature/damian/logging_in_pipeli…
Jan 16, 2023
7da1cfd
logic solid, time to beautify
Jan 16, 2023
8110cd2
resolving tests
Jan 16, 2023
3634223
ready to review the design
Jan 16, 2023
e47bbd7
fix small bug
Jan 16, 2023
58ca022
Merge remote-tracking branch 'origin/main' into feature/damian/refact…
Jan 17, 2023
53528f3
add unwrapping dictionaries
Jan 17, 2023
886a6db
Update src/deepsparse/loggers/build_logger.py
dbogunowicz Jan 17, 2023
3b1b670
Update src/deepsparse/loggers/build_logger.py
dbogunowicz Jan 17, 2023
9fccaff
Update src/deepsparse/loggers/build_logger.py
dbogunowicz Jan 17, 2023
7a9b02b
Update src/deepsparse/loggers/build_logger.py
dbogunowicz Jan 17, 2023
ee62db5
ready for re-review
Jan 18, 2023
9f10308
Merge branch 'main' into feature/damian/refactor_function_logger
dbogunowicz Jan 18, 2023
028000c
Update src/deepsparse/loggers/build_logger.py
dbogunowicz Jan 18, 2023
7821d9a
Apply suggestions from code review
dbogunowicz Jan 18, 2023
42e97cf
fixing tests
Jan 18, 2023
6c7d08c
slim down the PR
Jan 18, 2023
c2099e8
also remove test helpers
Jan 18, 2023
a655239
initial commit
Jan 19, 2023
e246b3c
Merge remote-tracking branch 'origin/feature/damian/refactor_function…
Jan 19, 2023
ea2a6a5
make the code a bit more readable
Jan 20, 2023
779e0d2
Merge branch 'main' into feature/damian/config_generation
dbogunowicz Jan 20, 2023
f82853d
Merge branch 'feature/damian/config_generation' of https://github.com…
Jan 20, 2023
0a88eb9
Merge remote-tracking branch 'origin/feature/damian/refactor_function…
Jan 20, 2023
5e5f3c9
add parsing from registry
Jan 20, 2023
7623152
Merge branch 'main' into feature/damian/refactor_function_logger
dbogunowicz Jan 20, 2023
1fb9b1b
[MLOps 1.5] Autogenerate data logging configs from the registry (help…
dbogunowicz Jan 24, 2023
8b2e8bb
ready to rock and roll
KSGulin Jan 24, 2023
e74d969
Merge branch 'feature/damian/refactor_function_logger' into feature/d…
dbogunowicz Jan 24, 2023
7b1697f
Merge branch 'main' into feature/damian/refactor_function_logger
dbogunowicz Jan 24, 2023
5445b11
cleaning up code before merge
KSGulin Jan 24, 2023
b8cbc94
Merge branch 'main' into feature/damian/refactor_function_logger
dbogunowicz Jan 24, 2023
cb9387a
Merge branch 'feature/damian/refactor_function_logger' into feature/d…
dbogunowicz Jan 24, 2023
e8ca2fa
fix build_logger
KSGulin Jan 24, 2023
a433704
merge
dbogunowicz Jan 25, 2023
229bc19
fix quality
dbogunowicz Jan 25, 2023
7991dc3
Merge branch 'feature/damian/refactor_function_logger' into feature/d…
dbogunowicz Jan 25, 2023
d06bdd6
initial commit
dbogunowicz Jan 26, 2023
e50795d
Merge branch 'main' into feature/damian/refactor_function_logger
dbogunowicz Jan 26, 2023
eb97c00
include coreys proposal
dbogunowicz Jan 26, 2023
86eeee9
Merge branch 'feature/damian/refactor_function_logger' into feature/d…
dbogunowicz Jan 26, 2023
a8af672
Merge branch 'main' into feature/damian/config_generation
dbogunowicz Jan 26, 2023
45c5940
checkpoint before big merge
dbogunowicz Jan 26, 2023
90c02d8
m
dbogunowicz Jan 26, 2023
69eef38
cleanup post merge
dbogunowicz Jan 26, 2023
4ff0ac1
complete build-ins
dbogunowicz Jan 26, 2023
41875e4
Merge branch 'main' into feature/damian/config_generation
dbogunowicz Jan 27, 2023
026364e
ready, time to make small prs
dbogunowicz Jan 27, 2023
502310c
merge
dbogunowicz Jan 27, 2023
94ddf52
initial
dbogunowicz Jan 27, 2023
23e5f73
fix noqa
dbogunowicz Jan 27, 2023
182f453
rename decorator on import
dbogunowicz Jan 27, 2023
d4af343
remove test
dbogunowicz Jan 27, 2023
5ff1e45
another test
dbogunowicz Jan 27, 2023
fb9a15b
Merge branch 'main' into feature/damian/deco
dbogunowicz Jan 27, 2023
c9b0a0e
Merge branch 'main' into feature/damian/deco
dbogunowicz Jan 27, 2023
237337a
merge
dbogunowicz Jan 27, 2023
a458166
ready for review
dbogunowicz Jan 30, 2023
e0dca39
add examples
dbogunowicz Jan 30, 2023
190d0e3
Merge branch 'main' into feature/damian/config_generation
dbogunowicz Jan 30, 2023
dcdd81a
ready
dbogunowicz Jan 30, 2023
eea25a7
fixing tests
dbogunowicz Jan 30, 2023
17 changes: 17 additions & 0 deletions examples/data-logging-configs/image_classification.yaml
@@ -0,0 +1,17 @@
loggers:
  python: null
data_logging:
  pipeline_outputs.labels:
  - func: predicted_classes
    frequency: 1
  - func: predicted_top_score
    frequency: 1
  pipeline_inputs.images:
  - func: image_shape
    frequency: 1
  - func: mean_pixels_per_channel
    frequency: 1
  - func: std_pixels_per_channel
    frequency: 1
  - func: fraction_zeros
    frequency: 1
22 changes: 22 additions & 0 deletions examples/data-logging-configs/object_detection.yaml
@@ -0,0 +1,22 @@
loggers:
  python: null
data_logging:
  pipeline_inputs.images:
  - func: image_shape
    frequency: 2
  - func: mean_pixels_per_channel
    frequency: 2
  - func: std_pixels_per_channel
    frequency: 2
  - func: fraction_zeros
    frequency: 2
  pipeline_outputs.labels:
  - func: detected_classes
    frequency: 2
  - func: number_detected_objects
    frequency: 2
  pipeline_outputs.scores:
  - func: mean_score_per_detection
    frequency: 2
  - func: std_score_per_detection
    frequency: 2
16 changes: 16 additions & 0 deletions examples/data-logging-configs/question_answering.yaml
@@ -0,0 +1,16 @@
loggers:
  python: null
data_logging:
  pipeline_inputs.question:
  - func: string_length
    frequency: 3
  pipeline_inputs.context:
  - func: string_length
    frequency: 3
  pipeline_outputs:
  - func: answer_found
    frequency: 3
  - func: answer_length
    frequency: 3
  - func: answer_score
    frequency: 3
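The three example configs share one shape: a `loggers` mapping, plus a `data_logging` mapping from a target identifier to a list of `func`/`frequency` entries. As a rough illustration only, the sketch below writes the question answering config as the plain dict a YAML parser would return and flattens it into (target, function, frequency) rows. The helper name `iter_metric_functions` is hypothetical and is not part of deepsparse.

```python
from typing import Any, Dict, Iterator, Tuple

# The question_answering.yaml example, written as the dict a YAML parser would return
config: Dict[str, Any] = {
    "loggers": {"python": None},
    "data_logging": {
        "pipeline_inputs.question": [{"func": "string_length", "frequency": 3}],
        "pipeline_inputs.context": [{"func": "string_length", "frequency": 3}],
        "pipeline_outputs": [
            {"func": "answer_found", "frequency": 3},
            {"func": "answer_length", "frequency": 3},
            {"func": "answer_score", "frequency": 3},
        ],
    },
}


def iter_metric_functions(cfg: Dict[str, Any]) -> Iterator[Tuple[str, str, int]]:
    # hypothetical helper: yield one (target, func, frequency) row per entry
    for target, functions in cfg["data_logging"].items():
        for entry in functions:
            yield target, entry["func"], entry["frequency"]


rows = list(iter_metric_functions(config))
for target, func, frequency in rows:
    print(f"{target}: {func} (frequency={frequency})")
```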
8 changes: 5 additions & 3 deletions src/deepsparse/loggers/build_logger.py
@@ -383,7 +383,9 @@ def add_predefined_function_groups(


 def parse_out_predefined_function_groups(
-    metric_functions: List[MetricFunctionConfig], identifier_prefix: str
+    metric_functions: List[MetricFunctionConfig],
+    identifier_prefix: Optional[str] = None,
+    registry: Dict[str, List[MetricFunctionConfig]] = DATA_LOGGING_REGISTRY,
 ) -> Dict[str, List[MetricFunctionConfig]]:
     """
     Given a list of MetricFunctionConfig objects, parse out
@@ -403,11 +405,11 @@ def parse_out_predefined_function_groups(
     for metric_function in metric_functions:
         function_group_name = metric_function.func
         # fetch the pre-defined data logging configuration from the registry
-        registered_function_group = DATA_LOGGING_REGISTRY.get(function_group_name)
+        registered_function_group = registry.get(function_group_name)
         if not registered_function_group:
             raise ValueError(
                 f"Unknown function group name: {function_group_name}. "
-                f"Supported function group names: {list(DATA_LOGGING_REGISTRY.keys())}"
+                f"Supported function group names: {list(registry.keys())}"
             )
         for (
             registered_identifier,
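The change to `parse_out_predefined_function_groups` replaces a hard-wired global lookup with a `registry` argument that defaults to that global, so callers and tests can supply their own registry. Below is a minimal sketch of the same pattern. It uses hypothetical stand-ins (`FunctionConfig`, `TOY_REGISTRY`, `expand_groups`) rather than the real deepsparse objects, and it assumes the registry maps a group name to a dict of target identifier to function names, which the truncated hunk does not fully confirm.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FunctionConfig:
    # hypothetical stand-in for MetricFunctionConfig
    func: str
    frequency: int = 1


# hypothetical stand-in for DATA_LOGGING_REGISTRY (structure is an assumption)
TOY_REGISTRY: Dict[str, Dict[str, List[str]]] = {
    "question_answering": {
        "pipeline_inputs.question": ["string_length"],
        "pipeline_outputs": ["answer_found", "answer_score"],
    }
}


def expand_groups(
    groups: List[FunctionConfig],
    registry: Dict[str, Dict[str, List[str]]] = TOY_REGISTRY,
) -> Dict[str, List[FunctionConfig]]:
    # expand each group name into per-target function configs,
    # looking the group up in the registry that was passed in
    expanded: Dict[str, List[FunctionConfig]] = {}
    for group in groups:
        registered = registry.get(group.func)
        if not registered:
            raise ValueError(
                f"Unknown function group name: {group.func}. "
                f"Supported function group names: {list(registry.keys())}"
            )
        for target, function_names in registered.items():
            expanded.setdefault(target, []).extend(
                FunctionConfig(func=name, frequency=group.frequency)
                for name in function_names
            )
    return expanded


# a test can pass its own registry without touching the module-level default
custom = {"toy_group": {"pipeline_inputs.text": ["string_length"]}}
expanded = expand_groups([FunctionConfig("toy_group", frequency=3)], registry=custom)
print(expanded)
```

Because the error message is built from the `registry` argument, an unknown group name reports the names of whichever registry was actually searched.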
16 changes: 16 additions & 0 deletions src/deepsparse/loggers/metric_functions/helpers/__init__.py
@@ -0,0 +1,16 @@
# Copyright (c) 2021 - present / Neuralmagic, Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# flake8: noqa
from .config_generation import *
181 changes: 181 additions & 0 deletions src/deepsparse/loggers/metric_functions/helpers/config_generation.py
@@ -0,0 +1,181 @@
# Copyright (c) 2021 - present / Neuralmagic, Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""
Helper functions for generating metric function configs
"""

__all__ = ["data_logging_config_from_predefined"]

import logging
import os
import textwrap
from typing import Any, Dict, List, Optional, Union

import yaml

from deepsparse.loggers.build_logger import parse_out_predefined_function_groups
from deepsparse.loggers.config import MetricFunctionConfig
from deepsparse.loggers.metric_functions.registry import DATA_LOGGING_REGISTRY


_WHITESPACE = "  "
_LOGGER = logging.getLogger(__name__)


def data_logging_config_from_predefined(
    group_names: Union[str, List[str]],
    frequency: int = 1,
    loggers: Optional[Dict[str, Optional[Dict[str, Any]]]] = None,
    save_dir: Optional[str] = None,
    save_name: str = "data_logging_config.yaml",
    registry: Dict[str, Any] = DATA_LOGGING_REGISTRY,
) -> str:
    """
    Generate a data logging config yaml string from predefined
    function groups.

    :param group_names: A single group name or a list of group names
        to be translated into the yaml configuration.
    :param frequency: Frequency of the data logging
        functions in the resulting yaml configuration. By default,
        set to 1
    :param loggers: Defines the set of loggers that will be used to collect
        the data logs. It is a dictionary that maps the logger integration
        names to their initialization arguments. By default,
        set to {"python": {}}
    :param save_dir: If provided, the resulting yaml configuration is
        saved to the provided directory
    :param save_name: If the config is saved, it will be under this
        filename
    :param registry: The registry in which the group names are
        looked up. By default, DATA_LOGGING_REGISTRY
    :return: A yaml string that specifies the data logging
        configuration
    """
    if isinstance(group_names, str):
        group_names = [group_names]

    if loggers is None:
        loggers = {"python": {}}

    metric_functions = [
        MetricFunctionConfig(func=group_name, frequency=frequency)
        for group_name in group_names
    ]
    data_logging_config = parse_out_predefined_function_groups(
        metric_functions=metric_functions, registry=registry
    )
    data_logging_config_str = _data_logging_config_string(data_logging_config)
    loggers_config_str = _loggers_to_config_string(loggers)

    config_str = loggers_config_str + "\n\n" + data_logging_config_str

    if save_dir:
        # save and log
        save_path = os.path.join(save_dir, save_name)
        parsed_data = yaml.safe_load(config_str)
        with open(save_path, "w") as file:
            yaml.dump(
                parsed_data,
                file,
                default_flow_style=False,
                line_break="\n",
                sort_keys=False,
            )
        _LOGGER.info(f"Saved data logging config to {save_path}")

    return config_str


def _loggers_to_config_string(
    loggers: Dict[str, Optional[Union[str, List[str]]]]
) -> str:
    lines = [_WHITESPACE + line for line in _nested_dict_to_lines(loggers)]
    lines.insert(0, "loggers:")
    return ("\n").join(lines)


def _data_logging_config_string(
    data_logging_config: Dict[str, List[MetricFunctionConfig]],
) -> str:
    lines = [_WHITESPACE + line for line in _nested_dict_to_lines(data_logging_config)]
    lines.insert(0, "data_logging:")
    return ("\n").join(lines)


def _nested_dict_to_lines(
    value: Any,
    key: Optional[str] = None,
    yaml_str_lines: Optional[List[str]] = None,
    _level: int = 0,
) -> List[str]:
    # converts a nested dictionary to a list of yaml string lines
    if yaml_str_lines is None:
        yaml_str_lines = []

    indentation = _WHITESPACE * _level

    for new_key, new_value in value.items():
        if isinstance(new_value, dict):
            # nested dict: emit the key, then recurse one level deeper
            yaml_str_lines.append(f"{indentation}{new_key}:")
            yaml_str_lines = _nested_dict_to_lines(
                new_value, new_key, yaml_str_lines, _level + 1
            )
        elif isinstance(new_value, list):
            # list of metric function configs: emit as a yaml list
            list_as_str = _metric_functions_configs_to_string(new_value)
            yaml_str_lines.append(
                f"{new_key}:\n{textwrap.indent(list_as_str, prefix=_WHITESPACE)}"
            )
        else:
            yaml_str_lines.append(f"{indentation}{new_key}: {new_value}")

    return yaml_str_lines


def _str_list_to_yaml(list_to_convert: List[str]) -> str:
    # converts a list of strings to their appropriate yaml string representation
    lines_indented = [
        textwrap.indent(line, prefix=_WHITESPACE) for line in list_to_convert
    ]
    # replace the first indentation character of each item with a leading dash
    lines_leading_dash = ["-" + line[1:] for line in lines_indented]
    return ("\n").join(lines_leading_dash)


def _metric_functions_configs_to_string(
    metric_functions_configs: List[MetricFunctionConfig],
) -> str:
    # converts a list of metric function configs to
    # their appropriate yaml string representation
    return _str_list_to_yaml(
        [
            _metric_function_config_to_string(config)
            for config in metric_functions_configs
        ]
    )


def _metric_function_config_to_string(
    metric_function_config: MetricFunctionConfig,
) -> str:
    # converts a single metric function config to
    # its appropriate yaml string representation
    text = (
        f"func: {metric_function_config.func}\n"
        f"frequency: {metric_function_config.frequency}"
    )

    target_loggers = metric_function_config.target_loggers
    # if target_loggers is not None,
    # include it in the yaml string
    if target_loggers:
        text += f"\ntarget_loggers:\n{textwrap.indent(_str_list_to_yaml(target_loggers), prefix=_WHITESPACE)}"  # noqa E501
    return text
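The helpers above build the YAML text by hand: each list item is indented with `textwrap.indent`, and the first indentation character is then swapped for a dash. The standalone sketch below reproduces that trick with the standard library only. The names `INDENT`, `render_function_list`, and `render_data_logging` are hypothetical, the two-space indent is an assumption, and function configs are simplified to `(func, frequency)` tuples instead of `MetricFunctionConfig` objects.

```python
import textwrap
from typing import Dict, List, Tuple

INDENT = "  "  # assumption: two-space indentation


def render_function_list(functions: List[Tuple[str, int]]) -> str:
    # render (func, frequency) pairs as a yaml list
    items = []
    for func, frequency in functions:
        body = f"func: {func}\nfrequency: {frequency}"
        indented = textwrap.indent(body, prefix=INDENT)
        # swap the first indentation character for the list dash
        items.append("-" + indented[1:])
    return "\n".join(items)


def render_data_logging(config: Dict[str, List[Tuple[str, int]]]) -> str:
    # render a target -> functions mapping under a top-level data_logging key
    lines = ["data_logging:"]
    for target, functions in config.items():
        lines.append(f"{INDENT}{target}:")
        lines.append(
            textwrap.indent(render_function_list(functions), prefix=INDENT * 2)
        )
    return "\n".join(lines)


rendered = render_data_logging(
    {
        "pipeline_inputs.question": [("string_length", 3)],
        "pipeline_outputs": [("answer_found", 3), ("answer_score", 3)],
    }
)
print(rendered)
```

The dash swap only yields a valid list item when the indent is at least two characters wide; with a one-character indent the result would be `-func: ...`, which YAML does not read as a list entry.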
13 changes: 13 additions & 0 deletions tests/deepsparse/loggers/metric_functions/helpers/__init__.py
@@ -0,0 +1,13 @@
# Copyright (c) 2021 - present / Neuralmagic, Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.