CLI for single table synthesizer #86

Merged
merged 16 commits into from
Dec 23, 2023
1 change: 1 addition & 0 deletions .github/workflows/extension.yml
@@ -27,6 +27,7 @@ jobs:
    python -m pip install -e .[test]
- name: Install all packages in example/extension
  run: |
    python -m pip install -e example/extension/dummyexporter[test]
    python -m pip install -e example/extension/dummymetadatainspector[test]
    python -m pip install -e example/extension/dummycache[test]
    python -m pip install -e example/extension/dummydataconnector[test]
9 changes: 9 additions & 0 deletions docs/source/api_reference/data_exporters/base.rst
@@ -0,0 +1,9 @@
Base Class for DataExporter
===========================

.. autoclass:: sdgx.data_exporters.base.DataExporter
   :members:
   :undoc-members:
   :inherited-members:
   :show-inheritance:
   :private-members:
10 changes: 10 additions & 0 deletions docs/source/api_reference/data_exporters/csv_exporter.rst
@@ -0,0 +1,10 @@
CsvExporter
=====================================


.. autoclass:: sdgx.data_exporters.csv_exporter.CsvExporter
   :members:
   :undoc-members:
   :inherited-members:
   :show-inheritance:
   :private-members:
11 changes: 11 additions & 0 deletions docs/source/api_reference/data_exporters/extension.rst
@@ -0,0 +1,11 @@
.. _api_reference/data-exporters-extension:

Extension hookspec
============================

.. automodule:: sdgx.data_exporters.extension
   :members:
   :undoc-members:
   :inherited-members:
   :show-inheritance:
   :private-members:
24 changes: 24 additions & 0 deletions docs/source/api_reference/data_exporters/index.rst
@@ -0,0 +1,24 @@
Data Exporter
========================================================

.. toctree::
   :maxdepth: 1

   Base Class for DataExporter <base>

Built-in DataExporter
-----------------------------

.. toctree::
   :maxdepth: 2

   CsvExporter <csv_exporter>

Custom DataExporter Relevant
-----------------------------

.. toctree::
   :maxdepth: 2

   Extension hookspec <extension>
   DataExporterManager <manager>
9 changes: 9 additions & 0 deletions docs/source/api_reference/data_exporters/manager.rst
@@ -0,0 +1,9 @@
DataExporterManager
=================================

.. autoclass:: sdgx.data_exporters.manager.DataExporterManager
   :members:
   :undoc-members:
   :inherited-members:
   :show-inheritance:
   :private-members:
1 change: 1 addition & 0 deletions docs/source/api_reference/index.rst
@@ -14,6 +14,7 @@ API Reference
   Data Processor <data_processors/index>
   Models <models/index>
   Metadata and Inspectors <data_models/index>
   Data Exporter <data_exporters/index>
   Manager <manager>
   Exceptions <exceptions>
   Utils <utils>
20 changes: 15 additions & 5 deletions docs/source/developer_guides/extension/index.rst
@@ -23,8 +23,18 @@ View latest extension example on `GitHub <https://github.com/hitsz-ids/synthetic
Plugin-supported modules
------------------------

- :ref:`Cacher for DataLoader <api_reference/cachers-extension>`
- :ref:`Data Connector <api_reference/data-connectors-extension>`
- :ref:`Data Processor <api_reference/data-processors-extension>`
- :ref:`Inspector for Metadata <api_reference/data-models-inspectors-extension>`
- :ref:`Model <api_reference/models-extension>`
- :ref:`API Reference for extended Data Connector <api_reference/data-connectors-extension>`:
  :ref:`Data Connector <Data Connector>` connects to data sources.
- :ref:`API Reference for extended Cacher for DataLoader <api_reference/cachers-extension>`:
  :ref:`Cacher <Cacher>` improves performance,
  reduces network overhead, and supports large datasets.
- :ref:`API Reference for extended Data Processor <api_reference/data-processors-extension>`:
  :ref:`Data Processor <Data Processor>` pre-processes and post-processes data.
  It is useful for applying business logic.
- :ref:`API Reference for extended Inspector for Metadata <api_reference/data-models-inspectors-extension>`:
  :ref:`Inspector <Inspector>` extracts metadata, such as patterns and types, from raw data.
- :ref:`API Reference for extended Model <api_reference/models-extension>`:
  :ref:`Model <SynthesizerModel>` is fitted on processed data and used to generate synthetic data.
- :ref:`API Reference for extended Data Exporter <api_reference/data-exporters-extension>`:
  :ref:`Data Exporter <Data Exporter>` exports data to a destination of your choice.
  Use it from the CLI or as a library to save your processed or synthetic data.
22 changes: 22 additions & 0 deletions docs/source/user_guides/cli.rst
@@ -1,2 +1,24 @@
Command Line Interface
==================================================

The command line interface (CLI) is designed to simplify the use of SDG and to make it easier for other programs to call SDG.

There are two main commands in the CLI:

- ``fit``: Fits, fine-tunes, or retrains a model and saves the final model to a specified path.
- ``sample``: Loads an existing model and samples synthetic data.

Because SDG supports a plug-in system, users can list all available components with the ``list-{component}`` commands.

.. Note::

   If you want to use SDG as a library, please refer to :ref:`Use Synthetic Data Generator as a library <Use Synthetic Data Generator as a library>`.

   If you want to extend SDG with your own components, please refer to :ref:`Developer guides for extension <Extented Synthetic Data Generator>`.

CLI for synthetic single-table data
--------------------------------------------------

.. click:: sdgx.cli.main:cli
   :prog: sdgx
   :nested: full
1 change: 1 addition & 0 deletions example/extension/dummyexporter/dummyexporter/__init__.py
@@ -0,0 +1 @@
__version__ = "0.1.0"
15 changes: 15 additions & 0 deletions example/extension/dummyexporter/dummyexporter/dummyexporter.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
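from __future__ import annotations

from sdgx.data_exporters.base import DataExporter
from sdgx.data_exporters.extension import hookimpl


class MyOwnExporter(DataExporter):
    """Placeholder exporter that only demonstrates plugin registration."""

    ...


@hookimpl
def register(manager):
    # Hook implementation: register the exporter class under a name.
    manager.register("MyOwnExporter", MyOwnExporter)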
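The registration flow in this file can be illustrated without installing SDG. The following is a minimal sketch only: `ToyExporterManager` and the stand-in `DataExporter` class are hypothetical simplifications written for this example, not the real `DataExporterManager` or base class, and the lower-casing done by `_normalize_name` is an assumption about how names are matched.

```python
from __future__ import annotations


class DataExporter:
    """Stand-in for sdgx.data_exporters.base.DataExporter (hypothetical, simplified)."""


class MyOwnExporter(DataExporter):
    """Plays the same role as the dummy exporter in this PR."""


class ToyExporterManager:
    """Hypothetical registry; NOT the real DataExporterManager."""

    def __init__(self) -> None:
        self.registered: dict[str, type[DataExporter]] = {}

    def _normalize_name(self, name: str) -> str:
        # Assumption: names are compared case-insensitively.
        return name.strip().lower()

    def register(self, name: str, cls: type[DataExporter]) -> None:
        # Reject classes that do not derive from the exporter base class.
        if not issubclass(cls, DataExporter):
            raise TypeError(f"{cls!r} is not a DataExporter subclass")
        self.registered[self._normalize_name(name)] = cls

    def init(self, name: str, **kwargs) -> DataExporter:
        # Look the class up by normalized name and instantiate it.
        return self.registered[self._normalize_name(name)](**kwargs)


def register(manager: ToyExporterManager) -> None:
    # Mirrors the body of the @hookimpl function in the dummy extension.
    manager.register("MyOwnExporter", MyOwnExporter)


manager = ToyExporterManager()
register(manager)
exporter = manager.init("myownexporter")
```

Keying the registry on a normalized name means a caller can ask for the exporter without matching the exact capitalization used at registration time.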
27 changes: 27 additions & 0 deletions example/extension/dummyexporter/pyproject.toml
@@ -0,0 +1,27 @@
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "dummyexporter"
dependencies = ["sdgx"]
dynamic = ["version"]
requires-python = ">=3.8"
classifiers = [
"Programming Language :: Python :: 3",
'Programming Language :: Python :: 3.8',
'Programming Language :: Python :: 3.9',
'Programming Language :: Python :: 3.10',
'Programming Language :: Python :: 3.11',
]
[project.optional-dependencies]
test = ["pytest"]

[tool.check-manifest]
ignore = [".*"]

[tool.hatch.version]
path = "dummyexporter/__init__.py"

[project.entry-points."sdgx.data_exporter"]
dummyexporter = "dummyexporter.dummyexporter"
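The `[project.entry-points."sdgx.data_exporter"]` table is what lets an installed package advertise a plugin module under a named group. As a rough sketch, such a group can be inspected with the standard library alone; SDG's own loader may work differently, and the helper name `list_plugin_modules` is hypothetical.

```python
from __future__ import annotations

from importlib.metadata import entry_points


def list_plugin_modules(group: str = "sdgx.data_exporter") -> dict[str, str]:
    """Return {entry point name: target module} for one entry point group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points are filtered with select().
        selected = eps.select(group=group)
    else:
        # Python 3.8 / 3.9: entry_points() returns a plain dict keyed by group.
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


# Empty unless a package declaring this group is installed.
found = list_plugin_modules()
```

With the `dummyexporter` package installed, the result would be expected to contain a `dummyexporter` key pointing at `dummyexporter.dummyexporter`, matching the TOML entry above.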
16 changes: 16 additions & 0 deletions example/extension/dummyexporter/tests/test_registed_exporter.py
@@ -0,0 +1,16 @@
import pytest

from sdgx.data_exporters.manager import DataExporterManager


@pytest.fixture
def manager():
    yield DataExporterManager()


def test_registed_exporter(manager: DataExporterManager):
    assert manager._normalize_name("MyOwnExporter") in manager.registed_exporters


if __name__ == "__main__":
    pytest.main(["-vv", "-s", __file__])