Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding CLI and Plugin system #47

Merged
merged 12 commits into from
Dec 4, 2023
46 changes: 21 additions & 25 deletions .github/workflows/build-latest-images.yml
Original file line number Diff line number Diff line change
Expand Up @@ -8,33 +8,29 @@ on:
jobs:
deploy:
strategy:
fail-fast: true
fail-fast: true

runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v3

# Build docker images
-
name: Set up QEMU
uses: docker/setup-qemu-action@v2
-
name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
-
name: Login to DockerHub
uses: docker/login-action@v2
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
-
name: Build and push
id: docker_build_latest
uses: docker/build-push-action@v4
with:
context: .
platforms: linux/amd64,linux/arm64/v8
file: ./docker/Dockerfile
push: true
tags: ${{ secrets.DOCKERHUB_USERNAME }}/sdgx:latest
# Build docker images
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
- name: Login to DockerHub
uses: docker/login-action@v2
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Build and push
id: docker_build_latest
uses: docker/build-push-action@v4
with:
context: .
platforms: linux/amd64,linux/arm64/v8
file: ./docker/Dockerfile
push: true
tags: ${{ secrets.DOCKERHUB_USERNAME }}/sdgx:latest
37 changes: 17 additions & 20 deletions .github/workflows/docker-image.yml
Original file line number Diff line number Diff line change
Expand Up @@ -2,32 +2,29 @@ name: Testing docker image build

on:
pull_request:
branches: [ "master", "main" ]
branches: ["master", "main"]

jobs:
deploy:
strategy:
fail-fast: true
fail-fast: true

runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v3

# Build docker images
-
name: Set up QEMU
uses: docker/setup-qemu-action@v2
-
name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
-
name: Build image
id: docker_build_test
uses: docker/build-push-action@v4
with:
context: .
platforms: linux/amd64,linux/arm64/v8
file: ./docker/Dockerfile
push: false
tags: build
# Build docker images
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
- name: Build image
id: docker_build_test
uses: docker/build-push-action@v4
with:
context: .
platforms: linux/amd64,linux/arm64/v8
file: ./docker/Dockerfile
push: false
tags: build
43 changes: 19 additions & 24 deletions .github/workflows/docker-publish.yml
Original file line number Diff line number Diff line change
Expand Up @@ -10,30 +10,25 @@ permissions:

jobs:
deploy:

runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v3
-
name: Set up QEMU
uses: docker/setup-qemu-action@v2
-
name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
-
name: Login to DockerHub
uses: docker/login-action@v2
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
-
name: Build and push
id: docker_build_release
uses: docker/build-push-action@v4
with:
context: .
platforms: linux/amd64,linux/arm64/v8
file: ./docker/Dockerfile
push: true
tags: ${{ secrets.DOCKERHUB_USERNAME }}/sdgx:${{ github.ref_name }}
- uses: actions/checkout@v3
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
- name: Login to DockerHub
uses: docker/login-action@v2
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Build and push
id: docker_build_release
uses: docker/build-push-action@v4
with:
context: .
platforms: linux/amd64,linux/arm64/v8
file: ./docker/Dockerfile
push: true
tags: ${{ secrets.DOCKERHUB_USERNAME }}/sdgx:${{ github.ref_name }}
6 changes: 3 additions & 3 deletions .github/workflows/lint.yml
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,6 @@ jobs:
pre-commit:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: actions/setup-python@v3
- uses: pre-commit/action@v3.0.0
- uses: actions/checkout@v3
- uses: actions/setup-python@v3
- uses: pre-commit/action@v3.0.0
49 changes: 24 additions & 25 deletions .github/workflows/publish.yml
Original file line number Diff line number Diff line change
Expand Up @@ -9,31 +9,30 @@ permissions:

jobs:
deploy:

runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v3
with:
python-version: '3.x'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install build twine hatch
- name: Build package
run: python -m build
- name: Publish package
uses: pypa/gh-action-pypi-publish@27b31702a0e7fc50959f5ad993c78deac1bdfc29
with:
user: __token__
password: ${{ secrets.PYPI_API_TOKEN }}
- name: Upload dists to release
uses: svenstaro/upload-release-action@v2
with:
repo_token: ${{ secrets.GITHUB_TOKEN }}
file: dist/*
file_glob: true
tag: ${{ github.ref }}
overwrite: true
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v3
with:
python-version: "3.x"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install build twine hatch
- name: Build package
run: python -m build
- name: Publish package
uses: pypa/gh-action-pypi-publish@27b31702a0e7fc50959f5ad993c78deac1bdfc29
with:
user: __token__
password: ${{ secrets.PYPI_API_TOKEN }}
- name: Upload dists to release
uses: svenstaro/upload-release-action@v2
with:
repo_token: ${{ secrets.GITHUB_TOKEN }}
file: dist/*
file_glob: true
tag: ${{ github.ref }}
overwrite: true
39 changes: 19 additions & 20 deletions .github/workflows/python-package.yml
Original file line number Diff line number Diff line change
Expand Up @@ -2,34 +2,33 @@ name: Python package

on:
push:
branches: [ "master", "main" ]
branches: ["master", "main"]
pull_request:
branches: [ "master", "main" ]
branches: ["master", "main"]

jobs:
build:

runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
python-version: ["3.8", "3.9", "3.10", "3.11"]

steps:
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v3
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip
python -m pip install -e .[test]
- name: Test with pytest
run: |
pytest -vv --cov=sdgx
- name: Install dependencies for building
run: |
pip install build twine hatch
- name: Test building
run: python -m build
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v3
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip
python -m pip install -e .[test]
- name: Test with pytest
run: |
pytest -vv --cov=sdgx
- name: Install dependencies for building
run: |
pip install build twine hatch
- name: Test building
run: python -m build
27 changes: 27 additions & 0 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -26,3 +26,30 @@ repos:
- id: isort
files: \.py$
args: [--profile=black]

- repo: https://github.com/executablebooks/mdformat
rev: 0.7.17
hooks:
- id: mdformat
additional_dependencies:
[mdformat-gfm, mdformat-frontmatter, mdformat-footnote]

- repo: https://github.com/pre-commit/mirrors-prettier
rev: "v3.1.0"
hooks:
- id: prettier
types_or: [yaml, html, json]

- repo: https://github.com/pre-commit/pygrep-hooks
rev: "v1.10.0"
hooks:
- id: rst-backticks
- id: rst-directive-colons
- id: rst-inline-touching-normal

- repo: https://github.com/pre-commit/mirrors-mypy
rev: "v1.7.1"
hooks:
- id: mypy
files: duetector
stages: [manual]
26 changes: 11 additions & 15 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,20 +19,18 @@ There is no correspondence between synthetic data and real data, and it is not s
In practical applications, there is no need to worry about the risk of privacy leakage.
High-quality synthetic data can also be used in various fields such as data opening, model training and debugging, system development and testing, etc.


## 🎉 Features

+ high performance
+ SDG supports a variety of statistical methods, achieving up to 120 times faster execution speed, and reduces dependence on GPU devices;
+ SDG is optimized for large dataset, consumes less memory than other frameworks or GAN-based algorithms;
+ SDG will continue to track the latest progress in academia and industry, and introduce and support excellent algorithms and models in a timely manner.
+ Rapid deployment in production environment
+ Optimize for actual production needs, improve model performance, reduce memory overhead, and support practical features such as single machine multiple cards, multiple machines multiple cards;
+ Provide technologies required for production environments such as automated deployment, containerization, automated monitoring and alarming, and support rapid one-key deployment;
+ Specially optimized for load balancing and fault tolerance to improve high availability.
+ Privacy enhancements:
+ SDG supports differential privacy, anonymization and other methods to enhance the security of synthetic data.

- high performance
- SDG supports a variety of statistical methods, achieving up to 120 times faster execution speed, and reduces dependence on GPU devices;
- SDG is optimized for large dataset, consumes less memory than other frameworks or GAN-based algorithms;
- SDG will continue to track the latest progress in academia and industry, and introduce and support excellent algorithms and models in a timely manner.
- Rapid deployment in production environment
- Optimize for actual production needs, improve model performance, reduce memory overhead, and support practical features such as single machine multiple cards, multiple machines multiple cards;
- Provide technologies required for production environments such as automated deployment, containerization, automated monitoring and alarming, and support rapid one-key deployment;
- Specially optimized for load balancing and fault tolerance to improve high availability.
- Privacy enhancements:
- SDG supports differential privacy, anonymization and other methods to enhance the security of synthetic data.

## 🔛 Quick Start

Expand Down Expand Up @@ -117,13 +115,12 @@ Synthetic data are as follows:
9 28 State-gov 837932 ... 99 United-States <=50K
```


## 🤝 Join Community

The SDG project was initiated by **Institute of Data Security, Harbin Institute of Technology**. If you are interested in out project, welcome to join our community. We welcome organizations, teams, and individuals who share our commitment to data protection and security through open source:

- Submit an issue by viewing [View First Good Issue](https://github.com/hitsz-ids/synthetic-data-generator/issues/new) or submit a Pull Request。
- For developer documentation, please see [Develop documents].(./docs/develop/readme.md)
- For developer documentation, please see \[Develop documents\].(./docs/develop/readme.md)

## 👩‍🎓 Related Work

Expand All @@ -142,7 +139,6 @@ The SDG project was initiated by **Institute of Data Security, Harbin Institute
- [Rossmann](https://www.kaggle.com/competitions/rossmann-store-sales/data)
- [Telstra](https://www.kaggle.com/competitions/telstra-recruiting-network/data)


## 📄 License

The SDG open source project uses Apache-2.0 license, please refer to the [LICENSE](https://github.com/hitsz-ids/synthetic-data-generator/blob/main/LICENSE).
Loading