Commit

Merge pull request #9 from robertknight/poetry
Switch package management to Poetry and update dependencies
robertknight committed Jan 30, 2024
2 parents 9c6c72a + e2c43f5 commit e2b30fe
Showing 7 changed files with 1,076 additions and 815 deletions.
12 changes: 7 additions & 5 deletions .github/workflows/ci.yml
@@ -19,11 +19,13 @@ jobs:
   uses: actions/setup-python@v4
   with:
     python-version: ${{ matrix.python-version }}
-- name: Install pipenv
-  run: pip install pipenv
+- name: Install Poetry
+  run: pipx install poetry
 - name: Install dependencies
   run: |
-    pipenv install --dev
-    pipenv run pip install torch torchvision
+    poetry install
+    poetry run pip install torch torchvision
 - name: Check formatting and types
-  run: pipenv run qa
+  run: |
+    poetry run black --check .
+    poetry run mypy ocrs_models
3 changes: 3 additions & 0 deletions Makefile
@@ -0,0 +1,3 @@
+qa:
+	poetry run black --check .
+	poetry run mypy ocrs_models
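
Note on the `qa` target: make runs the two recipe lines in order and stops at the first one that exits non-zero. For anyone without make available, the same behaviour can be sketched in Python. This is only an illustration: `QA_COMMANDS` and `run_checks` are hypothetical names and are not part of this commit.

```python
import subprocess

# The two checks from the `qa` target, written as argument lists.
# (Hypothetical helper; the commit itself only adds the Makefile target.)
QA_COMMANDS = [
    ["poetry", "run", "black", "--check", "."],
    ["poetry", "run", "mypy", "ocrs_models"],
]


def run_checks(commands, run=subprocess.run):
    """Run each command in order and stop at the first failure, as make does.

    Returns 0 if every command succeeded, otherwise the exit code of the
    first command that failed.
    """
    for cmd in commands:
        result = run(cmd)
        if result.returncode != 0:
            return result.returncode
    return 0
```

Calling `run_checks(QA_COMMANDS)` from the repository root would then mirror `make qa`, assuming Poetry is on the PATH. The `run` parameter exists only so the function can be exercised without invoking Poetry.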
26 changes: 0 additions & 26 deletions Pipfile

This file was deleted.

775 changes: 0 additions & 775 deletions Pipfile.lock

This file was deleted.

18 changes: 9 additions & 9 deletions docs/training.md
@@ -59,18 +59,18 @@ tar -xf test.tgz

 ## Set up the training environment

-1. Install [Pipenv](https://pipenv.pypa.io/en/latest/)
+1. Install [Poetry](https://python-poetry.org)
 2. Install dependencies, except for PyTorch:

 ```
-pipenv install --dev
+poetry install
 ```

 3. Install the appropriate version of PyTorch for your system, in the virtualenv
-created by pipenv:
+created by Poetry:

 ```
-pipenv run pip install torch torchvision
+poetry run pip install torch torchvision
 ```

 See https://pytorch.org/get-started/locally/ for an appropriate pip command
@@ -79,7 +79,7 @@ tar -xf test.tgz
 4. Start a dummy training run of text detection training to verify everything is working:

 ```
-pipenv run python -m ocrs_models.train_detection hiertext datasets/hiertext/ --max-images 100
+poetry run python -m ocrs_models.train_detection hiertext datasets/hiertext/ --max-images 100
 ```

 Wait for one successful epoch of training and validation to complete and then
@@ -101,7 +101,7 @@ export WANDB_API_KEY=<your_api_key>
 To launch a training run for the text detection model, run:

 ```
-pipenv run python -m ocrs_models.train_detection hiertext datasets/hiertext/ \
+poetry run python -m ocrs_models.train_detection hiertext datasets/hiertext/ \
 --max-epochs 50 \
 --batch-size 28
 ```
@@ -123,7 +123,7 @@ As training progresses, the latest checkpoint will be saved to
 model to ONNX via:

 ```
-pipenv run python -m ocrs_models.train_detection hiertext datasets/hiertext/ \
+poetry run python -m ocrs_models.train_detection hiertext datasets/hiertext/ \
 --checkpoint text-detection-checkpoint.pt \
 --export text-detection.onnx
 ```
@@ -151,7 +151,7 @@ ocrs --detect-model custom-detection-model.rten image.jpg
 To launch a training run for the text recognition model, run:

 ```
-pipenv run python -m ocrs_models.train_rec hiertext datasets/hiertext/ \
+poetry run python -m ocrs_models.train_rec hiertext datasets/hiertext/ \
 --max-epochs 50 \
 --batch-size 250
 ```
@@ -171,7 +171,7 @@ As training progresses, the latest checkpoint will be saved to
 ONNX via:

 ```
-pipenv run python -m ocrs_models.train_rec hiertext datasets/hiertext/ \
+poetry run python -m ocrs_models.train_rec hiertext datasets/hiertext/ \
 --checkpoint text-rec.pt \
 --export text-recognition.onnx
 ```
1,030 changes: 1,030 additions & 0 deletions poetry.lock

Large diffs are not rendered by default.

27 changes: 27 additions & 0 deletions pyproject.toml
@@ -0,0 +1,27 @@
+[tool.poetry]
+name = "ocrs-models"
+version = "0.1.0"
+description = "PyTorch models for the Ocrs OCR engine"
+authors = ["Robert Knight <robertknight@gmail.com>"]
+license = "MIT OR Apache-2.0"
+readme = "README.md"
+
+[tool.poetry.dependencies]
+python = "^3.10"
+numpy = "^1.26.3"
+pillow = "^10.2.0"
+tqdm = "^4.66.1"
+opencv-python = "^4.9.0.80"
+shapely = "^2.0.2"
+wandb = "^0.16.2"
+pylev = "^1.4.0"
+onnx = "^1.15.0"
+
+[tool.poetry.group.dev.dependencies]
+black = "^24.1.1"
+mypy = "^1.8.0"
+types-Pillow = "^10.2.0.20240125"
+
+[build-system]
+requires = ["poetry-core"]
+build-backend = "poetry.core.masonry.api"
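
Note on the version constraints in `pyproject.toml`: every dependency uses Poetry's caret form, which allows updates that leave the leftmost non-zero version component unchanged. The sketch below expands a caret constraint into an explicit range to show what that means for a few of the entries above. It is a simplified illustration, not Poetry's own implementation: `caret_range` is a hypothetical helper and it assumes plain dotted numeric versions with no pre-release or local tags.

```python
def caret_range(constraint: str) -> str:
    """Expand a caret constraint such as "^1.26.3" into ">=lower,<upper".

    Simplified sketch: handles dotted integer versions only.
    """
    if not constraint.startswith("^"):
        raise ValueError(f"not a caret constraint: {constraint!r}")
    parts = [int(p) for p in constraint[1:].split(".")]
    # The upper bound bumps the leftmost non-zero component. If every
    # component is zero, bump the last component that was written.
    bump = next((i for i, p in enumerate(parts) if p != 0), len(parts) - 1)
    upper = parts[:bump] + [parts[bump] + 1] + [0] * (len(parts) - bump - 1)
    lower_text = ".".join(str(p) for p in parts)
    upper_text = ".".join(str(p) for p in upper)
    return f">={lower_text},<{upper_text}"


for name, spec in [("python", "^3.10"), ("numpy", "^1.26.3"), ("wandb", "^0.16.2")]:
    print(f"{name}: {caret_range(spec)}")
```

The `wandb` entry is the one to watch: because its major version is 0, `^0.16.2` only admits 0.16.x releases, whereas `^1.26.3` for `numpy` admits any 1.x release from 1.26.3 onward.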
