Forest Recovery Digital Companion Machine Learning Pipeline Repository

This repository contains the code for the machine learning models we use. It is one part of our product's end-to-end (E2E) pipeline.

graph LR
    A[Data Collection] --> B[FRDC-ML] --> C[FRDC-UI]

This repository is currently a heavy work in progress (WIP).

Getting Started

We highly recommend reading our website documentation. It contains tutorials and docs on how to use our modules.

Dev Info

    src/                    # All relevant code
        frdc/               # Package/Component Level code
            load/           # Image I/O
            preprocess/     # Image Preprocessing
            train/          # ML Training
            evaluate/       # Model Evaluation
            ...             # Other components, including the pipeline entry point

    tests/                  # PyTest Tests
        model_tests/        # Tests for each model
        integration_tests/  # Tests that run the entire pipeline
        unit_tests/         # Tests for each component

    poetry.lock             # Poetry managed environment file
    pyproject.toml          # Project-level information: requirements, settings, name, deployment info

    .github/                # GitHub Actions

Our Architecture

This is a classic, simple Python package architecture. However, we HEAVILY EMPHASIZE encapsulation of each stage: no data should ever IMPLICITLY persist across stages. Each stage receives everything it needs as explicit inputs and hands its results to the next stage as explicit outputs.

To illustrate this, take a look at how tests/model_tests/chestnut_dec_may/ is written. It pulls in relevant modules from each stage and constructs a pipeline.
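
As a rough illustration of the encapsulation principle, here is a minimal sketch of a pipeline in which every stage is a pure function and all data crosses stage boundaries explicitly. All names here (`Bands`, `Features`, `load`, `preprocess`, `train`, `evaluate`, `run_pipeline`) are hypothetical stand-ins chosen to mirror the folder names above. They are not the actual frdc API, and the "model" is a toy.

```python
from dataclasses import dataclass

# NOTE: every name in this sketch is hypothetical, not the real frdc API.


@dataclass(frozen=True)
class Bands:
    """Output of the load stage: raw values, one list per band."""
    values: list[list[float]]


@dataclass(frozen=True)
class Features:
    """Output of the preprocess stage: one scaled vector per band."""
    vectors: list[list[float]]


def load(raw: list[list[float]]) -> Bands:
    # Stand-in for image I/O. Copy the rows so later stages never
    # share mutable state with the caller.
    return Bands(values=[list(row) for row in raw])


def preprocess(bands: Bands) -> Features:
    # Min-max scale each band to [0, 1], using only the argument passed in.
    vectors = []
    for row in bands.values:
        lo, hi = min(row), max(row)
        span = (hi - lo) or 1.0  # avoid division by zero on constant bands
        vectors.append([(v - lo) / span for v in row])
    return Features(vectors=vectors)


def train(features: Features) -> list[float]:
    # Toy "model": the mean of each feature vector.
    return [sum(v) / len(v) for v in features.vectors]


def evaluate(model: list[float], features: Features) -> float:
    # Mean absolute deviation of the features from the per-band mean.
    errors = [abs(x - m) for m, v in zip(model, features.vectors) for x in v]
    return sum(errors) / len(errors)


def run_pipeline(raw: list[list[float]]) -> float:
    # The pipeline is just explicit hand-offs: no globals, no caches,
    # no module-level state carried from one stage to the next.
    bands = load(raw)
    features = preprocess(bands)
    model = train(features)
    return evaluate(model, features)
```

Because each stage depends only on its arguments, any stage can be tested or swapped in isolation, and a test can assemble the whole pipeline by chaining the calls.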


Pre-commit Hooks

We use Black and Flake8 as our pre-commit hooks. To install them, run the following commands:

poetry install
pre-commit install

If you're using pip instead of poetry, run the following commands:

pip install pre-commit
pre-commit install

Alternatively, you can configure Black in your own IDE.