Updated GaNDLF documentation for clarity #915

Merged
merged 5 commits into from
Aug 16, 2024
4 changes: 2 additions & 2 deletions docs/getting_started.md
@@ -3,7 +3,7 @@ This document will help you get started with GaNDLF using a few representative e

## Installation

Please follow the [installation instructions](./setup.md) to install GaNDLF. When the installation is complete, you should end up with the shell that looks like the following, which indicates that the GaNDLF virtual environment has been activated:
Follow the [installation instructions](./setup.md) to install GaNDLF. When the installation is complete, you should end up with the following shell, which indicates that the GaNDLF virtual environment has been activated:

```bash
(venv_gandlf) $> ### subsequent commands go here
@@ -23,7 +23,7 @@ A codespace will open in a web-based version of [Visual Studio Code](https://cod

## Sample Data

Sample data will be used for our extensive automated unit tests in all examples. You can download the sample data from [this link](https://upenn.box.com/shared/static/y8162xkq1zz5555ye3pwadry2m2e39bs.zip). Example of how to do this from the terminal is shown below:
Sample data will be used for our extensive automated unit tests in all examples. You can download the sample data from [this link](https://upenn.box.com/shared/static/y8162xkq1zz5555ye3pwadry2m2e39bs.zip). An example is shown below:

```bash
# continue from previous shell
7 changes: 6 additions & 1 deletion docs/index.md
@@ -1,9 +1,14 @@
# GaNDLF

The **G**ener**a**lly **N**uanced **D**eep **L**earning **F**ramework (GaNDLF) for segmentation and classification.
The **G**ener**a**lly **N**uanced **D**eep **L**earning **F**ramework (GaNDLF) for reproducible segmentation and classification.

## Why use GaNDLF?
GaNDLF was developed to lower the barrier to AI, enabling reproducibility, translation, and deployment.
As an out-of-the-box solution, GaNDLF alleviates the need to build from scratch. Users may kickstart their project
by modifying only **a configuration (config) file** that provides guidelines for the envisioned pipeline
and **CSV inputs** that describe the training data.

## Range of GaNDLF functionalities:
- Supports multiple
- Deep Learning model architectures
- Channels/modalities
16 changes: 9 additions & 7 deletions docs/usage.md
@@ -24,7 +24,7 @@ Please follow the [installation instructions](./setup.md#installation) to instal

### Anonymize Data

A major reason why one would want to anonymize data is to ensure that trained models do not inadvertently do not encode protect health information [[1](https://doi.org/10.1145/3436755),[2](https://doi.org/10.1038/s42256-020-0186-1)]. GaNDLF can anonymize single images or a collection of images using the `gandlf anonymizer` command. It can be used as follows:
A major reason why one would want to anonymize data is to ensure that trained models do not inadvertently encode protected health information [[1](https://doi.org/10.1145/3436755),[2](https://doi.org/10.1038/s42256-020-0186-1)]. GaNDLF can anonymize one or multiple images using the `gandlf anonymizer` command as follows:

```bash
# continue from previous shell
@@ -81,7 +81,7 @@ Once these files are present, the patch miner can be run using the following com

### Running preprocessing before training/inference (optional)

Running preprocessing before training/inference is optional, but recommended. It will significantly reduce the computational footprint during training/inference at the expense of larger storage requirements. To run preprocessing before training/inference you can use the following command, which will save the processed data in `./experiment_0/output_dir/` with a new data CSV and the corresponding model configuration:
Running preprocessing before training/inference is optional, but recommended. It will significantly reduce the computational footprint during training/inference at the expense of larger storage requirements. Use the following command, which will save the processed data in `./experiment_0/output_dir/` with a new data CSV and the corresponding model configuration:

```bash
# continue from previous shell
@@ -108,7 +108,7 @@ N,/full/path/N/0.nii.gz,/full/path/N/1.nii.gz,...,/full/path/N/X.nii.gz,/full/pa
**Notes:**

- `Channel` can be substituted with `Modality` or `Image`
- `Label` can be substituted with `Mask` or `Segmentation`and is used to specify the annotation file for segmentation models
- `Label` can be substituted with `Mask` or `Segmentation` and is used to specify the annotation file for segmentation models
- For classification/regression, add a column called `ValueToPredict`. Currently, we are supporting only a single value prediction per model.
- Only a single `Label` or `ValueToPredict` header should be passed
- Multiple segmentation classes should be in a single file with unique label numbers.
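
As a rough illustration of the layout these notes describe, the sketch below writes a small data CSV with Python's standard `csv` module. The header names `SubjectID` and `Channel_<i>` and every file path are assumptions chosen for the example; `build_training_csv` is a hypothetical helper, not part of GaNDLF.

```python
import csv
import io


def build_training_csv(subjects, n_channels, label_header="Label"):
    """Return CSV text with one row per subject (hypothetical helper).

    Assumed layout: SubjectID, Channel_0..Channel_{n-1}, then one label column.
    Pass label_header="ValueToPredict" for classification/regression.
    """
    header = ["SubjectID"] + [f"Channel_{i}" for i in range(n_channels)] + [label_header]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    for subject_id, channel_paths, label in subjects:
        # Every subject must provide exactly one path per channel.
        if len(channel_paths) != n_channels:
            raise ValueError(f"{subject_id}: expected {n_channels} channel paths")
        writer.writerow([subject_id, *channel_paths, label])
    return buffer.getvalue()


# Placeholder paths; replace them with the real locations of your images.
example_csv = build_training_csv(
    [("001", ["/full/path/001/0.nii.gz", "/full/path/001/1.nii.gz"], "/full/path/001/seg.nii.gz")],
    n_channels=2,
)
print(example_csv)
```

Only one label column is written per file, in line with the note that a single `Label` or `ValueToPredict` header should be passed.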
@@ -152,14 +152,14 @@ The following command shows how the script works:
(venv_gandlf) $> gandlf construct-csv \
# -h, --help Show help message and exit
-i $DATA_DIRECTORY # this is the main data directory
-c _t1.nii.gz,_t1ce.nii.gz,_t2.nii.gz,_flair.nii.gz \ # an example image identifier for 4 structural brain MR sequences for BraTS, and can be changed based on your data
-c _t1.nii.gz,_t1ce.nii.gz,_t2.nii.gz,_flair.nii.gz \ # an example image identifier for 4 structural brain MR sequences for BraTS, and can be changed based on your data. In the simplest case of a single modality, a ".nii.gz" will suffice
-l _seg.nii.gz \ # an example label identifier - not needed for regression/classification, and can be changed based on your data
-o ./experiment_0/train_data.csv # output CSV to be used for training
```

**Notes**:

- For classification/regression, add a column called `ValueToPredict`. Currently, we are supporting only a single value prediction per model.
- For classification/regression, add a column called `ValueToPredict`. Currently, we support only a single value prediction per model.
- `SubjectID` or `PatientName` is used to ensure that the randomized split is done per-subject rather than per-image.
- For data arrangement different to what is described above, a customized script will need to be written to generate the CSV, or you can enter the data manually into the CSV.
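
To see why a subject identifier matters for splitting, here is a minimal sketch of a per-subject split. It is not GaNDLF's implementation; `split_by_subject`, the 80/20 fraction, and the example rows are assumptions for illustration only. The point is that all images from one subject land on the same side of the split, so no subject is shared between the two sets.

```python
import random
from collections import defaultdict


def split_by_subject(rows, train_fraction=0.8, seed=0):
    """Split rows so that every row of a given SubjectID stays together."""
    by_subject = defaultdict(list)
    for row in rows:
        by_subject[row["SubjectID"]].append(row)

    # Shuffle subjects (not individual images) with a fixed seed.
    subject_ids = sorted(by_subject)
    random.Random(seed).shuffle(subject_ids)

    n_train = int(round(train_fraction * len(subject_ids)))
    train = [r for sid in subject_ids[:n_train] for r in by_subject[sid]]
    held_out = [r for sid in subject_ids[n_train:] for r in by_subject[sid]]
    return train, held_out


# Five hypothetical subjects with two images each.
rows = [
    {"SubjectID": f"S{i // 2}", "Channel_0": f"/full/path/S{i // 2}/scan_{i % 2}.nii.gz"}
    for i in range(10)
]
train_rows, held_out_rows = split_by_subject(rows)
```

A per-image split of the same rows could place one scan of a subject in training and the other in validation, which would leak subject-specific information across the split.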

@@ -179,13 +179,15 @@ To split the data CSV into training, validation, and testing CSVs, the `gandlf s

## Customize the Training

GaNDLF requires a YAML-based configuration that controls various aspects of the training/inference process. There are multiple samples for users to start as their baseline for further customization. A list of the available samples is presented as follows:
Adapting GaNDLF to your needs boils down to modifying a YAML-based configuration file which controls the parameters of training and inference. Below is a list of available samples for users to start as their baseline for further customization:

- [Sample showing all the available options](https://github.com/mlcommons/GaNDLF/blob/master/samples/config_all_options.yaml)
- [Segmentation example](https://github.com/mlcommons/GaNDLF/blob/master/samples/config_segmentation_brats.yaml)
- [Regression example](https://github.com/mlcommons/GaNDLF/blob/master/samples/config_regression.yaml)
- [Classification example](https://github.com/mlcommons/GaNDLF/blob/master/samples/config_classification.yaml)

To find **all the parameters** a GaNDLF config may modify, consult the following file:
- [All available options](https://github.com/mlcommons/GaNDLF/blob/master/samples/config_all_options.yaml)

**Notes**:

- More details on the configuration options are available in the [customization page](customize.md).