Merge pull request #7 from allenai/refactor-upgrade

Refactor upgrade
allenai · Jan 4, 2024 · 190095c
2 parents 1d287fe + 55fd134
commit 190095c

Showing 3 changed files with 19 additions and 161 deletions.
diff --git a/README.md b/README.md
@@ -24,19 +24,7 @@ This is the first planned release of these models. We anticipate further updates
 
 ## **Dataset**
 
-The Sentinel-1 models in this repository have been trained on a dataset which we are also releasing, and will therefore describe in more detail here.
-
-**Note**: If you are only interested in performing Sentinel-1 inference, you can safely skip this section.
-
-The _metadata_ for this dataset dataset consists of:
-
-1. Approximately 60,000 point labels associated with Sentinel-1 imagery, corresponding to the centers of vessels in the imagery. These labels were produced via hand annotations by subject matter experts.
-2. Attribute labels for a subset of approximately 20,000 of the point labels. The attribute labels describe additional information about the vessels in the imagery, such as their length, width, heading, speed, and vessel type. These attributes were produced by cross-referencing detection labels with AIS records.
-
-All of this metadata is stored in a sqlite database included with this repository: [`./data/metadata.sqlite3`](./data/metadata.sqlite3).
-
-A brief report of some summary statistics of the dataset is also included: [`./data/reports/dataset_report.ipynb`](./data/reports/dataset_report.ipynb).
-<br>
+The annotations are included in data/metadata.sqlite3 with the following schema
 
 ### Metadata Schema
 
@@ -297,23 +285,12 @@ or use an x86 VM to build.
 
 To use the docker image:
 
-1.  Acquire the docker image.
-
-    Pull the latest container from Github container registry:
-
-    ```bash
-    export REPO_OWNER=your-repo-owner-name
-    export REPO_NAME=sentinel-vessel-detection
-    export IMAGE_TAG=latest
-    docker pull ghcr.io/$REPO_OWNER/$REPO_NAME/vessel-detection-download:$IMAGE_TAG
-    ```
-
-    or build it from source by running the following from the top level of this repo:
+1.  Build the docker image.
 
     ```bash
     export GIT_HASH=$(git rev-parse --short HEAD)
     export IMAGE_TAG=$GIT_HASH
-    docker build . -f docker/downloader.dockerfile -t vessel-detection-download:$IMAGE_TAG
+    docker build . -f dockerfile -t vessel-detection-sentinels:$IMAGE_TAG
     ```
 
 2.  Create a [Copernicus Hub](https://scihub.copernicus.eu/dhus/#/home) account, and export your user credentials:
@@ -375,7 +352,7 @@ run specified in the default config [./data/cfg/](./data/config) on a particular
 
 We describe below how to set up an environment on your machine for training (either manually, or via docker).
 
-**Note:** Our docker images implicitly assume an x86 architechture. While you may have luck building these images on other architechtures ( e.g. by using docker's `--platform` arg),
+**Note:** Our docker images implicitly assume an x86 architecture. While you may have luck building these images on other architectures (e.g. by using docker's `--platform` arg),
 we have not tested this functionality. For example, we have not tested these builds on ARM machines.
 If you have trouble building the image from source on your machine, you can pull it from ghcr, use the scripts without docker,
 or use an x86 VM to build.
@@ -482,22 +459,11 @@ saved weights, and training and model config. Here `{model_name}` will indicate
 
 You can also use our docker container to avoid setting up the necessary environment manually.
 
-1.  Acquire the docker image.
-
-    Pull the latest container from Github container registry:
-
-    ```bash
-    export REPO_OWNER=your-repo-owner-name
-    export REPO_NAME=sentinel-vessel-detection
-    export IMAGE_TAG=latest
-    docker pull ghcr.io/$REPO_OWNER/$REPO_NAME/vessel-detection-train:$IMAGE_TAG
-    ```
-
-    or build it from source by running the following from the top level of this repo:
+1.  Build the docker image.
 
     ```bash
     export IMAGE_TAG=$(git rev-parse --short HEAD)
-    docker build . -f docker/train.dockerfile -t vessel-detection-train:$IMAGE_TAG
+    docker build . -f Dockerfile -t vessel-detection-sentinels:$IMAGE_TAG
     ```
 
 2.  Place the entire `preprocess` folder generated by the data download script, as well as the [`./data/metadata.sqlite3`](./data/metadata.sqlite3) file, somewhere accessible on your machine.
@@ -589,12 +555,7 @@ saved weights, and training and model config. Here `{model_name}` will indicate
 
 You can perform inference right off the bat with our provided trained weights, or with weights you produce from running our training scripts.
 
-The script `detect.py` allows one to perform inference on a raw (but de-compressed) Sentinel-1 and Sentinel-2 image products. It has
-a few command line options, which you can read about in the source file, or using
-
-```bash
-python3 detect.py --help
-```
+The script `src/main.py` allows one to perform inference on a raw (but de-compressed) Sentinel-1 and Sentinel-2 image products.
 
 ### System Requirements and Performance
 
@@ -678,62 +639,13 @@ To get started in your own environment:
 5.  Install the python packages required for training via pip:
 
     ```bash
-    pip install -r inference_requirements.txt
+    pip install -r requirements.txt
     ```
 
 6.  Run the inference script for the detection model.
 
     To run inference on a single image, without access to historical overlaps:
 
-    ```bash
-    python detect.py \
-    --raw_path=data/ \
-    --scratch_path=data/scratch/ \
-    --output=data/output/ \
-    --scene_id=S1B_IW_GRDH_1SDV_20211130T025211_20211130T025236_029811_038EEF_D350.SAFE \
-    --conf=.9 \
-    --nms_thresh=10 \
-    --save_crops=True \
-    --detector_model_dir=data/models/frcnn_cmp2/3dff445 \
-    --postprocess_model_dir=data/models/attr/c34aa37 \
-    --catalog=sentinel1
-    ```
-
-    To run inference on a single image, using additional context from a single historical overlap of the image:
-
-    ```bash
-    python detect.py \
-    --raw_path=data/ \
-    --scratch_path=data/scratch/ \
-    --output=data/output/ \
-    --scene_id=S1B_IW_GRDH_1SDV_20211130T025211_20211130T025236_029811_038EEF_D350.SAFE \
-    --historical1=S1B_IW_GRDH_1SDV_20211118T025212_20211118T025237_029636_03896D_FCBE.SAFE \
-    --conf=.9 \
-    --nms_thresh=10 \
-    --save_crops=True \
-    --detector_model_dir=data/models/frcnn_cmp2/3dff445 \
-    --postprocess_model_dir=data/models/attr/c34aa37 \
-    --catalog=sentinel1
-    ```
-
-    To run inference on a single image, using additional context from two historical overlaps of the image:
-
-    ```bash
-    python detect.py \
-    --raw_path=data/ \
-    --scratch_path=data/scratch/ \
-    --output=data/output/ \
-    --scene_id=S1B_IW_GRDH_1SDV_20211130T025211_20211130T025236_029811_038EEF_D350.SAFE \
-    --historical1=S1B_IW_GRDH_1SDV_20211118T025212_20211118T025237_029636_03896D_FCBE.SAFE \
-    --historical2=S1B_IW_GRDH_1SDV_20211106T025212_20211106T025237_029461_03840B_6E73.SAFE \
-    --conf=.9 \
-    --nms_thresh=10 \
-    --save_crops=True \
-    --detector_model_dir=data/models/frcnn_cmp2/3dff445 \
-    --postprocess_model_dir=data/models/attr/c34aa37 \
-    --catalog=sentinel1
-    ```
-
 ### Sentinel-2
 
 1.  Download a Sentinel-2 scene (and optionally two historical overlaps of that scene) from the Copernicus Hub UI or API, and place it in a folder of your choosing.
@@ -788,62 +700,13 @@ To get started in your own environment:
 5.  Install the python packages required for training via pip:
 
     ```bash
-    pip install -r inference_requirements.txt
+    pip install -r requirements.txt
     ```
 
 6.  Run the inference script for the detection model.
 
     To run inference on a single image, without access to historical overlaps:
 
-    ```bash
-    python detect.py \
-    --raw_path=data/ \
-    --scratch_path=data/scratch/ \
-    --output=data/output/ \
-    --scene_id=S2A_MSIL1C_20230108T060231_N0509_R091_T42RUN_20230108T062956.SAFE \
-    --conf=.9 \
-    --nms_thresh=10 \
-    --save_crops=True \
-    --detector_model_dir=data/models/frcnn_cmp2/15cddd5-sentinel2-swinv2-small-fpn \
-    --postprocess_model_dir=data/models/attr/e609150-sentinel2-attr-resnet \
-    --catalog=sentinel2
-    ```
-
-    To run inference on a single image, using additional context from a single historical overlap of the image:
-
-    ```bash
-    python detect.py \
-    --raw_path=data/ \
-    --scratch_path=data/scratch/ \
-    --output=data/output/ \
-    --scene_id=S2A_MSIL1C_20230108T060231_N0509_R091_T42RUN_20230108T062956.SAFE \
-    --historical1=S2A_MSIL1C_20230111T061221_N0509_R134_T42RUN_20230111T064344.SAFE \
-    --conf=.9 \
-    --nms_thresh=10 \
-    --save_crops=True \
-    --detector_model_dir=data/models/frcnn_cmp2/15cddd5-sentinel2-swinv2-small-fpn \
-    --postprocess_model_dir=data/models/attr/e609150-sentinel2-attr-resnet \
-    --catalog=sentinel2
-    ```
-
-    To run inference on a single image, using additional context from two historical overlaps of the image:
-
-    ```bash
-    python detect.py \
-    --raw_path=data/ \
-    --scratch_path=data/scratch/ \
-    --output=data/output/ \
-    --scene_id=S2A_MSIL1C_20230108T060231_N0509_R091_T42RUN_20230108T062956.SAFE \
-    --historical1=S2A_MSIL1C_20230111T061221_N0509_R134_T42RUN_20230111T064344.SAFE \
-    --historical2=S2B_MSIL1C_20230106T061239_N0509_R134_T42RUN_20230106T063926.SAFE \
-    --conf=.9 \
-    --nms_thresh=10 \
-    --save_crops=True \
-    --detector_model_dir=data/models/frcnn_cmp2/15cddd5-sentinel2-swinv2-small-fpn \
-    --postprocess_model_dir=data/models/attr/e609150-sentinel2-attr-resnet \
-    --catalog=sentinel2
-    ```
-
     ### Notes (common to both imagery sources):
 
     Here `--raw_path` must point to a directory containing the directory specified by `--scene_id` (and `--historical1` and `--historical2`, if they are provided).
@@ -884,7 +747,7 @@ or use an x86 VM to build.
 
     ```bash
     export IMAGE_TAG=$(git rev-parse --short HEAD)
-    docker build . -f docker/inference.dockerfile -t vessel-detection:$IMAGE_TAG
+    docker build . -f Dockerfile -t vessel-detection:$IMAGE_TAG
     ```
 
 2.  Prepare a machine with at least 16GB RAM, and a GPU w/ >= 8GB memory.
@@ -1133,4 +996,4 @@ Running the inference script will produce at most three types of artifact in the
 1. Skylight-ML (especially Mike Gartner, who wrote most of this codebase)
 2. PRIOR at AI2 (especially Favyen Bastani and Piper Wolters) contributed considerable expertise to the foundational architectures of both models
 3. European Space Agency for making Sentinel-1 and Sentinel-2 data available to the public.
-4. The [Defense Innovation Unit](https://www.diu.mil/) (DIU) who built the foundation for this work through [xView3](https://iuu.xview.us/). 
+4. The [Defense Innovation Unit](https://www.diu.mil/) (DIU) who built the foundation for this work through [xView3](https://iuu.xview.us/).
diff --git a/example/s1_request.py b/example/s1_request.py
@@ -1,18 +1,5 @@
 """ Use this script to inference the API with locally stored data
 
-This is the docker command that was used before the API was created:
-
-docker run --shm-size 16G --gpus='"device=0"' \
--v /path/to/your/data:/home/vessel_detection/data vessel-detection:$IMAGE_TAG \
---raw_path=/home/vessel_detection/data/ \
---scratch_path=/home/vessel_detection/data/scratch/ \
---output=/home/vessel_detection/data/output/ \
---detector_model_dir=/home/vessel_detection/data/models/frcnn_cmp2/3dff445 \
---postprocess_model_dir=/home/vessel_detection/data/models/attr/c34aa37 \
---historical1=S1B_IW_GRDH_1SDV_20211118T025212_20211118T025237_029636_03896D_FCBE.SAFE \
---historical2=S1B_IW_GRDH_1SDV_20211106T025212_20211106T025237_029461_03840B_6E73.SAFE \
---scene_id=S1B_IW_GRDH_1SDV_20211130T025211_20211130T025236_029811_038EEF_D350.SAFE \
---conf=.9 --nms_thresh=10 --save_crops=True --catalog=sentinel1
 
 """
 import json

diff --git a/src/config/config.yml b/src/config/config.yml
@@ -0,0 +1,8 @@
+main:
+  sentinel2_detector: "/home/vessel_detection/src/model_artifacts/multihead4/221e8ac9-sentinel2-swinv2-base-nohistory-satlas-weights"
+  sentinel2_postprocessor: "/home/vessel_detection/src/model_artifacts/attr/2feb47b-sentinel2-attr-resnet-tci-IR"
+  sentinel1_detector: "/home/vessel_detection/src/model_artifacts/frcnn_cmp2/3dff445"
+  sentinel1_postprocessor: "/home/vessel_detection/src/model_artifacts/attr/c34aa37"
+
+pipeline:
+  CLOUD_MASK_THRESHOLD: 0.5
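
Note on the new `src/config/config.yml`: the per-run `--detector_model_dir` and `--postprocess_model_dir` flags removed from the README now appear to be replaced by fixed paths in this file. The diff does not show how `src/main.py` reads it, so the following is only a minimal sketch of one way the file could be consumed. The names `parse_two_level_config` and `model_dirs` are hypothetical, and the tiny parser is an assumption made to keep the sketch dependency-free; it handles only the flat `section:` / `key: value` layout shown above and is not a general YAML parser.

```python
from pathlib import Path

# Verbatim copy of the config added in this commit (src/config/config.yml).
CONFIG_TEXT = """\
main:
  sentinel2_detector: "/home/vessel_detection/src/model_artifacts/multihead4/221e8ac9-sentinel2-swinv2-base-nohistory-satlas-weights"
  sentinel2_postprocessor: "/home/vessel_detection/src/model_artifacts/attr/2feb47b-sentinel2-attr-resnet-tci-IR"
  sentinel1_detector: "/home/vessel_detection/src/model_artifacts/frcnn_cmp2/3dff445"
  sentinel1_postprocessor: "/home/vessel_detection/src/model_artifacts/attr/c34aa37"

pipeline:
  CLOUD_MASK_THRESHOLD: 0.5
"""


def parse_two_level_config(text: str) -> dict:
    """Parse the two-level layout above into {section: {key: value}}.

    Hypothetical helper: unindented lines open a section, indented lines
    are `key: value` pairs inside the most recent section.
    """
    config: dict = {}
    section = None
    for raw in text.splitlines():
        stripped = raw.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = stripped.partition(":")
        value = value.strip()
        if not raw.startswith(" "):
            # Unindented line: start a new section such as `main` or `pipeline`.
            section = config.setdefault(key, {})
            continue
        if section is None:
            raise ValueError(f"key {key!r} appears before any section")
        if value.startswith('"') and value.endswith('"'):
            section[key] = value[1:-1]  # quoted values stay strings
        else:
            try:
                section[key] = float(value)  # e.g. CLOUD_MASK_THRESHOLD
            except ValueError:
                section[key] = value
    return config


def model_dirs(config: dict, catalog: str) -> tuple[Path, Path]:
    """Return (detector_dir, postprocessor_dir) for `sentinel1` or `sentinel2`."""
    main = config["main"]
    return Path(main[f"{catalog}_detector"]), Path(main[f"{catalog}_postprocessor"])


if __name__ == "__main__":
    cfg = parse_two_level_config(CONFIG_TEXT)
    detector, postprocessor = model_dirs(cfg, "sentinel1")
    print(detector, postprocessor, cfg["pipeline"]["CLOUD_MASK_THRESHOLD"])
```

In a real project a YAML library would normally replace the hand-written parser; the lookup by catalog name mirrors the `--catalog=sentinel1` / `--catalog=sentinel2` values used in the removed `detect.py` examples.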