DIUx-xView/SARFish

License: Apache 2.0

SARFish challenge repository

SARFish [1] is an imagery dataset for the purpose of training, validating and testing supervised machine learning models on the task of ship detection and classification. SARFish builds on the excellent work of the xView3-SAR dataset by expanding the imagery data to include Single Look Complex (SLC) as well as Ground Range Detected (GRD) imagery data taken directly from the European Space Agency (ESA) Copernicus Programme Open Access Hub Website.

Links:

How to use this repo

1. SARFish_Terms_and_Conditions.md

Read the terms and conditions for the:

  • use of the SARFish dataset
  • use of this repo
  • participation in the SARFish challenge.

2. Install gdal

$ <package-manager> install g++ python-devel python3-devel gdal gdal-devel 

2. venv_setup/venv_create.sh

Create the virtual environment with necessary packages.

Note: Requires Python > 3.8.

$ ./venv_setup/venv_create.sh -v venv -r ./venv_setup/venv_requirement.txt
$ source ./venv/bin/activate

Edit the file reference/environment.yaml to set the path to the root directory of the SARFish dataset:

SARFish_root_directory: /path/to/SARFish/root/ 
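
After editing the file, the configured path can be read back and checked. The following is a minimal sketch and not part of the repository: the helper name `read_sarfish_root` is hypothetical, and it assumes the key sits on a single `SARFish_root_directory: <path>` line, which it parses with the standard library instead of a YAML parser.

```python
from pathlib import Path


def read_sarfish_root(config_text: str) -> Path:
    """Return the SARFish_root_directory value from the config text.

    Assumption: the key appears on one line as `SARFish_root_directory: <path>`.
    """
    for line in config_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "SARFish_root_directory":
            # Drop surrounding whitespace and optional quotes.
            return Path(value.strip().strip("'\""))
    raise KeyError("SARFish_root_directory not found in config")


# Example with the placeholder value shown above.
root = read_sarfish_root("SARFish_root_directory: /path/to/SARFish/root/\n")
print(root, "exists" if root.is_dir() else "does not exist yet")
```

To check the real file, pass `Path("reference/environment.yaml").read_text()` to the helper.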

3. Download the data

The SARFish dataset is available for download at:

  • full SARFish dataset
  • sample SARFish dataset

dataset        coincident GRD, SLC products  compressed (GB)  uncompressed (GB)
SARFishSample  1                             4.3              8.2
SARFish        753                           3293             6468

Full SARFish dataset

Make sure you have at least enough storage space for the uncompressed dataset.

cd /path/to/large/storage/location
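
The storage check can be scripted before starting the download. This is a minimal sketch, not part of the repository: the helper name `has_enough_space` is hypothetical, the sizes are copied from the table above, and it assumes 1 GB means 10**9 bytes, which the table does not state.

```python
import shutil

# Uncompressed sizes in GB, taken from the table above.
UNCOMPRESSED_GB = {"SARFishSample": 8.2, "SARFish": 6468}


def has_enough_space(free_bytes: int, dataset: str) -> bool:
    """True if free_bytes covers the uncompressed dataset (assumes 1 GB = 10**9 bytes)."""
    return free_bytes >= UNCOMPRESSED_GB[dataset] * 10**9


# Check the current directory; replace "." with the storage location.
free = shutil.disk_usage(".").free
print(f"free: {free / 10**9:.1f} GB, fits SARFish: {has_enough_space(free, 'SARFish')}")
```

If the compressed archives are kept alongside the extracted products, the compressed size would need to be added to the requirement as well.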

Create or log in to a Hugging Face account.

In your python3 virtual environment, log in to the Hugging Face command line interface.

huggingface-cli login

Install git lfs

<package-manager> install git-lfs
git lfs install

Copy the access token from settings/Access Tokens in your Hugging Face account. Clone the dataset:

git clone https://huggingface.co/datasets/ConnorLuckettDSTG/SARFish

SARFish sample dataset

Replace the final command used for the full dataset with the following:

git clone https://huggingface.co/datasets/ConnorLuckettDSTG/SARFishSample

4. check_SARFish_md5sum.py

Check the MD5 sums of the downloaded SARFish products.

./check_SARFish_md5sum.py
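
For readers who want to verify a single archive by hand, the underlying check can be sketched as follows. This is an illustration only and does not reproduce check_SARFish_md5sum.py: the helper names `md5_of_file` and `matches` are hypothetical, and where the expected checksums are stored is not assumed here.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks (1 MiB by default)."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with an expected checksum, ignoring case and whitespace."""
    return md5_of_file(path) == expected_hex.strip().lower()


# Self-check on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"SARFish")
    print(matches(sample, hashlib.md5(b"SARFish").hexdigest()))
```

Reading in chunks keeps memory use flat, which matters for multi-gigabyte product archives.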

5. unzip_batch.sh

Unzip SARFish data products.

cd /path/to/SARFish/directory/GRD
unzip_batch.sh -p $(find './' -type f -name "*.SAFE.zip")

cd /path/to/SARFish/directory/SLC
unzip_batch.sh -p $(find './' -type f -name "*.SAFE.zip")
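
Where the shell script is not convenient, the same find-and-extract step can be approximated in Python. This is a minimal sketch under stated assumptions, not a replacement for unzip_batch.sh: the helper name `unzip_products` is hypothetical, it extracts sequentially next to each archive, and it does not reproduce the script's options.

```python
import zipfile
from pathlib import Path


def unzip_products(directory: Path) -> list[Path]:
    """Extract every *.SAFE.zip under directory next to its archive.

    Returns the archives that were extracted, in sorted order.
    """
    archives = sorted(directory.rglob("*.SAFE.zip"))
    for archive in archives:
        with zipfile.ZipFile(archive) as bundle:
            bundle.extractall(archive.parent)
    return archives
```

A call such as `unzip_products(Path("/path/to/SARFish/directory/GRD"))` would cover one product type; repeat it for the SLC directory.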

6. Download the SARFish labels

Download the training and validation label files for both the GRD and SLC products from the xView3 website.

Add the label files to their respective partitions in the dataset file structure:

SARFish/
├── GRD
│   ├── public
│   ├── train
│   │   └── GRD_train.csv
│   └── validation
│       └── GRD_validation.csv
└── SLC
    ├── public
    ├── train
    │   └── SLC_train.csv
    └── validation
        └── SLC_validation.csv
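
Whether the label files ended up in the right partitions can be checked with a short script. This is a minimal sketch, not part of the repository: the helper name `missing_labels` is hypothetical, and the relative paths are copied from the tree above.

```python
from pathlib import Path

# Label files and their locations relative to the dataset root, as in the tree above.
EXPECTED_LABELS = [
    Path("GRD/train/GRD_train.csv"),
    Path("GRD/validation/GRD_validation.csv"),
    Path("SLC/train/SLC_train.csv"),
    Path("SLC/validation/SLC_validation.csv"),
]


def missing_labels(root: Path) -> list[Path]:
    """Return the expected label files that are not present under root."""
    return [rel for rel in EXPECTED_LABELS if not (root / rel).is_file()]


# Placeholder root: substitute the real dataset location.
for rel in missing_labels(Path("/path/to/SARFish/root")):
    print("missing:", rel)
```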

7. Get started with the SARFish dataset and challenge: Run the SARFish_demo.ipynb notebook

python3 -m jupyter notebook reference/SARFish_demo.ipynb

The SARFish demo is a Jupyter notebook that helps users understand:

  • What is the SARFish Challenge?
  • What is the SARFish dataset?
  • How to access the SARFish dataset
  • Dataset structure
  • How to load and visualise the SARFish imagery data
  • How to load and visualise the SARFish groundtruth labels
  • How to train, validate and test the reference/baseline model
  • SARFish challenge prediction submission format
  • How to evaluate model performance using the SARFish metric

8. Train and evaluate the baseline reference model

A baseline reference implementation of a real-valued deep learning model is provided to introduce new users to training, validating and testing models on the SARFish SLC data, and to illustrate the use of the SARFish metrics. The reference model demonstrates how to use the SARFish metrics during training, testing and evaluation to inform the development of better-performing models.

The baseline uses the predefined PyTorch implementation of FCOS, chosen because it uses the concept of “centre-ness”, which we believe is applicable to the maritime objects in this dataset.
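
To make “centre-ness” concrete, the sketch below computes it for a single point using the definition from the FCOS paper: the square root of the product of the min/max ratios of the horizontal and vertical distances to the box sides. It is an illustration only and is not taken from SARModel.py; the function name `centreness` is hypothetical, and the distances are assumed non-negative with a non-degenerate box.

```python
import math


def centreness(left: float, right: float, top: float, bottom: float) -> float:
    """FCOS centre-ness of a point given its distances to the four box sides.

    Equals 1.0 at the box centre and falls towards 0.0 near an edge.
    """
    horizontal = min(left, right) / max(left, right)
    vertical = min(top, bottom) / max(top, bottom)
    return math.sqrt(horizontal * vertical)


print(centreness(5, 5, 5, 5))  # 1.0 at the centre
print(centreness(1, 9, 5, 5))  # lower near the left edge
```

In FCOS this score down-weights detections predicted far from an object's centre, which is plausibly useful for compact targets such as vessels.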

SARModel.py

The baseline can be trained and evaluated by sequentially running the following scripts:

1_create_tile.py generates the tiles used for training the baseline. Approximately 300 GB of storage is required.

./1_create_tile.py

The following trains, validates and tests the baseline model on a small subset of the SARFish dataset detailed in fold.csv.

./2_train.py
./3_test.py

4_evaluate.py calls the SARFish_metric.py script on the testing scenes to determine model performance on the SARFish challenge tasks.

./4_evaluate.py

The following scripts call the model over the entire public partition of the SARFish dataset to generate the submission/predictions uploaded to the Kaggle competition as the benchmark.

./5_inference.py
./6_concatenate_scene_predictions.py
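
Because each script depends on the output of the one before it, a small driver that stops at the first failure can be convenient. This is a minimal sketch, not part of the repository: the helper name `run_in_order` is hypothetical, and it assumes the scripts take no required arguments and can be launched with the active Python interpreter from their own directory.

```python
import subprocess
import sys

# Script names in the order given above.
PIPELINE = [
    "1_create_tile.py", "2_train.py", "3_test.py", "4_evaluate.py",
    "5_inference.py", "6_concatenate_scene_predictions.py",
]


def run_in_order(commands: list[list[str]]) -> int:
    """Run each command in turn; raise CalledProcessError on the first failure."""
    completed = 0
    for command in commands:
        subprocess.run(command, check=True)
        completed += 1
    return completed


# The commands are only built here, not executed.
commands = [[sys.executable, script] for script in PIPELINE]
```

Calling `run_in_order(commands)` would then execute the six steps in sequence; pass `commands[:4]` to stop after evaluation.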

9. reference/SARFish_metric.py

Evaluate the baseline model's performance on a scene from the validation partition using the metrics for the SARFish dataset.

./SARFish_metric.py \
    -p labels/reference_model/reference_predictions_SLC_validation_S1B_IW_SLC__1SDV_20200803T075720_20200803T075748_022756_02B2FF_E5D2.csv \
    -g /path/to/SARFish/root/SLC/validation/SLC_validation.csv \
    --sarfish_root_directory /path/to/SARFish/root/ \
    --product_type SLC \
    --xview3_slc_grd_correspondences labels/xView3_SLC_GRD_correspondences.csv \
    --shore_type xView3_shoreline \
    --no-evaluation-mode

[1] T.-T. Cao et al., “SARFish: Space-Based Maritime Surveillance Using Complex Synthetic Aperture Radar Imagery,” in 2022 International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2022, pp. 1–8.