09_Gym_utilities

Daniel Buscombe edited this page Feb 24, 2023 · 1 revision

This page describes the utility scripts provided with Segmentation Gym.

Model utilities

  • gen_fullmodel_from_h5.py is needed when model training ends early because of a Keras crash or computer failure and the fullmodel file is never written, but the model is still good enough to use.

  • gen_saved_model.py creates a model folder in TensorFlow SavedModel format for portable applications.

Data utilities

  • make_class_balanced_subset.py is for binary problems (NCLASSES=2) where there is severe class imbalance in the npz files created using make_nd_dataset.py. It examines a folder of those npz files and separates the files in which the minority class is, respectively, above and below a specified threshold. Use it to make less imbalanced datasets for model training.

  • preprocess_data.py launches the doodleverse_utils functions that make files either for use with a trained model (that is, in model inference mode) or for model training. The script is essentially a wrapper whose behaviour is controlled by a single parameter, data_type:

    if data_type==0:
        from doodleverse_utils import merge_nd_inputs4pred #make npz files of multiple image bands
    elif data_type==1:
        from doodleverse_utils import make_ndwi_4pred #make npz files containing NDWI imagery for model inference
    elif data_type==2:
        from doodleverse_utils import make_mndwi_4pred #make npz files containing MNDWI imagery for model inference           
    elif data_type==3:
        from doodleverse_utils import make_ndwi_dataset #make npz files containing NDWI imagery for model training
    elif data_type==4:
        from doodleverse_utils import make_mndwi_dataset #make npz files containing MNDWI imagery for model training
    elif data_type==5:
        from doodleverse_utils import vggjson2mask #converts VGG-JSON format annotation files into raster labels in jpeg format
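
The threshold-based split that make_class_balanced_subset.py is described as performing can be illustrated with a minimal sketch. This is not the script's actual code: the function names (minority_fraction, split_by_threshold), the assumption that class 1 is the minority class, and the use of plain nested lists in place of label arrays loaded from npz files are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # a 2D binary label mask, rows of 0/1 pixels


def minority_fraction(label: Mask, minority_class: int = 1) -> float:
    """Return the fraction of pixels that belong to the minority class.

    Assumption: the minority class is known in advance (class 1 by default).
    """
    flat = [pixel for row in label for pixel in row]
    if not flat:
        raise ValueError("label mask is empty")
    return sum(1 for pixel in flat if pixel == minority_class) / len(flat)


def split_by_threshold(
    labels: Dict[str, Mask], threshold: float, minority_class: int = 1
) -> Tuple[List[str], List[str]]:
    """Separate file names into (at or above threshold, below threshold)."""
    above: List[str] = []
    below: List[str] = []
    for name in sorted(labels):
        if minority_fraction(labels[name], minority_class) >= threshold:
            above.append(name)
        else:
            below.append(name)
    return above, below


# Hypothetical file names mapped to tiny 2x2 label masks
masks = {
    "a.npz": [[0, 0], [0, 1]],  # 1 of 4 pixels in the minority class
    "b.npz": [[0, 0], [0, 0]],  # no minority pixels
    "c.npz": [[1, 1], [0, 0]],  # 2 of 4 pixels in the minority class
}
above, below = split_by_threshold(masks, threshold=0.2)
print(above)  # ['a.npz', 'c.npz']
print(below)  # ['b.npz']
```

In this sketch, files at or above the threshold would form the less imbalanced training subset. How the real script treats files exactly at the threshold is not stated on this page.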

Troubleshooting utilities

  • test_gpus.py is a simple script that tests GPUs and multi-GPU training on a standard Keras dataset and model (MNIST handwritten digits). For troubleshooting only.

  • print_pred_labels.py reads a directory of npz files and a list of classes in text format, and writes out jpeg overlays of the image and label pairs. Use it to check the contents of a folder of npz files made by make_nd_dataset.py.

Post-processing utilities

  • pred2map.py should be run after 'seg_images_in_folder.py'. It creates a folder of 'predseg' outputs in geospatial formats for mapping and subsequent visualization in a GIS. This is only relevant if the jpegs used for model inference are geospatial imagery, each with an accompanying world file in wld format and an xml metadata file. It is specifically designed for jpeg, wld, and xml files that have been created using the GDAL program gdal_translate.

  • batch_pred2map.py does the same as pred2map.py but applies it to lists of folders, in batch mode.
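
To show why the wld world file matters for mapping, here is a minimal sketch of how the six world file coefficients relate pixel positions to map coordinates. It follows the standard world file convention (six values in the order A, D, B, E, C, F) and is not taken from pred2map.py, which may rely on a geospatial library instead. The function names and the example coefficient values are hypothetical.

```python
from typing import Tuple

Coeffs = Tuple[float, float, float, float, float, float]


def parse_world_file(text: str) -> Coeffs:
    """Parse the six world file values, in file order: A, D, B, E, C, F."""
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError(f"expected 6 values in a world file, got {len(values)}")
    a, d, b, e, c, f = values
    return a, d, b, e, c, f


def pixel_to_map(coeffs: Coeffs, col: float, row: float) -> Tuple[float, float]:
    """Map a pixel (col, row) to the map coordinates of that pixel's center.

    A and E are the pixel sizes in x and y (E is usually negative),
    D and B are rotation terms, and C and F are the map coordinates
    of the center of the upper-left pixel.
    """
    a, d, b, e, c, f = coeffs
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Hypothetical world file: 10 m pixels, no rotation
wld_text = "10.0\n0.0\n0.0\n-10.0\n500005.0\n4000995.0\n"
coeffs = parse_world_file(wld_text)
print(pixel_to_map(coeffs, 0, 0))  # (500005.0, 4000995.0)
print(pixel_to_map(coeffs, 2, 3))  # (500025.0, 4000965.0)
```

Without the world file, a 'predseg' jpeg has only pixel coordinates, which is why the geospatial outputs can only be produced when the wld file accompanies each image.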