allow regions to be string format #28

nnguyen622 · 2022-08-30T16:26:22Z

This PR allows the grid identifier in file names to use other formats. Instead of only an integer six characters long, it accepts string format, integer format, etc.

scripts/config.yml - remove the regions part and add new features
scripts/create_dataset.py - in persist_dataset, change regions into a list of hash polygons instead of CONFIG['regions']
src/cultionet/scripts/cultionet.py - in persist_dataset, change regions into a list of hash polygons instead of CONFIG['regions']
src/cultionet/utils/model_preprocessing.py - in class TrainInputs, __attrs_post_init__(self), replace the region_list built from a range of numbers with region_list = self.regions
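
To illustrate the description above, here is a minimal sketch of accepting either kind of region identifier when building a file name. The helper name `format_region_id` and the zero-padding of integers to six characters are assumptions made for illustration; they are not taken from cultionet's code.

```python
from typing import Union


def format_region_id(region: Union[int, str], width: int = 6) -> str:
    """Return a region id as the string that would appear in a file name.

    Hypothetical helper: integers are zero-padded to ``width`` characters
    (assumed from the six-character integer ids mentioned in the PR
    description), while string ids are passed through unchanged.
    """
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(region, bool):
        raise TypeError('region must be an int or str, not bool')
    if isinstance(region, int):
        return f'{region:0{width}d}'
    if isinstance(region, str):
        return region
    raise TypeError(f'Unsupported region id type: {type(region).__name__}')


print(format_region_id(1))        # integer id, zero-padded
print(format_region_id('9q8yy'))  # string id, unchanged
```

Keeping the conversion in one place means the rest of the pipeline can treat region ids as opaque strings.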

with newer versions of pytorch-lightning, some parameters for ModelCheckpoint become invalid

jgrss · 2022-09-08T20:21:00Z

src/cultionet/model.py

@@ -99,7 +99,7 @@ def fit(
        mode='min',
        monitor='loss',
        every_n_train_steps=0,
-        every_n_val_epochs=1
+        every_n_val_epochs=1 # this parameter will become invalid with newer version of pytorch-lightning


Yeah, should be a follow-up PR to change this to every_n_epochs and upgrade pytorch-lightning.
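
Until that follow-up lands, one possible way to tolerate both parameter names is to check which keyword the installed callback class accepts. The sketch below uses only `inspect` and two stand-in classes; `epoch_interval_kwarg`, `OldCheckpoint`, and `NewCheckpoint` are hypothetical names, and the real `ModelCheckpoint` is not imported here.

```python
import inspect


def epoch_interval_kwarg(callback_cls, value: int = 1) -> dict:
    """Return the epoch-interval keyword that ``callback_cls`` accepts.

    Prefers ``every_n_epochs`` and falls back to ``every_n_val_epochs``.
    Returns an empty dict if neither name is an explicit parameter
    (for example when the class only takes ``**kwargs``).
    """
    params = inspect.signature(callback_cls.__init__).parameters
    for name in ('every_n_epochs', 'every_n_val_epochs'):
        if name in params:
            return {name: value}
    return {}


# Stand-ins for an older and a newer checkpoint callback signature.
class OldCheckpoint:
    def __init__(self, monitor=None, mode='min', every_n_val_epochs=None):
        self.every_n_val_epochs = every_n_val_epochs


class NewCheckpoint:
    def __init__(self, monitor=None, mode='min', every_n_epochs=None):
        self.every_n_epochs = every_n_epochs


print(epoch_interval_kwarg(OldCheckpoint))
print(epoch_interval_kwarg(NewCheckpoint))
```

With the real class, the result would be unpacked into the constructor call alongside `monitor` and `mode`. Pinning the pytorch-lightning version and renaming the parameter, as suggested above, remains the simpler long-term fix.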

jgrss · 2022-09-08T20:23:41Z

src/cultionet/scripts/config.yml

-regions:
-  - 1
-  - 1
+  - green


@nnguyen622 can you explain this one? This would impact the example data, e.g.

@jgrss, in this example I produced data with the evi2, gcvi, kndvi, and green indices. I removed the regions part because I hard-coded the geo_id list (as seen below).
However, I realized that config.yml can be changed whenever users generate training files. As such, maybe we don't need to change config.yml in this PR?

Ahh, that's my fault. I completely missed that the 'regions' was removed. Let's see if we can leave the config 'regions' as an option and add your request from the CLI. What do you think?

@jgrss , I should have provided more context. I think your suggested code will allow regions to still be in the yml file 👍
I will update once I test out the code generating training files

jgrss · 2022-09-08T20:24:47Z

src/cultionet/scripts/cultionet.py

 import yaml


 logger = logging.getLogger(__name__)

+geo_id_data = pd.read_csv('~/geo_id_grid_list.csv')


Hard-coding this here could be problematic. Checking to see how it's used down below....

jgrss · 2022-09-08T20:31:30Z

src/cultionet/scripts/cultionet.py

@@ -213,7 +217,7 @@ def persist_dataset(args):
    ref_res_lists = [args.ref_res]

    inputs = model_preprocessing.TrainInputs(
-        regions=config['regions'],


I like the flexibility of being able to use a different source for the region ids. However, pinning a file and opening as a global overrides the existing configuration that works with the example data and with other ongoing projects.

My suggestion would be to add a CLI parameter, or parameters, that would allow us to pass a region id file that would override the configuration integer range.

I think something like the following:

    if hasattr(args, 'region_id_file'):
        if not Path(args.region_id_file).is_file():
            raise IOError('The id file does not exist')
        id_data = pd.read_csv(args.region_id_file)
        # How do you feel about an expected column name of 'id'?
        regions = id_data['id'].unique().tolist()
        # Otherwise,
        # regions = id_data[args.id_column].unique().tolist()
    else:
        regions = list(range(config['regions'][0], config['regions'][1]+1))

    inputs = model_preprocessing.TrainInputs(
        ....
        regions=regions,
    )

@jgrss, I agree. Hard-coding the geo_id file can be problematic.
I really like your idea of passing the file in through the CLI. I will implement it following your suggestion 👍
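
A minimal, self-contained sketch of that CLI override, assuming an argparse-based parser. The option names `--region-id-file` and `--id-column` and the function `resolve_regions` are hypothetical. When argparse defines an option, the attribute always exists on the parsed namespace, so this sketch tests for `None` rather than using `hasattr`. It also reads the CSV with the standard `csv` module instead of pandas, purely to stay dependency-free.

```python
import argparse
import csv
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Hypothetical options; None means "use the range from the config".
    parser.add_argument('--region-id-file', default=None)
    parser.add_argument('--id-column', default='id')
    return parser


def resolve_regions(args: argparse.Namespace, config: dict) -> list:
    """Return region ids from the id file if given, else from the config range."""
    if args.region_id_file is not None:
        id_path = Path(args.region_id_file)
        if not id_path.is_file():
            raise IOError('The id file does not exist')
        with open(id_path, newline='') as src:
            ids = [row[args.id_column] for row in csv.DictReader(src)]
        # dict.fromkeys drops duplicates while keeping first-seen order.
        return list(dict.fromkeys(ids))
    # Existing behavior: an inclusive integer range [start, end].
    start, end = config['regions']
    return list(range(start, end + 1))


args = build_parser().parse_args([])
print(resolve_regions(args, {'regions': [1, 3]}))
```

Because the file is optional, the example data and other projects that rely on the integer range in the configuration keep working unchanged.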

jgrss · 2022-09-08T20:35:09Z

src/cultionet/utils/model_preprocessing.py

-        start_region = self.regions[0]
-        end_region = self.regions[1]
-        region_list = list(range(start_region, end_region+1))
+        region_list = self.regions


jgrss

I like the idea of additional methods to pass the region ids as long as it doesn't remove the existing method. See comments for suggestions.

nnguyen622 · 2022-09-08T20:41:24Z

I like the idea of additional methods to pass the region ids as long as it doesn't remove the existing method. See comments for suggestions.

Thanks @jgrss , I will make changes accordingly 👍

remove hard-coded geo_id list pass id list through a file location in CLI

nnguyen622 · 2022-09-09T13:52:50Z

src/cultionet/scripts/config.yml


+region_id_file:


@jgrss, what do you think about adding region_id_file here in the config file instead of adding an additional argument to the CLI?

@nnguyen622 that could work with a default of !!null. And then you'll add a check for both of those options?

Hi @jgrss, good point. I will add a check for both of these options. 👍
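
The check discussed here could look roughly like the following. It treats the configuration as an already-loaded dictionary in which `region_id_file` defaults to null (`None` in Python). The function name `check_region_options` and the error messages are assumptions, not the merged implementation.

```python
def check_region_options(config: dict) -> str:
    """Return which region source is configured: 'file' or 'range'.

    Raises ValueError if both or neither of 'regions' and
    'region_id_file' are set in the configuration.
    """
    has_file = config.get('region_id_file') is not None
    has_range = config.get('regions') is not None
    if has_file and has_range:
        raise ValueError(
            "Set only one of 'regions' or 'region_id_file', not both."
        )
    if not has_file and not has_range:
        raise ValueError(
            "Set one of 'regions' or 'region_id_file' in the config."
        )
    return 'file' if has_file else 'range'


print(check_region_options({'regions': [1, 10], 'region_id_file': None}))
print(check_region_options({'regions': None, 'region_id_file': 'ids.csv'}))
```

Failing early on an ambiguous configuration avoids silently preferring one source over the other.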

add check to make sure only one option of region is submitted

jgrss · 2022-09-13T19:20:42Z

src/cultionet/scripts/cultionet.py

+
+
+    if region_as_file:
+      file_path = config['region_id_file']


Looks good -- one very minor comment. It looks like the indentation is 2 spaces instead of 4? We should probably just add an automatic formatter like Black.

@jgrss, good catch. Let me fix that right now.
Next time I will run Black before committing 👍

Don't worry about running Black. We can add it as a pre-commit.

add null value to region_id_file remove green index from image_vis, users can alter their own config.yml file

add comments for each parameter

allow regions to be string format

c67dd7a

nnguyen622 marked this pull request as draft August 30, 2022 16:26

nnguyen622 added 3 commits August 30, 2022 12:30

add additional features

b7755b7

update regions to be input geo_id_list

dddc9e3

add comments

f66a248

with newer versions of pytorch-lightning, some parameters for ModelCheckpoint become invalid

nnguyen622 marked this pull request as ready for review September 7, 2022 18:27

jgrss self-requested a review September 7, 2022 18:31

jgrss reviewed Sep 8, 2022

jgrss requested changes Sep 8, 2022

add region_id_file as added CLI parameter

3c49edb

remove hard-coded geo_id list pass id list through a file location in CLI

nnguyen622 marked this pull request as draft September 8, 2022 21:03

nnguyen622 added 3 commits September 9, 2022 08:20

add region_id_file argument to config file

e442de6

change region file CLI into config element

238158c

add in pandas to read csv file

06b5f90

nnguyen622 commented Sep 9, 2022

nnguyen622 requested a review from jgrss September 12, 2022 22:24

nnguyen622 added 3 commits September 13, 2022 09:05

add check for region input

747b009

add check to make sure only one option of region is submitted

fix typo

5611547

change regions to list

4753f39

jgrss reviewed Sep 13, 2022

jgrss approved these changes Sep 13, 2022

fix indentation

e9247af

jgrss assigned nnguyen622 Sep 13, 2022

jgrss added the enhancement New feature or request label Sep 13, 2022

nnguyen622 marked this pull request as ready for review September 14, 2022 15:41

nnguyen622 added 2 commits September 14, 2022 11:47

reformat config

60a58a5

add null value to region_id_file remove green index from image_vis, users can alter their own config.yml file

add in comments

846159f

add comments for each parameter

jgrss merged commit e4e0b13 into jgrss:main Sep 14, 2022
