
jgrss/v170 #49

Merged: 312 commits merged into main on Mar 9, 2023

Conversation

@jgrss (Owner) commented Mar 3, 2023

This PR introduces changes toward v170.

# `num_classes` includes background
'count': 3 + num_classes - 1,
'dtype': 'uint16',
'blockxsize': 64 if 64 < src.gw.ncols else src.gw.ncols,

Collaborator suggestion:

min(64, src.gw.ncols)

'sharing': False,
'compress': compression
}
profile['tiled'] = True if max(profile['blockxsize'], profile['blockysize']) >= 16 else False

Collaborator:

The True if ... else False part isn't strictly needed. If clarity is the goal, it's probably better to wrap this expression in a function:

def is_tiled(blockxsize, blockysize, tile_limit=16):
    return max(blockxsize, blockysize) >= tile_limit
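
A small usage sketch tying this together with the min(64, src.gw.ncols) suggestion above (ncols and nrows are placeholders standing in for src.gw.ncols / src.gw.nrows, and blockysize is assumed by analogy with the snippet):

ncols, nrows = 50, 200             # placeholders for src.gw.ncols, src.gw.nrows
profile = {
    'blockxsize': min(64, ncols),  # 50: capped by the raster width
    'blockysize': min(64, nrows),  # 64
}
profile['tiled'] = is_tiled(profile['blockxsize'], profile['blockysize'])  # True (64 >= 16)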

)
rheight = pad_slice2d[0].stop - pad_slice2d[0].start
rwidth = pad_slice2d[1].stop - pad_slice2d[1].start
def reshaper(x: torch.Tensor, channel_dims: int) -> torch.Tensor:

Collaborator:

It might be good to introduce an autoformatter like black.

Owner Author (@jgrss):

good idea

train_data = joblib.load(train_path)
if train_data.train_id == train_id:
    batch_stored = True
aug_method = AugmenterMapping[aug.replace('-', '_')].value

Collaborator:

The augmenter names should be consistent: just use _ or - everywhere, or better yet, use enums.
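
For illustration, a hedged sketch of an enum-based mapping (the member names here are made up, not the project's actual augmenter names):

import enum


class AugmenterName(enum.Enum):
    # Hypothetical members; the values hold the user-facing hyphenated spelling
    ts_warp = 'ts-warp'
    gaussian_noise = 'gaussian-noise'


# Look up by value, so no aug.replace('-', '_') is needed at the call site
aug_method = AugmenterName('ts-warp')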

# Clip the edges to the current grid
try:
    grid_edges = gpd.clip(df_edges, row.geometry)
except:

Collaborator:

You may want to explicitly catch topology errors; otherwise you may emit misleading warnings when you run into other kinds of errors.
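
A minimal sketch of what that could look like, assuming Shapely 2.x (where shapely.errors.GEOSException covers GEOS/topology failures; older Shapely has shapely.errors.TopologicalError). df_edges and row.geometry correspond to the objects in the snippet above:

import warnings

import geopandas as gpd
from shapely.errors import GEOSException


def clip_to_grid(df_edges: gpd.GeoDataFrame, geometry) -> gpd.GeoDataFrame:
    """Clip edges to a grid geometry, warning only on topology failures."""
    try:
        return gpd.clip(df_edges, geometry)
    except GEOSException:
        warnings.warn('Could not clip edges to the grid (invalid geometry).')
        return df_edges.iloc[0:0]  # empty frame with the same schema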

window_pad
) for window, window_pad in window_chunk
)
pbar_total.update(len(window_chunk))


def create_dataset(

Collaborator:

This thing is getting very long. Probably a good idea to find logical chunks to wrap in functions. See here for some guidelines on how to tell when functions are getting too long: https://stackoverflow.com/questions/475675/when-is-a-function-too-long.

qt = QuadTree(df_unique_locations, force_square=False)
qt.split_recursive(max_samples=1)
n_val = int(val_frac * len(df_unique_locations.index))
df_val_sample = qt.sample(n=n_val)

Collaborator:

What does this do? Something regarding the spatial distribution of the validation set, but it's not totally clear to me.

Owner Author (@jgrss):

It's the spatially-balanced splitting method (see https://github.com/jgrss/geosample). I've added comments on each step to help clarify this.
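
For reference, a commented restatement of the snippet above (QuadTree comes from jgrss/geosample; df_unique_locations and val_frac are the variables already in scope there):

# 1) Build a quadtree over the unique sample locations.
qt = QuadTree(df_unique_locations, force_square=False)

# 2) Split recursively until each leaf holds at most one sample, so the tree
#    mirrors the spatial density of the points.
qt.split_recursive(max_samples=1)

# 3) Sample the validation set across the leaves, which spreads the validation
#    points evenly in space instead of letting them cluster.
n_val = int(val_frac * len(df_unique_locations.index))
df_val_sample = qt.sample(n=n_val)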

@@ -134,21 +248,61 @@ def tanimoto(y: torch.Tensor, yhat: torch.Tensor) -> torch.Tensor:
class TanimotoDistLoss(torch.nn.Module):

Collaborator:

Same note as above, this probably needs tests.
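
For instance, a minimal smoke test could look roughly like this (the import path and the (yhat, y) call signature are assumptions and would need to match the actual class):

import torch

from cultionet.losses import TanimotoDistLoss  # assumed import path


def test_tanimoto_dist_loss_returns_finite_scalar():
    torch.manual_seed(0)
    yhat = torch.rand(4, 2, 8, 8)                  # predicted probabilities
    y = torch.randint(0, 2, (4, 2, 8, 8)).float()  # binary targets
    loss = TanimotoDistLoss()(yhat, y)
    assert loss.ndim == 0
    assert torch.isfinite(loss)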

Comment on lines 321 to 326
# train_ds, val_ds = dataset.split_train_val_by_partition(
# spatial_partitions=spatial_partitions,
# partition_column=partition_column,
# val_frac=val_frac,
# partition_name=partition_name
# )

Collaborator:

can we remove this commented-out block?

Comment on lines 17 to 21
# assert dims in (2, 3)
# if dims == 2:
# ones = torch.ones((1, channels, 1, 1))
# else:
# ones = torch.ones((1, channels, 1, 1, 1))

Collaborator:

delete?

import enum


class ModelTypes(enum.Enum):

Collaborator:

This enum class might be useful here, or elsewhere in the codebase: https://docs.python.org/3/library/enum.html#enum.StrEnum

Works nicely when you want to map enum values to strings of their names.
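
A sketch of the pattern (enum.StrEnum requires Python 3.11+; the member names here are illustrative):

import enum


class ModelTypes(enum.StrEnum):
    # auto() gives each member its lowercased member name as the value
    UNET = enum.auto()     # 'unet'
    RESUNET = enum.auto()  # 'resunet'


assert ModelTypes.UNET == 'unet'  # members compare equal to plain strings
assert str(ModelTypes.RESUNET) == 'resunet'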

@jgrss mentioned this pull request on Mar 8, 2023
Comment on lines +30 to +33

class SetActivation(torch.nn.Module):
def __init__(
self,

Collaborator:

Nice small class! Now you can just run the following instead of reaching for torch.nn.{activation_type} directly:

SetActivation(activation_type, channels=out_channels, dims=2)

Comment on lines +460 to +468

def var(self, unbiased=True):
    mean = self.mean()[:, None]
    return self.integrate(
        lambda x: (x - mean).pow(2)
    ) / (self.count - (1 if unbiased else 0))

def std(self, unbiased=True):
    return self.var(unbiased=unbiased).sqrt()

Collaborator:

This is interesting! It's similar to torch.var.
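
For comparison, torch.var's unbiased flag applies the same Bessel correction (divide by N - 1 vs. N) that the unbiased argument above toggles; a tiny standalone sketch:

import torch

x = torch.rand(100, 3)

var_unbiased = torch.var(x, dim=0, unbiased=True)   # divides by N - 1
var_biased = torch.var(x, dim=0, unbiased=False)    # divides by N
std_unbiased = torch.std(x, dim=0, unbiased=True)

assert torch.allclose(std_unbiased, var_unbiased.sqrt())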

Comment on lines 824 to 826
if len(ts_list) <= 1:
    pbar.update(1)
    pbar.set_description('TS too short')

Collaborator:

so hypothetically, if I only have 20210101.tif and 20220101.tif in my features, this function will continue?

Comment on lines +1071 to +1073
def generate_model_graph(args):
from cultionet.models.convstar import StarRNN
from cultionet.models.nunet import ResUNet3Psi

Collaborator:

Was there a reason we import inside a function for this one? Is it because the imports are only relevant for this function and take a long time to import?

Owner Author (@jgrss) commented Mar 9, 2023:

> Is it because the imports are only relevant for this function

Yes -- this function only serves the purpose of creating .onnx files for viewing graphs. It's called in isolation, and there's no need for the imports elsewhere.

Comment on lines 1375 to 1376
with open(
    project_path / f"{args.process}_command_{now.strftime('%Y%m%d-%H%M')}.json", mode='w'

Collaborator:

👍

out_3_1=out_3_1,
out_2_2=out_2_2,
out_1_3=out_1_3
)

return out

Collaborator:

This file does seem verbose, with a lot of repeated blocks. Attempting to reduce repetition could be the subject of a future PR.

@@ -0,0 +1,798 @@
"""

Collaborator:

Lots of formatting stuff in this file. Linters will likely get it all.

jgrss and others added 8 commits March 9, 2023 11:01
* add flake8 to precommit

* add black and flake8 to pyproject.toml

* change flake8 repo

* add install test extras

* simplify checks

* black formatting

* created CONTRIBUTING file

* format

* format

* format

* sync names

* format

* format

* format

* remove unused function

* format

* moved line

* format

* format

* format

* format

* format

* format

* format

* format

* format

* format

* format

* format

* format

* format

* format

* format

* use StrEnum

* remove StrEnum

* add version comment

* format

* format

* fix: jgrss/refine (#58)

* format

* test

* add missing reshape

* remove edge temperature

* removed edge refine layer

* format

* format

* remove sigmoid

* remove temperature override

* increase lr

* fixed arg name

* add bash scripts

* update docstring

* fix: jgrss/refine (#59)

* format

* fix arg

* use all data for refinement

* add random sampler for refinement

* format

* format

* remove old arg

* format
@jgrss merged commit 4135817 into main on Mar 9, 2023
@jgrss deleted the jgrss/topo_v2_time_rnn_nobal_batch_act_dgm_test branch on March 9, 2023 18:09
Labels: bug, documentation, enhancement