Architecture

This section gives an overview of the code workflow and briefly describes each sub-folder of the model architecture.

Figure 1. simple sketch of the training workflow:

Figure 2. simple sketch of the evaluation workflow:

Modules

`atmorep/core`

This is the main folder: it contains the scripts that launch the training (train.py) and evaluation (evaluate.py) runs.

train.py is a wrapper around trainer.py
evaluate.py is a wrapper around evaluator.py
atmorep_model.py is called by either trainer or evaluator. The AtmoRepData class triggers the data loading, while the AtmoRep class defines the AtmoRep model architecture.

`atmorep/config`

It contains a single file with the paths to, e.g., the data, model and results folders, plus some additional information on the normalisation settings. In principle, you should not need to modify this config file.

`atmorep/datasets`

It contains the low level classes to handle the data I/O.

multifield_data_sampler.py: the most important file of the folder. It contains the functions to load and shuffle the data, and to store the lat-lon-time information of the source cube and the masked tokens.
data_writer.py contains the functions to write the zarr output.
file_io.py contains the functions to load the data directly from grib, netcdf and bin files.
normalizer_local.py and normalizer_global.py contain the classes to normalise and denormalise the data on the fly during pre- and post-processing.
- "local" = the data are normalised with mean 0 and standard deviation 1 computed separately for each grid point in lat-lon. These normalisation factors have been computed separately for each year/month/field/level combination.
- "global" = the data are normalised with mean 0 and standard deviation 1 using a single value averaged over all lat-lon points. These normalisation factors have been computed separately for each year/month/field/level combination.
Deprecated: data_loader.py, dynamic_field_level.py, static_field.py
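
The difference between the "local" and "global" normalisation described above can be illustrated with a minimal NumPy sketch. This is not the code in normalizer_local.py or normalizer_global.py: the function names, the (time, lat, lon) array layout and the synthetic input are assumptions made for illustration, and the statistics are computed from the sample itself rather than read from precomputed per year/month/field/level factors.

```python
import numpy as np

def normalize_local(data):
    """Normalise each lat-lon grid point with its own mean and std.

    data: array of shape (time, lat, lon) (assumed layout).
    """
    mean = data.mean(axis=0, keepdims=True)  # shape (1, lat, lon)
    std = data.std(axis=0, keepdims=True)    # shape (1, lat, lon)
    return (data - mean) / std, mean, std

def normalize_global(data):
    """Normalise with one mean and one std shared by all grid points."""
    mean = data.mean()
    std = data.std()
    return (data - mean) / std, mean, std

def denormalize(normed, mean, std):
    """Invert either normalisation (broadcasting handles both cases)."""
    return normed * std + mean

# Synthetic stand-in for one field/level over one month: 24 time steps on a 4x8 grid.
rng = np.random.default_rng(0)
field = rng.normal(loc=280.0, scale=5.0, size=(24, 4, 8))

local_normed, local_mean, local_std = normalize_local(field)
global_normed, global_mean, global_std = normalize_global(field)
```

With the local variant every grid point ends up with zero mean and unit standard deviation on its own, whereas the global variant only guarantees this for the field as a whole. In both cases `denormalize` recovers the original values, which is what the post-processing step relies on.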

`atmorep/training`

It contains a single file, bert.py, which implements the different masking strategies. The main wrapper function is prepare_batch_BERT_multifield. The functions in this file set the indices of the masked tokens according to the chosen strategy and propagate them back to multifield_data_sampler.py.
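
As a rough illustration of what "setting the indices of the masked tokens" means, the sketch below implements plain random BERT-style masking with NumPy. It does not reproduce prepare_batch_BERT_multifield or any of the strategies in bert.py: the function names, the masking rate, the mask value and the token layout are all hypothetical.

```python
import numpy as np

def sample_masked_token_idxs(num_tokens, mask_rate, rng):
    """Pick a random subset of token indices to mask (no repeats)."""
    num_masked = max(1, int(round(mask_rate * num_tokens)))
    idxs = rng.choice(num_tokens, size=num_masked, replace=False)
    return np.sort(idxs)

def apply_mask(tokens, masked_idxs, mask_value=0.0):
    """Return a copy of the tokens with the selected ones overwritten."""
    masked = tokens.copy()
    masked[masked_idxs] = mask_value
    return masked

# 12 tokens, each flattened to 6 values (arbitrary sizes for the example).
rng = np.random.default_rng(0)
tokens = rng.normal(size=(12, 6))

masked_idxs = sample_masked_token_idxs(len(tokens), mask_rate=0.25, rng=rng)
masked_tokens = apply_mask(tokens, masked_idxs)
```

Returning the indices separately from the masked data mirrors the description above: the sampler needs to know which tokens were masked so that it can store their lat-lon-time information alongside the source cube.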

`atmorep/transformer`

This folder contains the definition of the different building blocks of the AtmoRep Transformers.

transformer.py contains a wrapper for the classes below.
transformer_encoder.py and transformer_decoder.py contain the definition of the encoder and decoder networks and forward passes.
transformer_attention.py contains the definition of the attention mechanism for the transformers defined in the two files above.
tail_ensemble.py contains the functions related to the tail networks which form the ensemble set of the AtmoRep prediction.

`atmorep/utils`

This folder contains miscellaneous helper functions used by the other classes.

utils.py is a generic utilities file with assorted helper functions.

The AtmoRep Collaboration - last update: April 2024

Website: www.atmorep.org
arXiv: link
analysis: analysis code
