Environmental data processing for www.emma.eco

AdamWilsonLab/emma_envdata

EMMA

Ecological Monitoring and Management Application (EMMA)

This is the core repository for environmental data processing in the Ecological Monitoring and Management Application (EMMA.io).

EMMA workflow overview

The EMMA workflow consists of four modules, each with a separate GitHub repo:

  1. The Environmental Data module (https://github.com/AdamWilsonLab/emma_envdata)
  2. The Modelling and Change Detection module (https://github.com/AdamWilsonLab/emma_model)
  3. The Change Classification module (https://github.com/AdamWilsonLab/emma_change_classification)
  4. The Reporting module (https://github.com/AdamWilsonLab/emma_report)

File structure

The most important files are:

├── _targets.R (data processing workflow and dependency management)
├── R/
├──── [data_processing_functions]
├── data/
├──── manual_download (files behind firewalls that must be manually downloaded)
├──── raw_data (raw data files downloaded by the workflow)
├──── processed_data (data processed and stored by the workflow)
└── Readme.Rmd (this file)

Files generated by the workflow are stored in the targets-runs branch. The final output of the workflow is a set of parquet files published as assets of the GitHub release tagged “current”.

Workflow structure

Workflow Notes

Runtime and frequency

GitHub places constraints on Actions, including memory and run-time limits. A run that exceeds the time limit loses all of its progress, so a few key parameters can be adjusted to keep each run short. In the _targets.R file, the argument “max_layers” sets the maximum number of layers that rgee will attempt to download in one action run. When initially setting up the repo, it may be necessary to lower this value and run the targets workflow more frequently (by adjusting the cron parameters in targets.yaml). GitHub also rate-limits requests, so the file release_data.R includes a call to Sys.sleep that can be adjusted to slow down or speed up pushing data to a GitHub release.
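
As a rough illustration of these two controls, the sketch below caps the number of layers handled per run and pauses between uploads. It is not the code in _targets.R or release_data.R: the helper names (`next_batch`, `push_with_pause`), the `upload_fun` argument, and the layer names are hypothetical, and only base R is used.

```r
# Hypothetical helper: pick at most `max_layers` layers that still need downloading.
next_batch <- function(available, done, max_layers) {
  pending <- setdiff(available, done)   # layers not yet downloaded
  head(pending, max_layers)             # cap the work done in a single run
}

# Hypothetical helper: apply `upload_fun` to each file, sleeping between calls
# so that requests stay under a rate limit.
push_with_pause <- function(files, upload_fun, pause_sec = 1) {
  results <- vector("list", length(files))
  for (i in seq_along(files)) {
    results[[i]] <- upload_fun(files[[i]])
    if (i < length(files)) Sys.sleep(pause_sec)  # no pause needed after the last file
  }
  results
}

# Example with made-up layer names and a stand-in for the real upload step.
layers <- c("2020_01", "2020_02", "2020_03", "2020_04")
batch <- next_batch(layers, done = "2020_01", max_layers = 2)
uploaded <- push_with_pause(batch, function(f) paste0("pushed:", f), pause_sec = 0.1)
```

Lowering `max_layers` shortens each run at the cost of needing more runs to catch up, which is why it is paired with a more frequent cron schedule.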

Data notes

* MODIS NDVI values have been transformed to save space. To restore them to the original values (between -1 and 1), divide by 100 and subtract 1:
* Untransformed NDVI = (transformed NDVI / 100) - 1
* Raw MODIS fire dates (tag: raw_fire_modis): values are either 0 (no fire) or the day of the year on which a fire was observed (1 through 366).
* Processed MODIS fire dates (tag: processed_fire_dates): values are either 0 (no fire) or the UNIX date (days since 1 Jan. 1970) on which a fire was observed.
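
The conversions described in these notes can be sketched in base R as follows. The function names are hypothetical and not part of this repo. The `year` argument for raw fire dates is an assumption, since the day-of-year value alone does not identify the year.

```r
# Undo the space-saving NDVI transform: original = transformed / 100 - 1
restore_ndvi <- function(x) x / 100 - 1

# Processed fire dates: 0 means no fire (returned as NA);
# any other value is the number of days since 1970-01-01.
fire_days_to_date <- function(days) {
  out <- rep(as.Date(NA), length(days))
  burned <- !is.na(days) & days > 0
  out[burned] <- as.Date(days[burned], origin = "1970-01-01")
  out
}

# Raw fire dates: 0 means no fire (returned as NA); any other value is the
# day of the year (1 through 366). The year must be supplied by the caller
# (assumption: it is known from the date of the layer being read).
fire_doy_to_date <- function(doy, year) {
  out <- rep(as.Date(NA), length(doy))
  burned <- !is.na(doy) & doy > 0
  out[burned] <- as.Date(doy[burned] - 1, origin = paste0(year, "-01-01"))
  out
}

restore_ndvi(c(0, 100, 200))        # -1 0 1
fire_days_to_date(c(0, 18993))      # NA "2022-01-01"
fire_doy_to_date(c(0, 60), 2020)    # NA "2020-02-29"
```

Returning NA for the value 0 keeps "no fire" from being mistaken for a fire on 1 Jan. 1970.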

Data layers

  • Continuous Heat-Insolation Load Index (CHILI; ALOS)
  • Multi-Scale Topographic Position Index (MTPI; compares elevation to surroundings; ALOS)
  • Topographic Diversity (represents the variety of temperature and moisture conditions; ALOS)
  • Mean annual air temperature (CHELSA Bio1)
  • Mean diurnal air temperature range (CHELSA Bio2)
  • Isothermality (ratio of diurnal variation to annual variation in temperatures; CHELSA Bio3)
  • Temperature seasonality (std. deviation of the monthly mean temperatures; CHELSA Bio4)
  • Mean daily maximum air temperature of the warmest month (CHELSA Bio5)
  • Mean daily minimum air temperature of the coldest month (CHELSA Bio6)
  • Annual range of air temperature (CHELSA Bio7)
  • Mean daily mean air temperatures of the wettest quarter (CHELSA Bio8)
  • Mean daily mean air temperatures of the driest quarter (CHELSA Bio9)
  • Mean daily mean air temperatures of the warmest quarter (CHELSA Bio10)
  • Mean daily mean air temperatures of the coldest quarter (CHELSA Bio11)
  • Annual precipitation amount (CHELSA Bio12)
  • Precipitation amount of the wettest month (CHELSA Bio13)
  • Precipitation amount of the driest month (CHELSA Bio14)
  • Precipitation seasonality (CV of the monthly precipitation estimates; CHELSA Bio15)
  • Mean monthly precipitation amount of the wettest quarter (CHELSA Bio16)
  • Mean monthly precipitation amount of the driest quarter (CHELSA Bio17)
  • Mean monthly precipitation amount of the warmest quarter (CHELSA Bio18)
  • Mean monthly precipitation amount of the coldest quarter (CHELSA Bio19)
  • January (mid dry season) precipitation (CHELSA)
  • July (mid wet season) precipitation (CHELSA)
  • Interannual variability in cloud frequency (MODCF)
  • Intraannual variability in cloud frequency (MODCF)
  • Mean annual cloud frequency (MODCF)
  • Cloud frequency seasonality concentration (sum(monthly concentration vectors); MODCF)
  • Elevation (NASA DEM)
  • Soil electrical conductivity (soil_EC_mS_m, Cramer et al. 2019)
  • Soil extractable K (soil_Ext_K_cmol_kg, Cramer et al. 2019)
  • Soil extractable Na (soil_Ext_Na_cmol_kg, Cramer et al. 2019)
  • Soil extractable P (soil_Ext_P_mg_kg, Cramer et al. 2019)
  • Soil pH (Cramer et al. 2019)
  • Total soil C (Cramer et al. 2019)
  • Total soil N (Cramer et al. 2019)
  • Time since fire (generated from MODIS active fire products and CapeNature fire polygons)

Setting up the repo

* This repo requires GitHub credentials. To store those securely...
* Credentials are decrypted with the script decryp_secret.sh

Extras

* Call `targets::tar_renv(extras = character(0))` to write a `_packages.R` file to expose hidden dependencies.
* Call `renv::init()` to initialize the `renv` lockfile `renv.lock` or `renv::snapshot()` to update it.
* Commit `renv.lock` to your Git repository.