-
Notifications
You must be signed in to change notification settings - Fork 3
/
P5_rasterPackageIntroduction_IV-v1.Rmd
333 lines (213 loc) · 13.1 KB
/
P5_rasterPackageIntroduction_IV-v1.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
---
title: "P5 Spatial data analysis: introduction to raster processing (part-4)"
author: "Joao Goncalves"
date: "25 November 2017"
output:
html_document:
self_contained: no
pdf_document: default
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(fig.path = "img/")
knitr::opts_chunk$set(fig.width = 5, fig.height = 4.5, dpi = 80)
```
### Background
-------------------------------------------------------------------------------------------------------
In the fourth part of this tutorial series on spatial data analysis using the `raster` package,
we will explore more functionalities, this time related to time-series analysis of raster data.
We will use an Enhanced Vegetation Index ([EVI](https://en.wikipedia.org/wiki/Enhanced_vegetation_index))
5-year time-series (from year 2012 to 2016) from Terra/MODIS satellite/sensor platform for
the Peneda-Geres National Park (PGNP, in NW Portugal) to develop some examples.
This data corresponds to MODIS's __MOD13Q1 data product__ version-006 ([+info here](https://lpdaac.usgs.gov/dataset_discovery/modis/modis_products_table/mod13q1_v006)) which has 250m
of spatial resolution and, 16-days of temporal resolution (more precisely, the product is generated by
maximum daily value composites for each 16-day period). This means that each year has a total of 23
observations. These data was downloaded from the [EarthData](https://search.earthdata.nasa.gov/search)
platform and later assembled and reprojected to WGS 1984 - UTM 29N Coordinate Reference System (CRS)
using MODIS Reprojection Tool - MRT (sorry but these pre-processing steps are outside the scope of this
tutorial `r emo::ji("grin")`).
In this post we will introduce `RasterBrick`s, a multi-layer raster object typically created from a
multi-layer (or multi-band) file although they can also exist entirely in memory. These objects are
similar to `RasterStack`s, but processing time should be shorter when using a `RasterBrick` (irrespective
if values are on disk or in memory). However these objects are less flexible as they can only point to a
single file, while `RasterStacks` can point to multiple different files.
Besides the `raster` package we will also work with `rts`, which provides classes and methods for manipulating
and processing raster time-series data (e.g. a time-series of satellite images). A raster time-series object
is created by combining a `RasterStack` or `RasterBrick` object (from `raster` package) and a set of dates of class
`POSIXct`, `POSIXt`, `Date`, `timeDate`. The time information in `rts` is then handled by a `xts` object.
The function `rts` is used to build a raster time-series (either a `RasterBrickTS` or `RasterStackTS`) which
is simply a S4 object composed by two slots:
- slot [_raster_]: a `RasterStack` or `RasterBrick` object;
- slot [_time_]: a `xts` object with dates for each layer in the raster object.
One key advantage of using `rts` package is to facilitate the subset, extraction or application of functions
by specific periods using date notation (instead of integer or name indices like in `raster`).
For more information on raster data processing see [here](http://r-exercises.com/tags/raster-data), as well as the [tutorial part-1](https://www.r-exercises.com/2017/11/29/spatial-data-analysis-introduction-to-raster-processing-part-1), [tutorial part-2](https://www.r-exercises.com/2017/12/13/spatial-data-analysis-introduction-to-raster-processing-part-2), and, [tutorial part-3](https://www.r-exercises.com/2018/01/10/spatial-data-analysis-introduction-to-raster-processing-part-3), of this
series.
### Making a raster time-series object with rts package
-------------------------------------------------------------------------------------------------------
First up: download and uncompress the sample data! `r emo::ji("thumbsup")` The .zip archive contains
a multi-layer GeoTIFF file with a 16-day (composite) EVI time-series from 2012 to 2016. This means
that we have a total of 23 images per year and a total of 115 layers in the file (for the whole
five-year series).
```{r P5_download_data, message=FALSE, warning=FALSE, eval=FALSE, echo=TRUE}
## Create a folder named data-raw inside the working directory to place downloaded data
if(!dir.exists("./data-raw")) dir.create("./data-raw")
## If you run into download problems try changing: method = "wget"
download.file("https://github.com/raw/joaofgoncalves/R_exercises_raster_tutorial/master/data/MODIS_EVI_TS_PGNP_MultiBand.zip", "./data-raw/MODIS_EVI_TS_PGNP_MultiBand.zip", method = "auto")
## Uncompress the zip file
unzip("./data-raw/MODIS_EVI_TS_PGNP_MultiBand.zip", exdir = "./data-raw")
```
Creating the `RasterBrick` from the downloaded data is really easy and similar to using the `stack`.
The main advantage is that the input is just one multi-layer file, instead of a vector with multiple files
as usually when using `stack`. We will also change the names of each layer for making them more readable.
Let's see how this goes:
```{r P5_create_raster_brick, message=FALSE, warning=FALSE}
library(raster)
# Load the raster data into a RasterBrick object
rst <- brick("./data-raw/MOD13Q1.2012_2016.PGNP_250m_EVI_16days.tif")
names(rst) <- paste("EVI",1:nlayers(rst),sep="_")
```
MODIS data products such as _MOD13Q1_ (and others alike) use 7-digit long dates composed by the year (4 digits)
followed by the Julian day (3 digits) to identify the reference date of an image composite. For example the date
code 2012001 corresponds to 2012-01-01 (in YYYY-mm-dd format), and, 2012161 to 2012-06-09. Usually these
dates are inscribed in image files or metadata but, since we don't have them, we will generate them first
and then process them so we can obtain a `Date` object for each layer. We can then use these properly
formatted dates in the `rts` function to create a raster time-series (see `?rts` for more details).
```{r P5_generate_MODIS_format_dates}
padIt <- function(x)
if(x<10) paste("00",x,sep="") else if(x<100 & x>=10) paste("0",x, sep="") else return(as.character(x))
padWithZeros <- function(x)
sapply(x, FUN = padIt)
# Generate the MODIS-like dates for each layer
julDay <- padWithZeros(rep(seq(from = 1,to = 365,by = 16),5))
yrs <- as.character(rep(2012:2016, each = 23))
MODISYrJday <- paste(yrs, julDay, sep="")
# Print out the MODIS dates for the year 2012
print(MODISYrJday[1:23])
```
Now that we have our MODIS-like dates (in year and Julian day format) we need to convert them into
a more 'human-readable' format also accepted by `rts` function:
```{r P5_generate_dates_for_rts}
# Extract the year
MOD.getYear<-function(x)
as.integer(sapply(x,FUN=function(x) substr(x,1,4)))
# Extract de julian day
MOD.getDOY<-function(x)
as.integer(sapply(x,FUN=function(x) substr(x,5,7)))
# Process the MODIS-like date into YYYY-mm-dd format as Date object
MOD.getDate<-function(x)
as.Date(sapply(x,FUN=function(x) as.character(as.Date(MOD.getDOY(x)-1,origin=paste(MOD.getYear(x),"01-01",sep="-")))))
MODdates <- MOD.getDate(MODISYrJday)
class(MODdates)
# Check the result for year 2012
print(MODdates[1:23])
```
With the dates vector (class `Date`) for each layer we can generate now make a raster time-series
with `rts` constructor:
```{r P5_make_rts_object, message=FALSE, warning=FALSE}
# Install the rts package
if(!("rts" %in% installed.packages()[,1]))
install.packages(c("rts"), dependencies = TRUE)
library(rts)
rstTS <- rts(rst, MODdates)
```
### Subsetting raster time-series
-------------------------------------------------------------------------------------------------------
With the `RasterBrickTS` object created we can extract subsets of the data for particular periods
or dates (check `?rts::subset` for more details):
```{r P5_rasterTS_subset, fig.width=6.5, eval=FALSE, echo=TRUE}
# Subset a specific period
rstTSsubset1 <- subset(rstTS,"2013-05-15/2014-08-25")
# Subset the whole year of 2012
rstTSsubset2 <- subset(rstTS,"2012")
# Subset years from 2013 to 2014
rstTSsubset3 <- subset(rstTS,"2013/2014")
# Subset all years from (and including) 2014 to the series end
rstTSsubset4 <- subset(rstTS,"2014/")
# Subset all to the end of 2014
rstTSsubset5 <- subset(rstTS,"/2014")
# Subset all to the end of July 2014
rstTSsubset6 <- subset(rstTS,"/2014-06")
# Subset a specific month
rstTSsubset7 <- subset(rstTS,"2016-05")
# Plot the May 2016 data
plot(rstTSsubset7)
```
![](https://github.com/raw/joaofgoncalves/R_exercises_raster_tutorial/master/img/P5_rasterTS_subset-1.png)
As you can see the `subset` function is pretty handy for extracting parts of a time-series.
The date parameter format must left-specified with respect to the standard ISO:8601 time format
"CCYY-MM-DD HH:MM:SS". It is also possible to specify a range of times via the index-based
subsetting, using ISO-recommended "/" as the range operator. Generally it works with "from/to"
dates, where using both is optional. If one side is missing, it is interpreted as a request to
retrieve layers from the beginning, or through the end of the raster time series.
### Apply functions over the time-series
-------------------------------------------------------------------------------------------------------
One of the best application of the `rts` package for processing raster time-series is the ability
to apply a specified function to distinct periods. Let's see how we can do this:
```{r P5_rts_apply_family, fig.width=6.5, eval=FALSE, echo=TRUE}
# Apply function to each quarter
# Mean
rstTS_quarterlyMN <- apply.quarterly(rstTS, FUN = mean, na.rm=TRUE)
# Standard-deviation
rstTS_quarterlySD <- apply.quarterly(rstTS, FUN = sd, na.rm=TRUE)
# Apply function to each year
# Mean
rstTS_yearlyMN <- apply.yearly(rstTS, FUN = mean, na.rm=TRUE)
# Standard-deviation
rstTS_yearlySD <- apply.yearly(rstTS, FUN = sd, na.rm=TRUE)
# Plot the time-series for annual EVI
plot(rstTS_yearlyMN)
```
![](https://github.com/raw/joaofgoncalves/R_exercises_raster_tutorial/master/img/P5_rts_apply_family-1.png)
As we can see, these functions can be very useful for applying specific functions over
certain calendar periods without the hassle of having to specify indices - you simply
work with dates which is nicer! `r emo::ji("thumbsup")` `r emo::ji("wink")`
However in certain cases you may want to work with the "simpler" / more general functions
that the `raster` package offers to apply functions over a raster time-series. In that case
`calc` and `stackApply` can be used.
The difference between these functions is that `calc` applies the defined function
over the whole series pixel-by-pixel while `stack` applies a function on subsets of a
`RasterStack` or `RasterBrick`. For this function, the layers to be combined are indicated
by an integer vector with indices. The function used should return a single value, and the number
of layers in the output `Raster*` equals the number of unique values in indices. In the
opposite hand, `calc` allows to have a function that outputs multiple values and, in that case,
a multi-layer `Raster*` object is returned with one layer per output value.
Also, keep in mind that for large objects `calc` will compute values chunk by chunk.
This means that for the result of fun to be correct it should not depend on having
access to _all_ values at once. Let's grab some examples.
Using the `calc` function to provide a global average and standard-deviation
of the entire raster time-series:
```{r P5_use_calc_for_mean, eval=FALSE, echo=TRUE}
rstMean <- calc(rst, fun = mean)
rstStd <- calc(rst, fun = sd)
```
Now, let's use `calc` for a multi-value output with a specific function:
```{r P5_use_calc_for_quantiles}
# Calculate quantiles
# NOTE: you have to use na.rm=TRUE to make this work
rstQuantiles <- calc(rst, fun = function(x,...) as.numeric(quantile(x,probs=c(0.05, 0.5, 0.95),...)), na.rm=TRUE)
print(rstQuantiles)
```
_Et voilà!_ `r emo::ji("wow")` We have three layers, one for each calculated quantile.
Now, switching for `stackApply` we will emulate the behavior of rts `apply.yearly` function:
```{r P5_use_stackApply_for_mean, echo=TRUE, eval=FALSE}
rstYrMean <- stackApply(rst, fun=mean, indices = rep(1:5,each=23))
```
Well... To be honest I was not aware of this but, strangely `rts` function took much more time to
calculate the annual averages... Check out the comparison below:
```{r P5_compare_times_between_rts_and_raster}
# stackApply with RasterBrick
system.time({rstYrMean <- stackApply(rst, fun=mean, indices = rep(1:5,each=23))})
# apply.yearly with RasterBrickTS
system.time({rstTS_yearlyMN <- apply.yearly(rstTS, FUN = mean, na.rm=TRUE)})
```
Let's check if the data is equal to be sure:
```{r P5_check_layers}
for(i in 1:nlayers(rstYrMean))
print(compareRaster(rstYrMean[[i]], rstTS_yearlyMN@raster[[i]], values=TRUE))
```
Yup, all the layers are equal...
Not sure why this is happening though... `r emo::ji("thinking")`. Do you have some idea / comment on this?
This concludes our exploration of the raster and the rts packages for this post. We have covered a couple
of useful things for processing raster time-series. Hope you find it useful! `r emo::ji("smile")`
`r emo::ji("thumbsup")` `r emo::ji("thumbsup")`