Skip to content

Commit

Permalink
Merge pull request #180 from birdflow-science/more-flux-memory
Browse files Browse the repository at this point in the history
Update calc_flux() and plot_flux()
  • Loading branch information
ethanplunkett committed Apr 12, 2024
2 parents 37dfdd3 + 61fcdfa commit a42cf70
Show file tree
Hide file tree
Showing 13 changed files with 187 additions and 56 deletions.
6 changes: 3 additions & 3 deletions DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,10 +1,9 @@
Package: BirdFlowR
Title: Predict and Visualize Bird Movement
Version: 0.1.0.9055
Version: 0.1.0.9056
Authors@R:
c(person("Ethan", "Plunkett", email = "plunkett@umass.edu", role = c("aut", "cre"),
comment = c(ORCID = "0000-0003-4405-2251")),
person("Daniel", "Sheldon", role = c("aut")) )
comment = c(ORCID = "0000-0003-4405-2251")))
Description: BirdFlowR predicts bird movement and distribution from previously fitted models.
License: MIT + file LICENSE
Encoding: UTF-8
Expand All @@ -17,6 +16,7 @@ Suggests:
knitr,
ragg,
rmarkdown,
SparseArray,
testthat (>= 3.0.0),
withr
Config/testthat/edition: 3
Expand Down
20 changes: 14 additions & 6 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,13 @@
# BirdFlowR 0.1.0.9056
2024-04-04
## Flux III
* `is_between()` now uses a `SparseArray::SparseArray()` for the logical array.
to reduce memory usage
* Labeling updates to `plot_flux()` and `animate_flux()`.


# BirdFlowR 0.1.0.9055
2023-03-29
2024-03-29
## Flux II
* Flux values are now divided by the radius to standardize units at
at P/km/week/ where P is the proportion of the population.
Expand Down Expand Up @@ -53,7 +61,7 @@ calculate and work with non-directional flux / net movement.


# BirdFlowR 0.1.0.9053
2023-03-15
2024-03-15

* New `shrink_birdlfow()` reverses `extend_birdflow()`, returning the model
to it's original extent.
Expand All @@ -62,14 +70,14 @@ calculate and work with non-directional flux / net movement.
the original.

# BirdFlowR 0.1.0.9052
2023-03-14
2024-03-14

* `export_birdflow()` gets new arguments to control output file names.
Default values mimic old behavior.


# BirdFlowR 0.1.0.9051
2023-02-28
2024-02-28

* Add `get_mask()`
* Add `extend_bird_flow()`
Expand All @@ -89,12 +97,12 @@ calculate and work with non-directional flux / net movement.
appropriately with both file structures.

# BirdFlowR 0.1.0.9050
2023-02-22
2024-02-22

* Add test for `preprocess_species()` with a non-default CRS.

# BirdFlowR 0.1.0.9049
2023-01-11
2024-01-11

#### Drop "_clip"

Expand Down
2 changes: 1 addition & 1 deletion R/animate_distr.R
Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,7 @@
#' # Display
#' print(gif)
#'
#' # mv
#' # Save
#' gif_file <- tempfile("animation", fileext = ".gif")
#' gganimate::save_animation(gif, gif_file)
#' file.remove(gif_file) # cleanup
Expand Down
2 changes: 1 addition & 1 deletion R/animate_flux.R
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ animate_flux <- function(flux, bf, title = species(bf), ...) {

anim <- p +
ggplot2::facet_null() +
gganimate::transition_manual(frames = .data$date) +
gganimate::transition_manual(frames = .data$label) +
ggplot2::labs(title = title,
subtitle = "{current_frame}")

Expand Down
49 changes: 43 additions & 6 deletions R/calc_flux.R
Original file line number Diff line number Diff line change
Expand Up @@ -25,16 +25,34 @@
#' major geographic features such as coasts, large lakes, mountain ranges, and
#' ecological system boundaries that might result in non-linear migration paths.
#'
#' `calc_flux()` assumes that a line passes by a point if any part of the line
#' is within the radius of the point. This assumption breaks down if the
#' radius is much larger than the movement lengths as points that are ahead of
#' the line may still be within a radius of the line. In the extreme a large
#' enough radius on a point outside of the entire range will capture all the
#' movement. This isn't a problem with
#' the default points and radius as the points ahead of the line will never be
#' within the radius.
#'
#' The default points for `calc_flux()` are aligned with the cell centers as
#' are the movement lines. This alignment means that a very small radius will
#' result in an overestimate of flux. The default value of half the cell size
#' is sufficient for this not to be a problem, as we are capturing and
#' standardizing the units based on the entire cell area that that point
#' represents.
#'
#' @param bf A BirdFlow model
#' @param points A set of points to calculate movement through. If `points` is
#' `NULL` they will default to the BirdFlow model cells that are either active
#' or fall between two active cells. Should be a data frame with `x` and `y`
#' or fall between two active cells. Otherwise a data frame with `x` and `y`
#' columns containing point coordinates in [crs(bf)][crs()].
#' @param radius The radius in meters around the points, used to determine if
#' a path is passing through the point.
#' @param radius The radius in meters around the points used to assess whether
#' a movement line passes by (or through) the point. If a point is farther than
#' `radius` from a great circle line between two cells centers then it is not
#' between them.
#' @param n_directions The number of directional bins to use for recording
#' movement direction. Must be either `1` indicating no direction information
#' or an even number.
#' or an even number. This is a placeholder, currently only `1` is supported.
#' @param format The format to return the results in one of:
#' \describe{
#' \item{`"points"`}{Returns a list with `flux` a matrix or array of
Expand Down Expand Up @@ -68,7 +86,12 @@
#' }
#'
calc_flux <- function(bf, points = NULL, radius = NULL, n_directions = 1,
format = NULL, batch_size = 1e6, check_radius = TRUE) {
format = NULL, batch_size = 5e5, check_radius = TRUE) {

if (!requireNamespace("SparseArray", quietly = TRUE)) {
stop("The SparseArray package is required to use calc_flux(). ",
"Please install it prior to calling this function.")
}

if (n_directions != 1)
stop("Only one directional flux is supported at the moment.")
Expand All @@ -90,6 +113,8 @@ calc_flux <- function(bf, points = NULL, radius = NULL, n_directions = 1,
points <- result$points
radius_km <- result$radius / 1000

bf_msg("Calculating Flux\n")

timesteps <- lookup_timestep_sequence(bf)
transitions <- lookup_transitions(bf)
marginals <- gsub("^T", "M", transitions)
Expand All @@ -104,7 +129,10 @@ calc_flux <- function(bf, points = NULL, radius = NULL, n_directions = 1,


n_pts <- dim(between)[3]

bf_msg(" Processing Marginals\n")
for (i in seq_along(marginals)) {

from <- timesteps[i]
to <- timesteps[i + 1]
mar <- get_marginal(bf, marginals[i])
Expand All @@ -116,10 +144,19 @@ calc_flux <- function(bf, points = NULL, radius = NULL, n_directions = 1,
stopifnot(all(dim(sb)[1:2] == dim(mar))) # verify

for (j in seq_len(n_pts)){
net_movement[j, i] <- sum(mar[sb[, , j]])
# This is a little hacky the marginal is a sparse Matrix::Matrix
# or a standard Matrix (depending on whether bf is sparse)
# sb and between are SparseArray::SparseArray
# SparseArray doesn't support logical sub-setting ("yet"),
# Matrix does. The solution is to convert the selection
# to a standard matrix or a sparse Matrix. The first is empirically a
# lot faster
sel <- as.matrix(sb[, , j]) #, sparse = TRUE)
net_movement[j, i] <- sum(mar[sel])
}
}

bf_msg(" Formatting flux\n")
# Standardize to P of population to pass through KM of transect in a week
net_movement <- net_movement / (radius_km * 2)

Expand Down
81 changes: 57 additions & 24 deletions R/is_between.R
Original file line number Diff line number Diff line change
Expand Up @@ -61,9 +61,15 @@ if (FALSE) {
#' \item{radius}{The radius of the circle in meters.}
#' @keywords internal
is_between <- function(bf, points = NULL, radius = NULL, n_directions = 1,
skip_unconnected = TRUE, batch_size = 1e6,
skip_unconnected = TRUE, batch_size = 1e5,
check_radius = TRUE) {
bf_msg("Generating between array.\n")

if (!requireNamespace("SparseArray", quietly = TRUE)) {
stop("The SparseArray package is required to use is_between(). ",
"Please install it prior to calling this function.")
}

# Force use of s2
o_use_s2 <- sf::sf_use_s2()
on.exit(sf::sf_use_s2(o_use_s2))
Expand All @@ -90,14 +96,18 @@ is_between <- function(bf, points = NULL, radius = NULL, n_directions = 1,
"Set check_radius to FALSE to ignore this advice.")
}
}
bf_msg("Creating points\n")

if (is.null(points)) {
bf_msg(" Creating points\n")


# Create points at all cell centers in the rectangular raster
points <- rasterize_distr(get_distr(bf, 1), bf, format = "dataframe")
points <- points[, c("x", "y", "i")]
active <- points[!is.na(points$i), , drop = FALSE]

bf_msg(" Transforming to spherical coordinates\n")

# Convert to lat long (spherical) coordinates
active_sph <- sf::st_as_sf(active, coords = c("x", "y"), crs = crs(bf)) |>
sf::st_transform(crs = "EPSG:4326")
Expand Down Expand Up @@ -128,19 +138,18 @@ is_between <- function(bf, points = NULL, radius = NULL, n_directions = 1,
points <- points[sv, , drop = FALSE]
}

bf_msg(" Preparing points\n")





bf_msg("Preparing points\n")

# Initialize array (All FALSE) to hold betweenness
between <- array(data = FALSE,
dim = c(n_active(bf), n_active(bf), nrow(points)),
dimnames = list(from = paste0("F_", seq_len(n_active(bf))),
to = paste0("T_", seq_len(n_active(bf))),
loc = paste0("L_", seq_len(nrow(points)))))
# Initialize sparse array (All FALSE) to hold betweenness
# This is a hack because there's no creation method that allows setting a
# dimension, but I can make an empty "random" array.
between <- SparseArray::randomSparseArray(
dim = c(n_active(bf), n_active(bf), nrow(points)),
density = 0) != 0
dimnames(between) <- list(from = paste0("F_", seq_len(n_active(bf))),
to = paste0("T_", seq_len(n_active(bf))),
loc = paste0("L_", seq_len(nrow(points))))


# Generate table of active cell x, y, and i (in birdflow CRS)
Expand Down Expand Up @@ -175,9 +184,6 @@ is_between <- function(bf, points = NULL, radius = NULL, n_directions = 1,
# They represent stop overs or seasonal residence, not migratory movement
pairs <- pairs[pairs$from != pairs$to, , drop = FALSE]




# Drop pairs that aren't ever connected
if (skip_unconnected) {
# Create matrix indicating which active cells are connected to
Expand All @@ -202,21 +208,21 @@ is_between <- function(bf, points = NULL, radius = NULL, n_directions = 1,
to = col(ever_connected)[ever_connected])
connected_pairs$id <- with(connected_pairs, paste(from, "-", to))


pairs <- pairs[pairs$id %in% connected_pairs$id, , drop = FALSE]

}
# To avoid

# Add batch number
# Add batch number to pairs of points
pairs$batch <- rep(seq_len(nrow(pairs)),
each = batch_size,
length.out = nrow(pairs))
n_batch <- max(pairs$batch)

bf_msg("Processing points in ", n_batch, " batches \n")
# Process pairs of active points in batches - converting to great circle
# lines connecting each pair and then determining which evaluation points
# are within the distance
bf_msg(" Processing points in ", n_batch, " batches \n")
for (batch in seq_len(n_batch)) {
bf_msg(round(batch / n_batch * 100, 2), "%\n")
b_pairs <- pairs[pairs$batch == batch, , drop = FALSE]
# Lines connecting pairs of points
lines <- vector(mode = "list", length = nrow(b_pairs))
Expand All @@ -233,11 +239,38 @@ is_between <- function(bf, points = NULL, radius = NULL, n_directions = 1,
# great circle) to each of the great circle lines connecting the pairs.
within_dist <-
sf::st_is_within_distance(lines_sfc, points_sph,
dist = units::set_units(radius, "m"))
dist = units::set_units(radius, "m"))



# Update between
#
# There seems to be overhead for interacting with the sparse array
# Making a matrix with three columns which will contain the index
# of every TRUE cell and then updating between once is much
# faster than updating between in a loop.


# Make an index matrix of additional TRUE cells
# each row is the three dimensional index of a TRUE (between) cell
add_list <- vector(mode = "list", length = nrow(b_pairs))
for (i in seq_len(nrow(b_pairs))) {
between[b_pairs$from[i], b_pairs$to[i], within_dist[[i]]] <- TRUE
between[b_pairs$to[i], b_pairs$from[i], within_dist[[i]]] <- TRUE
n <- length(within_dist[[i]])
add_list[[i]] <- cbind(rep(b_pairs$from[i], n),
rep(b_pairs$to[i], n),
within_dist[[i]])
}

# Forward transitions
add <- do.call(rbind, add_list)
between[add] <- TRUE # individual cell indexing

# Backwards transitions
add <- add[, c(2, 1, 3)]
between[add] <- TRUE

bf_msg(" ", round(batch / n_batch * 100, 2), "%\n")

} # End batch loop

if (FALSE) {
Expand Down
26 changes: 23 additions & 3 deletions R/plot_flux.R
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,15 @@ plot_flux <- function(flux,
distr <- apply(distr, 2, function(x) x / max(x, na.rm = TRUE))
}

# Add "<Month> <mday>" labels as ordered factor
dates <- lubridate::as_date(flux$date)
label <- paste0(lubridate::month(dates, label = TRUE, abbr = FALSE), " ",
lubridate::mday(dates))
ud <- sort(unique(dates))
ul <- paste0(lubridate::month(ud, label = TRUE, abbr = FALSE), " ",
lubridate::mday(ud))
label <- ordered(label, levels = ul)
flux$label <- label

if (is.null(limits)) {
limits <- range(flux$flux, na.rm = TRUE)
Expand Down Expand Up @@ -104,21 +113,23 @@ plot_flux <- function(flux,
ggplot2::ggplot(ggplot2::aes(x = .data$x, y = .data$y,
fill = .data$flux)) +
ggplot2::geom_raster() +
ggplot2::scale_fill_gradientn(colors = gradient_colors)
ggplot2::scale_fill_gradientn(colors = gradient_colors, name = value_label)


# Add facet wrap and title
if (length(transitions > 1)) {
# Multiple transitions, facet wrap on date and add species title
p <- p +
ggplot2::facet_wrap(ggplot2::vars(.data$date)) +
ggplot2::facet_wrap(ggplot2::vars(.data$label)) +
ggplot2::ggtitle(title)
} else {
# Single transition add species title AND date subtitle
p <- p +
ggplot2::ggtitle(title, subtitle = flux$date[1])
ggplot2::ggtitle(title, subtitle = flux$label[1])
}



# Add coastline
if (!is.null(coast_color) && !is.null(coast_linewidth)) {

Expand All @@ -131,5 +142,14 @@ plot_flux <- function(flux,
color = coast_color)
}

# coord_sf is required to adjust coordinates while using geom_sf
# Here we are preventing expanding the extent of the plot.
# Setting the CRS is only necessary when the coastline isn't plotted because
# of NULL coast_color or coast_linewidth
p <- p +
ggplot2::coord_sf(expand = FALSE, crs = crs(bf)) +
ggplot2::theme(axis.title = ggplot2::element_blank(),
strip.background = ggplot2::element_blank())

return(p)
}
Loading

0 comments on commit a42cf70

Please sign in to comment.