Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add function edge nn #12

Merged
merged 24 commits into from
Apr 5, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
export(build_lines)
export(build_polys)
export(edge_dist)
export(edge_nn)
export(get_gbi)
export(group_lines)
export(group_polys)
Expand Down
6 changes: 4 additions & 2 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,6 @@
# v 0.1.8
* update [FAQ](http://spatsoc.robitalec.ca/articles/faq.html) and [Introduction to spatsoc](http://spatsoc.robitalec.ca/articles/intro-spatsoc.html) vignettes adding entries for edge list generating functions.
* added edge list generating function `edge_nn` ([PR 11](https://github.com/ropensci/spatsoc/pull/12))
* added edge list generating function `edge_dist` ([PR 11](https://github.com/ropensci/spatsoc/pull/11))


Expand Down Expand Up @@ -32,10 +34,10 @@
# v 0.1.1 (2018-09-17)

* improvements to package, function documentation
* [FAQ](https://spatsoc.gitlab.io/articles/faq.html) vignette added
* [FAQ](http://spatsoc.robitalec.ca/articles/faq.html) vignette added
* fixed `build_lines` ordering bug to ensure rows are ordered by date time when building lines
* added CODE_OF_CONDUCT.md and CONTRIBUTING.md
* [Using spatsoc in social network analysis](https://spatsoc.gitlab.io/articles/using-in-sna.html) vignette added
* [Using spatsoc in social network analysis](http://spatsoc.robitalec.ca/articles/using-in-sna.html) vignette added

# v 0.1.0 (2018-07-20)

Expand Down
146 changes: 146 additions & 0 deletions R/edge_nn.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,146 @@
#' Nearest neighbour based edge lists
#'
#'
#' \code{edge_nn} returns edge lists defined by the nearest neighbour. The function accepts a \code{data.table} with relocation data, individual identifiers and a threshold argument. The threshold argument is used to specify the criteria for distance between points which defines a group. Relocation data should be in two columns representing the X and Y coordinates.
#'
#'
#' The \code{DT} must be a \code{data.table}. If your data is a \code{data.frame}, you can convert it by reference using \code{\link[data.table:setDT]{data.table::setDT}}.
#'
#' The \code{id}, \code{coords} (and optional \code{timegroup} and \code{splitBy}) arguments expect the names of a column in \code{DT} which correspond to the individual identifier, X and Y coordinates, timegroup (generated by \code{group_times}) and additional grouping columns.
#'
#' The \code{threshold} must be provided in the units of the coordinates. The \code{threshold} must be larger than 0. The coordinates must be planar coordinates (e.g.: UTM). In the case of UTM, a \code{threshold} = 50 would indicate a 50m distance threshold.
#'
#' The \code{timegroup} argument is optional, but recommended to pair with \code{\link{group_times}}. The intended framework is to group rows temporally with \code{\link{group_times}} then spatially with \code{edge_nn} (or grouping functions).
#'
#' The \code{splitBy} argument offers further control over grouping. If within your \code{DT}, you have multiple populations, subgroups or other distinct parts, you can provide the name of the column which identifies them to \code{splitBy}. \code{edge_nn} will only consider rows within each \code{splitBy} subgroup.
#'
#' @param threshold (optional) spatial distance threshold to set maximum distance between an individual and their neighbour.
#' @inheritParams group_pts
#'
#' @return \code{edge_nn} returns a \code{data.table} with three columns: timegroup, ID and NN.
#'
#' The ID and NN columns represent the edges defined by the nearest neighbours (and temporal thresholds with \code{group_times}).
#'
#' If an individual was alone in a timegroup or splitBy, or did not have any neighbours within the threshold distance, they are assigned NA for nearest neighbour.
#'
#' @export
#'
#' @family Edge-list generation
#'
#' @examples
#' # Load data.table
#' library(data.table)
#'
#' # Read example data
#' DT <- fread(system.file("extdata", "DT.csv", package = "spatsoc"))
#'
#' # Cast the character column to POSIXct
#' DT[, datetime := as.POSIXct(datetime, tz = 'UTC')]
#'
#' # Temporal grouping
#' group_times(DT, datetime = 'datetime', threshold = '20 minutes')
#'
#' # Edge list generation
#' edge_nn(DT, id = 'ID', coords = c('X', 'Y'),
#' timegroup = 'timegroup')
#'
#' # Edge list generation using maximum distance threshold
#' edge_nn(DT, id = 'ID', coords = c('X', 'Y'),
#' timegroup = 'timegroup', threshold = 100)
edge_nn <- function(DT = NULL,
id = NULL,
coords = NULL,
timegroup = NULL,
splitBy = NULL,
threshold = NULL) {
# NSE
N <- NULL

if (is.null(DT)) {
stop('input DT required')
}

if (!is.null(threshold)) {
if (!is.numeric(threshold)) {
stop('threshold must be numeric')
}
if (threshold <= 0) {
stop('threshold must be greater than 0')
}
}


if (is.null(id)) {
stop('ID field required')
}

if (length(coords) != 2) {
stop('coords requires a vector of column names for coordinates X and Y')
}

if (any(!(
c(timegroup, id, coords, splitBy) %in% colnames(DT)
))) {
stop(paste0(
as.character(paste(setdiff(
c(timegroup, id, coords, splitBy),
colnames(DT)
), collapse = ', ')),
' field(s) provided are not present in input DT'
))
}

if (any(!(DT[, vapply(.SD, is.numeric, TRUE), .SDcols = coords]))) {
stop('coords must be numeric')
}

if (!is.null(timegroup)) {
if (any(unlist(lapply(DT[, .SD, .SDcols = timegroup], class)) %in%
c('POSIXct', 'POSIXlt', 'Date', 'IDate', 'ITime', 'character'))) {
warning(
strwrap(
prefix = " ",
initial = "",
x = 'timegroup provided is a date/time
or character type, did you use group_times?'
)
)
}
}

if (is.null(timegroup) && is.null(splitBy)) {
splitBy <- NULL
} else {
splitBy <- c(splitBy, timegroup)
if (DT[, .N, by = c(id, splitBy, timegroup)][N > 1, sum(N)] != 0) {
warning(
strwrap(
prefix = " ",
initial = "",
x = 'found duplicate id in a
timegroup and/or splitBy -
does your group_times threshold match the fix rate?'
)
)
}
}

DT[, {

distMatrix <-
as.matrix(stats::dist(.SD[, 2:3], method = 'euclidean'))
diag(distMatrix) <- NA
if (is.null(threshold)) {
wm <- apply(distMatrix, MARGIN = 2, which.min)
} else {
distMatrix[distMatrix > threshold] <- NA
wm <- apply(distMatrix, MARGIN = 2,
function(x) ifelse(sum(!is.na(x)) > 0, which.min(x), NA))
}
list(ID = .SD[[1]][as.numeric(names(wm))]
NN = .SD[[1]][wm])
},
by = splitBy, .SDcols = c(id, coords)]

}

7 changes: 7 additions & 0 deletions R/spatsoc.R
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,13 @@
#' \item \code{\link{group_polys}}
#' }
#'
#' two edge list generating functions:
#'
#' \itemize{
#' \item \code{\link{edge_dist}}
#' \item \code{\link{edge_nn}}
#' }
#'
#' and two social network functions:
#' \itemize{
#' \item \code{\link{randomizations}}
Expand Down
3 changes: 3 additions & 0 deletions man/edge_dist.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

68 changes: 68 additions & 0 deletions man/edge_nn.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading