Skip to content

Commit

Permalink
Aug 29
Browse files Browse the repository at this point in the history
  • Loading branch information
chenyangkang committed Aug 29, 2023
1 parent be723c8 commit 83239b5
Show file tree
Hide file tree
Showing 2 changed files with 11 additions and 11 deletions.
12 changes: 6 additions & 6 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,15 +2,15 @@
Daily Abundance & phenology estimation using eBird citizen science data

## Brief introduction
Bird STEM is a AdaSTEM model for daily abundance estimation using eBird citizen science data. It leverage the "adjacency" information of surrounding bird observation in space and time, to predict the occurence and abundance of target spatial-temporal point. In the demo, we use a two-step hurdle model with XGBoostClassifier for occurence modeling and XGBoostRegressor for abundance modeling.
Bird STEM is an AdaSTEM model for daily abundance estimation using eBird citizen science data. It leverages the "adjacency" information of surrounding bird observation in space and time, to predict the occurence and abundance of target spatial-temporal point. In the demo, we use a two-step hurdle model as "base model", with XGBoostClassifier for occurence modeling and XGBoostRegressor for abundance modeling.

User can define the size of stixel (spatial temporal pixel) in terms of space and time. Larger stixel guarantee generalizability but lose precision in fine resolution; Smaller stixel may have better predictability and reduced extrapolability for points outside the stixel.
User can define the size of stixel (spatial temporal pixel) in terms of space and time. Larger stixel guarantees generalizability but loses precision in fine resolution; Smaller stixel may have better predictability in the exact area but reduced extrapolability for points outside the stixel.

In the demo, we first split the training data using temporal sliding windows with size of 50 DOY and step of 20 DOY (`temporal_start = 0`, `temporal_end=366`, `temporal_step=20`, `temporal_bin_interval = 50`). For each temporal slice, a spatial gridding is applied, where we force the stixel to be split into smaller 1/4 pieces if the edge is larger than 50 units (measured in longitude and latitude, `grid_len_lon_upper_threshold=50`, `grid_len_lat_upper_threshold=50`), and stop splitting to prevent the edge length to shrink to less than 10 units (`grid_len_lon_lower_threshold=10`, `grid_len_lat_lower_threshold=10`) or containing less than 50 checklists (`points_lower_threshold = 50`).
In the demo, we first split the training data using temporal sliding windows with size of 50 DOY and step of 20 DOY (`temporal_start = 0`, `temporal_end=366`, `temporal_step=20`, `temporal_bin_interval = 50`). For each temporal slice, a spatial gridding is applied, where we force the stixel to be split into smaller 1/4 pieces if the edge is larger than 50 units (measured in longitude and latitude, `grid_len_lon_upper_threshold=50`, `grid_len_lat_upper_threshold=50`), and stop splitting to prevent the edge length to shrink below 10 units (`grid_len_lon_lower_threshold=10`, `grid_len_lat_lower_threshold=10`) or containing less than 50 checklists (`points_lower_threshold = 50`).

This process is excecuted 10 times (`ensemble_fold = 10`), each time with random jitter and random rotation of the gridding, generating 10 ensembles. In the prediciton phase, only spatial-temporal points with more than 7 (`min_ensemble_required = 7`)ensemble usable are predicted (otherwise, set as `np.nan`)
This process is excecuted 10 times (`ensemble_fold = 10`), each time with random jitter and random rotation of the gridding, generating 10 ensembles. In the prediciton phase, only spatial-temporal points with more than 7 (`min_ensemble_required = 7`) ensembles usable are predicted (otherwise, set as `np.nan`).

Fitting and prediction follow the convention of sklearn `estimator` class:
Fitting and prediction methods follow the convention of sklearn `estimator` class:

```py
## fit
Expand All @@ -21,7 +21,7 @@ pred_mean, pred_std = model.predict(X_test)
pred_mean = np.where(pred_mean>0, pred_mean, 0)
```

Where the pred_mean and pred_std are the mean and standard deviation of the predicted value across ensembles.
Where the pred_mean and pred_std are the mean and standard deviation of the predicted values across ensembles.


## Full usage:
Expand Down
10 changes: 5 additions & 5 deletions docs/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,15 +6,15 @@
Daily Abundance & phenology estimation using eBird citizen science data

## Brief introduction
Bird STEM is a AdaSTEM model for daily abundance estimation using eBird citizen science data. It leverage the "adjacency" information of surrounding bird observation in space and time, to predict the occurence and abundance of target spatial-temporal point. In the demo, we use a two-step hurdle model with XGBoostClassifier for occurence modeling and XGBoostRegressor for abundance modeling.
Bird STEM is an AdaSTEM model for daily abundance estimation using eBird citizen science data. It leverages the "adjacency" information of surrounding bird observation in space and time, to predict the occurence and abundance of target spatial-temporal point. In the demo, we use a two-step hurdle model as "base model", with XGBoostClassifier for occurence modeling and XGBoostRegressor for abundance modeling.

User can define the size of stixel (spatial temporal pixel) in terms of space and time. Larger stixel guarantee generalizability but lose precision in fine resolution; Smaller stixel may have better predictability and reduced extrapolability for points outside the stixel.
User can define the size of stixel (spatial temporal pixel) in terms of space and time. Larger stixel guarantees generalizability but loses precision in fine resolution; Smaller stixel may have better predictability in the exact area but reduced extrapolability for points outside the stixel.

In the demo, we first split the training data using temporal sliding windows with size of 50 DOY and step of 20 DOY (`temporal_start = 0`, `temporal_end=366`, `temporal_step=20`, `temporal_bin_interval = 50`). For each temporal slice, a spatial gridding is applied, where we force the stixel to be split into smaller 1/4 pieces if the edge is larger than 50 units (measured in longitude and latitude, `grid_len_lon_upper_threshold=50`, `grid_len_lat_upper_threshold=50`), and stop splitting to prevent the edge length to shrink to less than 10 units (`grid_len_lon_lower_threshold=10`, `grid_len_lat_lower_threshold=10`) or containing less than 50 checklists (`points_lower_threshold = 50`).
In the demo, we first split the training data using temporal sliding windows with size of 50 DOY and step of 20 DOY (`temporal_start = 0`, `temporal_end=366`, `temporal_step=20`, `temporal_bin_interval = 50`). For each temporal slice, a spatial gridding is applied, where we force the stixel to be split into smaller 1/4 pieces if the edge is larger than 50 units (measured in longitude and latitude, `grid_len_lon_upper_threshold=50`, `grid_len_lat_upper_threshold=50`), and stop splitting to prevent the edge length to shrink below 10 units (`grid_len_lon_lower_threshold=10`, `grid_len_lat_lower_threshold=10`) or containing less than 50 checklists (`points_lower_threshold = 50`).

This process is excecuted 10 times (`ensemble_fold = 10`), each time with random jitter and random rotation of the gridding, generating 10 ensembles. In the prediciton phase, only spatial-temporal points with more than 7 (`min_ensemble_required = 7`)ensemble usable are predicted (otherwise, set as `np.nan`)
This process is excecuted 10 times (`ensemble_fold = 10`), each time with random jitter and random rotation of the gridding, generating 10 ensembles. In the prediciton phase, only spatial-temporal points with more than 7 (`min_ensemble_required = 7`) ensembles usable are predicted (otherwise, set as `np.nan`).

Fitting and prediction follow the convention of sklearn `estimator` class:
Fitting and prediction methods follow the convention of sklearn `estimator` class:

```py
## fit
Expand Down

0 comments on commit 83239b5

Please sign in to comment.