Update CAWDL for new website data availability; pause CASGEM module #105

Open: narlesky wants to merge 4 commits into base: main

narlesky (Member) commented Dec 1, 2023

Related Issues

Closes #70
Addresses #98

Description

Extends the collect.dwr.cawdl module to work with the new data endpoints (CAWDL continuous data has migrated to https://wdlstorageaccount.blob.core.windows.net). Continuous data records cover both surface water and groundwater, so the methods are refactored to serve both data streams.

Temporarily removes the collect.dwr.casgem module: its collect methods are obsolete due to changes in the CASGEM website structure. The module will be revived when public endpoints are available via CASGEM's website.

Breaking Changes

Key changes in collect.dwr.cawdl:

  • adds get_cawdl_site_detail_url: helper function to produce URL for station detail page (interactive web page)
  • adds get_cawdl_site_report_url: helper function to produce URL for text site report
  • adds get_cawdl_continuous_data_url: helper function to produce URL for timeseries
  • adds get_cawdl_dataset_overview: access to summary datasets on CNRA open data portal related to CAWDL records
  • removes get_cawdl_data: continuous data records appear to be the main
  • replaces get_cawdl_surface_water_data with get_cawdl_continuous_data, which provides start and end date filters
  • replaces get_cawdl_surface_water_por with get_cawdl_continuous_data
  • replaces get_cawdl_surface_water_site_report with get_cawdl_site_detail and get_cawdl_continuous_data_site_report
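
To illustrate the start and end date filters mentioned above, here is a minimal sketch of how such a filter might behave on timestamped records. The helper name filter_by_date, the (timestamp, value) record shape, and the inclusive treatment of both bounds are assumptions made for illustration; they are not taken from the collect.dwr.cawdl source.

```python
import datetime as dt


def filter_by_date(records, start=None, end=None):
    """Hypothetical helper: keep (timestamp, value) pairs within [start, end].

    Both bounds are treated as inclusive (an assumption); a bound of None
    leaves that side of the range open.
    """
    return [
        (ts, value)
        for ts, value in records
        if (start is None or ts >= start) and (end is None or ts <= end)
    ]


# made-up daily values, for illustration only
records = [
    (dt.datetime(2019, 12, 31), 10.0),
    (dt.datetime(2020, 1, 1), 11.0),
    (dt.datetime(2020, 1, 15), 12.5),
    (dt.datetime(2020, 2, 1), 13.0),
    (dt.datetime(2020, 2, 2), 14.0),
]

january = filter_by_date(
    records,
    start=dt.datetime(2020, 1, 1),
    end=dt.datetime(2020, 2, 1),
)
```

With inclusive bounds, january keeps the three records dated 2020-01-01 through 2020-02-01 and drops the two outside that range.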

Key changes in collect.dwr.casgem:

  • renames casgem.casgem_scraper to casgem.casgem
  • removes selenium dependency
  • refactors casgem.get_casgem_data to raise NotImplementedError
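
Since the paused CASGEM function now raises NotImplementedError, downstream callers may want to guard against it. The sketch below uses a stand-in function (get_casgem_data_stub) and a hypothetical wrapper (fetch_or_none); neither name nor signature comes from the collect package, and the error message is invented for illustration.

```python
def get_casgem_data_stub(*args, **kwargs):
    """Stand-in for a paused collector (hypothetical; not the real signature)."""
    raise NotImplementedError(
        'CASGEM collection is paused until public endpoints are available'
    )


def fetch_or_none(fetch, *args, **kwargs):
    """Hypothetical wrapper: call a collector, returning None if it is paused."""
    try:
        return fetch(*args, **kwargs)
    except NotImplementedError:
        # the collector exists but is intentionally disabled
        return None


# 'EXAMPLE_WELL_ID' is a placeholder, not a real CASGEM identifier
result = fetch_or_none(get_casgem_data_stub, 'EXAMPLE_WELL_ID')
```

Raising NotImplementedError rather than silently returning empty data makes the pause visible to callers, who can then choose to skip the feed or surface the message.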

Example Usage

import datetime as dt
from pprint import pprint
from collect.dwr import cawdl

# surface water station example
result = cawdl.get_cawdl_continuous_data(
    'B05155',
    'Flow',
    'Daily_Mean',
    start=dt.datetime(2020, 1, 1),
    end=dt.datetime(2020, 2, 1)
)

print(result['info']['rating_tables'])
print(result['data'].head())

pprint(result)

# well example
result = cawdl.get_cawdl_continuous_data(
    '01N04E36Q001M',
    'Groundwater_Level_Below_Ground_Surface',
    'Daily_Mean',
    start=dt.datetime(2020, 1, 1),
    end=dt.datetime(2020, 2, 1)
)

print(result['info']['published'])
print(result['info']['period_of_record_archive'])
print(result['data'].head())

pprint(result)

narlesky added the bug, feeds, and ready for review labels on Dec 1, 2023
Successfully merging this pull request may close these issues.

CAWDL surface water bug