AERONET parallel different dates #100

Closed
zmoon opened this issue Feb 13, 2023 · 2 comments
Labels: bug (Something isn't working), in-develop (Addressed/fixed/resolved in `develop` branch)

Comments

@zmoon
Member

zmoon commented Feb 13, 2023

Currently, you get an extra day of data if you use the parallel reader instead of the serial one. I made a note about this some time ago:

days = pd.date_range(start=min_date, end=max_date, freq="D") # TODO: subtract 1?
days1 = days + pd.Timedelta(days=1)
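
To illustrate where the extra day can come from, here is a minimal stdlib-only sketch that mirrors the two lines above. It assumes each `(days[i], days1[i])` pair is used as one request window, which the snippet itself does not confirm. The `daily_windows` helper and its `subtract_one` flag are hypothetical names for illustration; `subtract_one` only mimics the "subtract 1?" TODO and is not necessarily the fix that was adopted.

```python
from datetime import date, timedelta


def daily_windows(min_date, max_date, subtract_one=False):
    """Return (start, end) one-day windows, mirroring `days` and `days1`."""
    # Inclusive of max_date, like pd.date_range(start, end, freq="D").
    n_days = (max_date - min_date).days + 1
    if subtract_one:
        # Hypothetical fix: drop the window that starts on max_date.
        n_days -= 1
    days = [min_date + timedelta(days=i) for i in range(n_days)]
    return [(d, d + timedelta(days=1)) for d in days]


windows = daily_windows(date(2019, 9, 1), date(2019, 9, 3))
print(windows[-1][1])  # 2019-09-04, one day past the requested max_date
```

With `subtract_one=True`, the last window ends on `max_date` instead, so the windows cover the same span as a single serial request from `min_date` to `max_date`.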

zmoon added the bug label Feb 13, 2023
zmoon referenced this issue in zmoon/MELODIES-MONET Feb 13, 2023
zmoon self-assigned this Feb 14, 2023
@zmoon
Member Author

zmoon commented Jul 11, 2023

Started addressing this and discovered something weird: splitting a two-day request into two one-day requests returns less data.

import pandas as pd

from monetio import aeronet

# One request
t = pd.to_datetime(["2019/09/01", "2019/09/02", "2019/09/03"])
df1 = aeronet.add_data(t)
assert not df1.duplicated().any()

# Split
df2a = aeronet.add_data(t[:2])
df2b = aeronet.add_data(t[1:])
assert not df2a.duplicated().any()
assert not df2b.duplicated().any()

assert len(df1) > len(df2a) + len(df2b)

# Which rows are only in the one-request results?
df2 = pd.concat([df2a, df2b], ignore_index=True)
df_all = df1.merge(df2, on=["siteid", "time"], how="left", indicator=True)
assert df_all._merge.value_counts()["right_only"] == 0
df_ = df_all.query("_merge == 'left_only'")
print(len(df_), "rows unique to the single request version")
print(df_.time.min(), "...", df_.time.max())
print(sorted(df_.siteid.unique()))

Output:

Reading Aeronet Data...
Reading Aeronet Data...
Reading Aeronet Data...
670 rows unique to the single request version
2019-09-02 00:00:10 ... 2019-09-02 03:47:57
['ARM_SGP', 'Bakersfield', 'CalTech', 'Cascade_Airport', 'Cliff_Creek_1', 'Cliff_Creek_2', 'Cliff_Creek_3', 'Cliff_Creek_4', 'Cliff_Creek_5', 'Cliff_Creek_6', 'Fort_McMurray', 'Fresno_2', 'Grizzly_Bay', 'Kelowna_UAS', 'Kluane_Lake', 'MAXAR_FUTON', 'McCall_AB_Standard', 'McCall_Dragon_1', 'McCall_Dragon_3', 'McCall_Dragon_4', 'McCall_Dragon_5', 'McCall_Dragon_6', 'McCall_Dragon_8', 'Meridian_DEQ', 'Missoula', 'Missoula_Health_Dpt', 'Missoula_Pt_Six', 'Missoula_Waterworks', 'Monterey', 'NASA_Ames', 'NEON_BONA', 'NEON_CLBJ', 'NEON_CVALLA', 'NEON_HEAL', 'NEON_MOAB', 'NEON_NIWO', 'NEON_OAES', 'NEON_ONAQ', 'NEON_SJER', 'NEON_Sterling', 'NEON_TOOL', 'NEON_WOOD', 'NEON_WREF', 'NEON_YELL', 'PNNL', 'Pinehurst_Idaho', 'Railroad_Valley2', 'Red_Mountain_Pass', 'Rexburg_Idaho', 'Rimrock', 'SDSU_IPLab', 'Saturn_Island', 'TABLE_MOUNTAIN_CA', 'Taylor_Ranch_TWRS', 'UACJ_UNAM_ORS', 'Univ_of_Houston', 'Univ_of_Lethbridge', 'Univ_of_Nevada-Reno', 'White_Sands_HELSTF']

Corresponding URLs:

Counting lines in the responses: 32302 - (15224 + 16415) = 663. This suggests that the problem is with the web service, not the reader. However, accounting for the header lines, we would get 669, not 670, which is not quite consistent with the reader results (though maybe my line count is off by one).
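
For reference, the arithmetic on the line counts above can be spelled out as a short sketch. It assumes every response starts with the same number of header lines, so the two split responses together carry one more header than the single response. The header length of 6 is only what the quoted 669 implies; it has not been checked against the web service here.

```python
# Line counts quoted above (single request vs. the two split requests).
single_request_lines = 32302
split_request_lines = [15224, 16415]

# Row count the reader reported as unique to the single request.
reader_only_rows = 670

line_diff = single_request_lines - sum(split_request_lines)

# Assumption: each response has the same header, so the split version has
# one extra header. The value 6 is inferred from the quoted 669 figure only.
header_lines = 6
data_row_diff = line_diff + header_lines

print(line_diff)                         # 663
print(data_row_diff)                     # 669
print(reader_only_rows - data_row_diff)  # 1
```

The remaining difference of 1 is the discrepancy noted above between the line-count estimate and the reader result.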

@zmoon
Member Author

zmoon commented Sep 20, 2023

Ilya from AERONET has fixed the above issue. It seems that a few rows were missing from the beginning of each AERONET request result, but this is now fixed.

zmoon added the in-develop label Oct 3, 2023
zmoon closed this as completed Oct 27, 2023