AERONET parallel different dates #100
Started addressing this and discovered something weird: when we split a two-day request, we get less data.
import pandas as pd
from monetio import aeronet
# One request
t = pd.to_datetime(["2019/09/01", "2019/09/02", "2019/09/03"])
df1 = aeronet.add_data(t)
assert not df1.duplicated().any()
# Split
df2a = aeronet.add_data(t[:2])
df2b = aeronet.add_data(t[1:])
assert not df2a.duplicated().any()
assert not df2b.duplicated().any()
assert len(df1) > len(df2a) + len(df2b)
# Which rows are only in the one-request results?
df2 = pd.concat([df2a, df2b], ignore_index=True)
df_all = df1.merge(df2, on=["siteid", "time"], how="left", indicator=True)
assert df_all._merge.value_counts()["right_only"] == 0
df_ = df_all.query("_merge == 'left_only'")
print(len(df_), "rows unique to the single request version")
print(df_.time.min(), "...", df_.time.max())
print(sorted(df_.siteid.unique()))
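The comparison above needs network access to the AERONET web service. As a rough offline sketch of the same idea (a left merge with `indicator=True` used as an anti-join), the toy example below uses made-up data; the helper name `rows_only_in_left` and the `aod` column are hypothetical and not part of monetio.

```python
import pandas as pd


def rows_only_in_left(left, right, keys):
    """Return rows of `left` whose key combination never appears in `right`."""
    # Merge on the key columns only, deduplicated, so the merge neither
    # multiplies rows nor produces suffixed (_x/_y) value columns.
    right_keys = right[keys].drop_duplicates()
    merged = left.merge(right_keys, on=keys, how="left", indicator=True)
    return merged.loc[merged["_merge"] == "left_only"].drop(columns="_merge")


# Made-up data, for illustration only (not real AERONET output)
single = pd.DataFrame(
    {
        "siteid": ["A", "A", "B", "B"],
        "time": pd.to_datetime(
            [
                "2019-09-01 00:00",
                "2019-09-01 01:00",
                "2019-09-01 00:00",
                "2019-09-02 00:00",
            ]
        ),
        "aod": [0.1, 0.2, 0.3, 0.4],
    }
)
# Pretend the split requests returned only two of the four rows
split = single.iloc[[1, 3]].reset_index(drop=True)

missing = rows_only_in_left(single, split, ["siteid", "time"])
print(len(missing), "rows unique to the single request version")
print(sorted(missing.siteid.unique()))
```

Merging against the key columns alone keeps the result the same shape as the left frame, which makes the leftover rows easier to inspect.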
Corresponding URLs:
32302 - (15224 + 16415) = 663. This seems to indicate that the problem is with the web service, not the reader. But accounting for the header lines we would get 669, not 670, so this is not quite consistent with the reader results (though my line count may be off by one).
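The line-count arithmetic can be spelled out as a small check. This is only a sketch: the header length of 6 lines per response is an assumption inferred from 669 - 663 and is not stated in the thread.

```python
# Line counts of the raw web-service responses, as quoted above
single_lines = 32302
split_lines = (15224, 16415)

# Difference in total lines between one request and the two split requests
raw_diff = single_lines - sum(split_lines)

# ASSUMPTION: every response starts with the same number of header lines.
# The value 6 is inferred from 669 - 663; it is not given in the thread.
header_lines = 6

# The single response carries one header and the split responses carry two,
# so the gap in data rows is the raw gap plus one header's worth of lines.
data_diff = raw_diff + header_lines * (len(split_lines) - 1)

print(raw_diff, data_diff)
```

Under that assumption the expected gap in data rows is 669, one short of the 670 mentioned above.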
Ilya from AERONET has fixed the above issue. It seems that a few rows were missing from the beginning of each AERONET request result, but this is now fixed.
Currently you get an extra day of data if you use parallel instead of serial. I made a note about this some time ago:
monetio/monetio/obs/aeronet.py
Lines 154 to 155 in 9c81765
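One way to confirm which day is extra is to compare the calendar days covered by the two results. The sketch below uses made-up frames standing in for the serial and parallel outputs; the helper `extra_days` is hypothetical, and the only assumption about the real data is a datetime `time` column, as used in the merge earlier in this thread.

```python
import pandas as pd


def extra_days(df_a, df_b, time_col="time"):
    """Calendar days present in df_a[time_col] but absent from df_b[time_col]."""
    # normalize() floors each timestamp to midnight, i.e. its calendar day
    days_a = set(df_a[time_col].dt.normalize())
    days_b = set(df_b[time_col].dt.normalize())
    return sorted(days_a - days_b)


# Made-up stand-ins for the two results (not real AERONET output)
step = pd.Timedelta(hours=12)
serial = pd.DataFrame(
    {"time": pd.date_range("2019-09-01", "2019-09-02 12:00", freq=step)}
)
parallel = pd.DataFrame(
    {"time": pd.date_range("2019-09-01", "2019-09-03 12:00", freq=step)}
)

print(extra_days(parallel, serial))
```

Running the same helper with the arguments swapped shows whether the serial result has any days that the parallel result lacks.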