Add 'one_dir_freq' file read-in option #31
# data_in_files may hold absolute or relative paths
paths = []
for nc in data_in_files:
    full = '/'.join([data_in_direc, nc]).replace('//', '/')
The best way to do this is os.path.join(data_in_direc, nc)
(only recently learned that...you were probably just keeping consistent with the existing code. Do as I say, not as I do ;))
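To make the suggestion concrete, here is a minimal sketch of the loop rewritten around `os.path.join`. The helper name `build_paths` and the sample paths are hypothetical and not part of this PR; the behavior shown assumes a POSIX system.

```python
import os.path


def build_paths(data_in_direc, data_in_files):
    """Join each file name onto the data directory (hypothetical helper)."""
    # os.path.join inserts exactly one separator between components, so the
    # manual .replace('//', '/') cleanup is no longer needed.
    # If a file name is itself an absolute path, os.path.join discards the
    # directory and returns the absolute path unchanged.
    return [os.path.join(data_in_direc, nc) for nc in data_in_files]


# Sample inputs are made up for illustration.
paths = build_paths('/archive/run/', ['atmos.nc', '/abs/other.nc'])
```

One consequence worth noting: because absolute entries pass through untouched, a single call covers both the relative and absolute cases mentioned in the code comment.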
I have no problem with adding this functionality. But does it need to be its own method? Can the existing Better yet, can the
Sure thing -- I considered this, but wasn't sure if it would be worth it, since we are likely going to change this process down the road. Since it's quick and easy I'll look into doing this for now. Perhaps I should move this discussion to an issue (and maybe you've thought about something along these lines already), but here are some of my thoughts going forward: While currently within Calc the two methods accomplish the same tasks, in an abstract sense the current
I would argue that the explicit read-in method is the most general way of doing things. With enough information, one could automatically generate an explicit file map from an implicit generator. In addition, there is nothing that says you couldn't relax the current single-directory constraint and just map each variable (within a particular time frequency) to a full file path. To continue to support implicit read-in methods (for very structured output data, like Within a Run object one could then have a single argument for the file read-in method. The user could pass either the explicit dictionary mapping or an object that implements the FileMapGenerator interface. Within Calc, when reading in the files, you could have some simple logic along the lines of: "if an explicit map is provided, use the map; if not, use information about which variable you are looking for, and the interval in etc. and pass those as arguments to the generator, which would return an explicit map for just that variable." Using an interface would ensure that the generated explicit file map always has the same structure (so that it could be used seamlessly within Calc).
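The dispatch logic described above could be sketched roughly as follows. This is only an illustration of the idea, not code from this PR: apart from the `FileMapGenerator` name mentioned in the comment, every name here (`generate`, `OneDirGenerator`, `resolve_files`), the nested-dictionary layout, and the directory scheme are assumptions.

```python
from abc import ABC, abstractmethod
import os.path


class FileMapGenerator(ABC):
    """Hypothetical interface: builds an explicit file map on demand."""

    @abstractmethod
    def generate(self, var_name, intvl_in):
        """Return {intvl_in: {var_name: [full paths]}} for one variable."""


class OneDirGenerator(FileMapGenerator):
    """Assumed implicit layout: <direc>/<intvl_in>/<var_name>.nc"""

    def __init__(self, direc):
        self.direc = direc

    def generate(self, var_name, intvl_in):
        path = os.path.join(self.direc, intvl_in, var_name + '.nc')
        return {intvl_in: {var_name: [path]}}


def resolve_files(file_map, var_name, intvl_in):
    """Use an explicit map if given; otherwise ask the generator for one."""
    if isinstance(file_map, FileMapGenerator):
        # The generator returns an explicit map for just this variable,
        # in the same structure as a user-supplied dictionary.
        file_map = file_map.generate(var_name, intvl_in)
    return file_map[intvl_in][var_name]
```

Because both branches end in the same dictionary lookup, the calling code never needs to know whether the map was written by hand or generated.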
This should be ready to go now. You can now specify my dictionary example above under the
Thanks, Spencer! Looks great. And bonus points for # lines deleted > # lines added
I recognize that we ultimately want to refactor this process out of calc.py, but I needed something along these lines for dealing with idealized model output. The option I've added is called 'one_dir_freq'. It is a very slight modification of the existing 'one_dir' option; however, in this case I leverage the intvl_in attribute of Calc, much like the 'gfdl' option, to enable the user to specify a series of files with different output frequencies (e.g. monthly, daily, 3hr etc.).
Here's how one uses it within a Run object: