This repository has been archived by the owner on Aug 29, 2023. It is now read-only.

TODO forman

Norman Fomferra edited this page Oct 12, 2016 · 18 revisions

This page lists ECT ideas, tasks, and issues that come to mind regarding the upcoming ECT release. Some of them can already be found in the ECT Issue Tracker, while others are temporary tasks that don't need to be represented as publicly visible issues.

Prioritization:

  • B - blocker: no release without this
  • H - high: severe loss of functionality if not considered
  • M - medium: if not now, maybe next release
  • L - low: nice to have

API requirements

Open

  • H: Write Op development guide including descriptions of the various properties to be used
  • H: milestone v2: Model image outputs for CLI (write image) and WebAPI (start image tile service)
  • M: milestone v2: op: rename OpRegistration --> Operation
  • M: milestone v2: workflow: rename invoke() --> execute()
  • M: remove most of the now-unnecessary @op_input() calls, which slow down the import of ect.ops modules
  • M: io: find_reader/find_writer: if no file format is given, extract the file extension and find all formats registered for that extension, then continue as if the format had been given
  • M: op: skip operation of type Class. Rather, allow for registration of bound methods (instance functions)
  • M: op: derive additional operation header info: file, module, version, etc
  • L: op: support positional arguments
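
The extension-based lookup proposed for find_reader/find_writer could work roughly as sketched below. The registry contents and the function name find_formats_by_ext are hypothetical and only illustrate the idea; the actual ECT I/O registry may be organised differently.

```python
import os.path

# Hypothetical registry: format name -> file extensions it handles.
# This is an assumption for illustration, not the real ECT registry.
FORMAT_EXTENSIONS = {
    'NETCDF4': ['.nc', '.nc4'],
    'NETCDF3': ['.nc'],
    'JSON': ['.json'],
    'TEXT': ['.txt'],
}


def find_formats_by_ext(file_path, format_name=None):
    """Return candidate format names for file_path.

    If format_name is given, it is used directly (when registered).
    Otherwise all formats registered for the file's extension are returned.
    """
    if format_name is not None:
        return [format_name] if format_name in FORMAT_EXTENSIONS else []
    _, ext = os.path.splitext(file_path)
    ext = ext.lower()
    return [name for name, exts in FORMAT_EXTENSIONS.items() if ext in exts]
```

A caller would then try each candidate format in turn, exactly as if the user had named it explicitly.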

Done

  • B: Use NodePort.has_value when converting workflow to JSON (blocker because executed workspaces may include non-serializable values)
  • H: Ops: make sure all ops use "simple" values which can be converted from/to text or JSON. The @op_input decorators could make appropriate checks.
  • H: Ops: tag them all!
  • H: Ops: provide a "var" function, which will be important for workflows/workspaces: def variable(ds:Dataset, name:str) -> DataArray AND/OR introduce a common op argument "variable: Union[None, str, List[str]]" which will be used to select variables
  • H: Workflow: add __str__ method to all Nodes, so that command "ect ws status" can display nicely the workspace's workflow state
  • M: extract magic constants from code and identify those which may become ECT configuration settings, mark them by # {{ect-config}}
  • M: Op: replace input property 'required' (bool) by 'position' (int)
  • M: Op: perform same input validation for ops and workflows
  • M: rename Monitor.NULL to Monitor.NONE
  • M: Op: @op(aliases=[name1, ...]) --> new solution: use short name "XXX" for "ect.ops.YYY.XXX"
  • M: util/io/cli: we should harmonize/revise time range usage, e.g. use of TimeRange tuple, use of date and datetime instances
  • L: So far we don't have any exception types of our own (partly DONE: we have WorkflowError, CommandError now)
  • L: WorkspaceManager: use base directory to resolve relative paths
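
The common op argument "variable: Union[None, str, List[str]]" mentioned above could be resolved to a concrete list of names along the following lines. This is a minimal sketch: the helper name select_variable_names and the choice that None means "all variables" are assumptions, not ECT behaviour confirmed by this page.

```python
from typing import List, Sequence, Union

VarSelector = Union[None, str, List[str]]


def select_variable_names(available: Sequence[str],
                          variable: VarSelector = None) -> List[str]:
    """Resolve a 'variable' op argument to a list of variable names.

    Assumed semantics: None selects all available variables, a string
    selects one, a list selects several. Unknown names raise ValueError.
    """
    if variable is None:
        return list(available)
    names = [variable] if isinstance(variable, str) else list(variable)
    unknown = [n for n in names if n not in available]
    if unknown:
        raise ValueError('unknown variable(s): ' + ', '.join(unknown))
    return names
```

Because the selector only uses None, strings, and lists of strings, it stays within the "simple" values that convert cleanly from/to text or JSON.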

CLI requirements

Open

  • H: "ect ws run OP" --> once this works fine, we may consider having "ect res plot", "ect res print", "ect res write" simply delegate to "ect ws run OP" with OP=plot, print, write
  • H: "ect ds sync ..." --> should be able to configure ECT data root directory, by default it is '~/.ect/data_stores' (#41)
  • H: "ect res plot" should be able to also plot multi-variables, plots shall include images (or "ect res imshow")
  • H: "ect res rename N1 N2"
  • H: "ect op tags" shows all operation tags in use with their frequencies
  • H: "ect op add WORKFLOW" adds a new WORKFLOW to operation registry
  • H: "ect op rem WORKFLOW" removes a WORKFLOW from operation registry
  • M: allow to configure visible data stores
  • M: develop generic data stores, e.g. based on OPeNDAP, so users can add new data stores by configuring the generic ones
  • M: "ect help" would list all the sub-commands at once
  • M: "ect ds list" should also show timestamp of last remote metadata fetch
  • M: "ect ds list --fetch" forces fetching new up-to-date metadata (rather than using cache)
  • M: "ect ds list --stores" would list all configured data stores
  • M: "ect ds list --tag TAG" would list all data sources whose metadata fields textually contain TAG
  • M: Adapt progress bar length to terminal size, use shutil.get_terminal_size()
  • M: "ect ds del DS" to get rid of locally cached datasets
  • M: "ect ds info DS" should print sync status (temporal coverage required) incl. local data allocation (#14, #16)
  • M: "ect ds dashboard" --> ASCII art time coverage overview, cool!
  • M: CLI needs auto-completion for the many commands that require input of long, unwieldy names (data sources, operations, variables)
  • L: "ect res read NAME FILE [FORMAT] ..." --> Support reader-specific arguments (...)
  • L: "ect op register [--global|-g] WORKFLOW"
  • L: "ect ds list -q variable:temperature" --> use Lucene/Solr-like query syntax
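
For the item on adapting the progress bar length to the terminal size, a minimal sketch using shutil.get_terminal_size() might look like this. The function name render_progress_bar, the bar characters, and the layout are illustrative assumptions, not the existing ECT monitor code.

```python
import shutil


def render_progress_bar(fraction, label='', columns=None):
    """Return a one-line text progress bar sized to the terminal width."""
    if columns is None:
        # Falls back to 80 columns when the size cannot be determined,
        # e.g. when output is redirected to a file.
        columns = shutil.get_terminal_size(fallback=(80, 24)).columns
    fraction = min(max(fraction, 0.0), 1.0)
    prefix = label + ' ' if label else ''
    suffix = ' {:3d}%'.format(int(round(fraction * 100)))
    # Reserve room for brackets, prefix and percentage; keep at least 1 cell.
    bar_width = max(columns - len(prefix) - len(suffix) - 2, 1)
    filled = int(bar_width * fraction)
    return prefix + '[' + '#' * filled + '-' * (bar_width - filled) + ']' + suffix
```

Passing columns explicitly keeps the function testable without a real terminal; in the CLI it would be left as None so that the width is queried on every redraw.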

Done

  • B: "ect res open" should allow for accessing a local dataset comprising multiple files (#41)
  • B: when an ect command starts the service and then fails to execute the actual command, the service is still up and running --> auto close required, must detect service inactivity
  • H: service logging shall not occur on stdout when the service is called by anything other than ect-webapi
  • H: "ect ws list" should list all open workspaces & if they are modified/saved
  • H: "ect res plot" blocks the caller --> service response timeout
  • H: "ect res del"
  • H: "ect run" --> if return type of op is NoneType, we should not write anything to terminal
  • H: "ect res set" shall validate OP arguments #25
  • H: "ect res set" may overwrite existing res, if possible
  • H: bug: "ect op list --tag" still expects wildcard pattern
  • H: Syncing ds should occur automatically when DS[,START[,END]] is used
  • H: Print available format names, ideally detect input and/or output format (reader and/or writer), #17
  • H: "ect ws status [WS]" must print nicely all workflow steps (SoW requirement!)
  • H: "ect op info OP" must print op parameters and return values
  • H: Catch exceptions when calling into ECT API functions, print user-friendly error messages, add -e option to print stack traces
  • H: Print data sources (from ODP) and a data source's variable names
  • H: "ect run --read p=2010_precipitation.nc --write ts.nc ect.ops.timeseries.timeseries ds=p lat=53 lon=10"
  • H: Workspaces: "ect init" "ect read p 2010_precipitation.nc" "ect set ts ect.ops.timeseries.timeseries ds=p lat=53 lon=10" "ect write ts ts.nc"
  • M: Harmonize all error messages and their formats
  • M: Use common query syntax to search and explore things that can be listed
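
Commands such as "ect run ... ds=p lat=53 lon=10" pass operation arguments as NAME=VALUE pairs. One possible way to turn them into keyword arguments is sketched below. The function name parse_op_args and the use of JSON for value conversion are assumptions made for illustration; in the real CLI a value such as "p" would presumably also be resolved against the resources read via --read.

```python
import json


def parse_op_args(raw_args):
    """Parse ['ds=p', 'lat=53', 'lon=10'] into a dict of op keyword arguments.

    Values that are valid JSON (numbers, true/false, null, quoted strings)
    are converted; anything else is kept as a plain string.
    """
    kwargs = {}
    for raw_arg in raw_args:
        name, sep, value = raw_arg.partition('=')
        if not sep or not name:
            raise ValueError('expected NAME=VALUE, got: ' + repr(raw_arg))
        try:
            kwargs[name] = json.loads(value)
        except ValueError:
            # Not valid JSON (json.JSONDecodeError is a ValueError subclass).
            kwargs[name] = value
    return kwargs
```

This relies on ops accepting only "simple" values that can be converted from/to text or JSON, as required in the API section.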

WebAPI service requirements

Open

  • H: All "ect res" implementations in the WebAPI must execute asynchronously, therefore WebAPI requires a get_workspace_resource_state() which would be fed into a special Monitor (#51)

Done

  • H: service logging shall not occur on stdout when the service is called by anything other than ect-webapi
  • H: WebAPI shall write log file, currently it prints to the console
  • B: when the service process starts, its CWD remains the initial one even though the CLI's CWD changes, which results in different interpretations of "."
  • H: service exceptions shall be reported by tracebacks

Other

Open

  • M: automated system testing approach (all)
  • H: giant ops clean-up (Janis)
    • filter --> select, also make sure aux-variables are included in the output dataset (option?)
    • make sure all ops use "simple" values which can be easily converted from/to text or JSON
    • tag all operations
    • use variable names 'ds', 'ds_' for xr.Dataset arguments
    • make all unit-tests fast

Done

  • H: Create software installer
  • H: gridtools geom arguments (Norman, Janis)
  • H: 'sphinx_autodoc_annotation', so that Python 3 type annotations go into docs (DONE)
  • H: cli with workspaces (Marco + Norman)