This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

Plotting Options

Norman Fomferra edited this page Jun 14, 2017 · 1 revision

The content of this page may be outdated. We have implemented Option A. We may implement Option C later, for interactive data exploration solely through the GUI.


Here are some thoughts on how to integrate 2D/3D charts (time-series, scatter, density, correlation) in the Cate Desktop front-end.

General Implementation Concepts

Option A: Invoking plotting operations in interactive mode

Users invoke plotting operations offered by the Cate Python API from the front-end's OPERATIONS panel. The charts are generated in interactive mode, i.e. a Python matplotlib call will open an independent window (which scientists using Python know very well). In addition, all Cate plotting operations have a file argument so that the produced figure can be saved in an image file. Plot modifications have to be done by modifying the operation step's parameters and re-executing the workflow step.
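A minimal sketch of what such a plotting operation could look like, assuming matplotlib's pyplot interface. The function name `plot_line` and the `show` parameter are hypothetical and only serve the illustration; the `file` argument mirrors the one described above.

```python
import matplotlib.pyplot as plt


def plot_line(x, y, title="", file=None, show=True):
    """Hypothetical plotting operation: draw a line plot, optionally save it to a file."""
    fig, ax = plt.subplots()
    ax.plot(x, y)
    ax.set_title(title)
    if file is not None:
        # Save the figure; the image format is inferred from the file extension.
        fig.savefig(file)
    if show:
        # With a GUI backend this typically opens an independent window
        # and does not return until the user closes it.
        plt.show()
    plt.close(fig)
    return file
```

Changing the plot then means calling the operation again with different parameters, which corresponds to re-executing the workflow step.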

Advantages:

  1. No special front-end coding required. So this is the cheapest option possible.
  2. It is very easy to implement new plots: just add a new Python operation.
  3. Charting is done through the Python API. GUI, CLI and API will therefore work in exactly the same way.
  4. Chart generation is part of the workflow and therefore of the workspace. As such, the output is reproducible by just opening the workspace and executing its workflow.
  5. We may later add functionality to export the workflow to a Python script. Chart generation scripts could be produced that way.

Disadvantages:

  1. While an interactive plot window is active, the back-end call is blocked and cannot return. Users have to explicitly close the plot window.
  2. Such open plot windows cannot be controlled by the front-end because they are started from the Python back-end process, and there seems to be no API for managing such external plot windows.
  3. The interactive plot window UIs opened by Python charting libraries such as matplotlib usually look very old-fashioned and do not harmonize well with the fresh look and feel of Cate Desktop.

Option B: Invoking plotting operations in non-interactive mode

Same as option A, but the plots are generated in non-interactive mode. In addition, the outputs of plotting operations could be special resources (image files / image streams) created from the figures of, e.g., matplotlib. The Cate Desktop front-end displays them as non-functional, static image views. Any plot modifications have to be done by modifying the operation step's parameters and re-executing the workflow step.
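As a rough sketch of the image-stream idea, a figure can be rendered without any window by attaching matplotlib's Agg canvas and writing PNG data into a memory buffer. The function name `plot_time_series_png` and its parameters are assumptions made for illustration, not part of the Cate API.

```python
import io

from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure


def plot_time_series_png(times, values, title="", dpi=100):
    """Hypothetical operation: render a line plot off-screen and return PNG bytes."""
    fig = Figure(figsize=(6, 3), dpi=dpi)
    FigureCanvasAgg(fig)  # attach a non-interactive raster canvas; no window opens
    ax = fig.add_subplot(1, 1, 1)
    ax.plot(times, values)
    ax.set_title(title)
    buffer = io.BytesIO()
    fig.savefig(buffer, format="png")
    return buffer.getvalue()
```

The returned bytes could be written to an image file or handed to the front-end as a static image resource.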

Advantages:

  • All advantages of option A except 1., because we need to implement the static image views.

Disadvantages:

  1. Static image output, no user interaction with the figure. This might not be what users expect from a modern scientific GUI, especially considering what they are used to from today's web interfaces.
  2. To change a plot, users must invoke the operation once more, which might not be very obvious or user-friendly.

Option C: Interactive plotting in the front-end, equivalent plotting operations in the API

The front-end provides integrated, highly interactive charts (enabled by JavaScript packages such as D3.js and plotly.js). Data sources for the plots are selected variables (xr.DataArray, gpd.GeoSeries, or pd.Series CDM objects). A cate-webapi REST call allows Cate Desktop to get data subsets of the variables to be plotted in the front-end. This is similar to the concept we use for generating image layers displayed on the 2D map / 3D globe for selected variables (xr.DataArray CDM objects).
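To illustrate the data-subset idea, the following sketch reduces a series to a bounded number of points before serializing it to JSON for the front-end. The function name `subset_for_plot`, the payload keys, and the simple striding strategy are all assumptions; an actual cate-webapi call may subset data quite differently.

```python
import json


def subset_for_plot(times, values, max_points=500):
    """Return at most `max_points` evenly strided samples as a JSON string."""
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    # Ceiling division gives the smallest stride that keeps the sample count
    # at or below max_points.
    step = max(1, -(-len(values) // max_points))
    payload = {
        "x": list(times[::step]),
        "y": list(values[::step]),
        "step": step,           # stride that was applied
        "total": len(values),   # size of the full series
    }
    return json.dumps(payload)
```

Sending the stride and total size along with the samples would let the front-end indicate that it shows a reduced view of the data.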

In addition, Cate's Python API provides equivalent plotting operations so that plot generation can be part of any workflow and can be invoked from the CLI. For example, if we decide to use plotly.js for the front-end, we can use the Plotly Python Library in our Cate Python API so that the interactive charts used in the front-end will look similar to the static plots produced by the Cate Python API and CLI. Another example is Bokeh, a Python library which generates interactive plots using HTML/JavaScript.
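One way to keep front-end and API plots aligned would be to describe a chart once as plain data in the `data`/`layout` shape that plotly.js figures use. The sketch below builds such a description with the standard library only; the function name `time_series_figure` and the choice of a shared figure dictionary are assumptions, not a decided design.

```python
import json


def time_series_figure(times, values, name, y_label=""):
    """Build a figure description in the data/layout shape used by plotly.js."""
    return {
        "data": [
            {
                "type": "scatter",  # line charts are scatter traces in plotly
                "mode": "lines",
                "x": list(times),
                "y": list(values),
                "name": name,
            }
        ],
        "layout": {
            "title": {"text": name},
            "xaxis": {"title": {"text": "time"}},
            "yaxis": {"title": {"text": y_label}},
        },
    }


# The description is JSON-serializable, so it can be sent to the front-end as is.
figure_json = json.dumps(time_series_figure(["2000-01-01", "2000-02-01"], [0.1, 0.2], "sst"))
```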

Advantages:

  1. Users get integrated, interactive, high-quality graphics (which they may and can expect)
  2. Chart rendering is done by web technology. The available JavaScript packages allow for high-quality graphics by exploiting SVG, WebGL, and GPU processing (3D plots!)
  3. Chart data is also available in the front-end, which allows for fast additional operations (interactive display of values, copy to clipboard)

Disadvantages:

  1. Chart generation is mainly the result of invoking some front-end action rather than being the result of executing a workflow step (which must be an invocation of a Cate back-end operation). The advantages of having the charts in the workflow are described above.
  2. If we want to offer chart persistence to users, an additional mechanism would have to be developed to store such front-end generated charts in the workspace. This is similar to what we (will) do with the image layers displayed on the 2D map and 3D globe views.

Display of Charts

How and where do we display the images of option B and the charts of option C in the front-end?

Options:

  • Open new plot windows independent of the front-end's main frame window
  • Views docked into the front-end's main frame window, next to the 2D map/3D globe views. This option also allows for laying out multiple windows, e.g. 1x1, 2x1, 1x2, 2x2, 2x3, 3x2.

Other Remarks

  • It should be possible to open a time-series plot from a context menu that opens by right-clicking a point on the 2D map / 3D globe. (from Kevin)
  • It should be possible to add/remove variables to/from existing charts if the chart type allows for it (e.g. multi-variate time series). It may then also be good if users can create empty plot windows to which they can add data sources. Is this for option C only?

Python Charting Libraries

JS Charting Libraries

Other links

Library Selection

matplotlib

Pro:

  • de-facto charting library standard in Python
  • already included in Cate distribution

Contra:

  • average plot quality
  • high effort to style plots so they look good
  • to allow interactive mode in the Cate front-end, we must use an approach such as the FigureManagerWebAgg matplotlib backend, which is extra effort

seaborn

Pro:

  • adds high-quality charts on top of matplotlib
  • very clear API for diverse scientific chart types which all look publication ready
  • small package which adds no extra dependencies

Contra:

  • to allow interactive mode in the Cate front-end, we must use an approach such as the FigureManagerWebAgg matplotlib backend, which is extra effort

bokeh

Pro:

  • uses same software design as Cate: Python back-end + JS front-end
  • already included in Cate distribution (as transitive dependency - why?)
  • client side rendering: after all data is loaded into the front-end (slow?), chart updates and rendering are fast and responsive
  • user-friendly interactive mode both in generated HTML pages and in IPython notebooks

Contra:

  • can only generate HTML pages; no image export in batch mode, only from the front-end UI. This makes bokeh less suitable for workflow integration
  • client side rendering: all data to be plotted must be loaded into front-end which may be slow and require additional memory

plotly

Pro:

  • High quality graphics
  • user-friendly interactive mode by plotly.js
  • same plots in front-end JavaScript and back-end Python
  • large number of HQ chart types to choose from

Contra:

  • Not entirely free, must use API access keys
  • Not really open source: a single company develops the library