Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLN: move plotting funcs to pd.plotting #13579

Closed
wants to merge 5 commits into from

Conversation

sinhrks
Copy link
Member

@sinhrks sinhrks commented Jul 7, 2016

Rebased in #16005


cleanup tools/plotting as plotting subpackage. Because the differences are being huge, would like to split 3 part if base direction is OK:

1. Move plotting related tests under pandas/tests/plotting splitting to Series, DataFrame, etc. (0.18.2) (#13621)
2. Create pd.plotting and move related files (0.20.0)
3. Split / refactor pd.plotting to use small utility functions / property control classes. Add tests. (0.20.0)

@sinhrks sinhrks added this to the 0.19.0 milestone Jul 7, 2016
@codecov-io
Copy link

codecov-io commented Jul 7, 2016

Current coverage is 85.68% (diff: 69.35%)

Merging #13579 into master will increase coverage by <.01%

@@             master     #13579   diff @@
==========================================
  Files           145        154     +9   
  Lines         51419      51464    +45   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          44057      44098    +41   
- Misses         7362       7366     +4   
  Partials          0          0          

Powered by Codecov. Last update 97fd744...1c1efe1

@jorisvandenbossche
Copy link
Member

@sinhrks Generally sounds like a good plan!

@TomAugspurger
Copy link
Contributor

+1 overall. @sinhrks will this break old code? Does pandas.tools.plotting just point to pandas.plotting? I looked in the diff, but could have missed it.

try:
import matplotlib as mpl
return (str(mpl.__version__) <= LooseVersion('1.2.1') and
str(mpl.__version__)[0] != '0')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to add these checks for '0' when matplotlib's versioneer was broken for 1.5. I believe it's fixed now. Do you want to remove them in here, or a separate PR?

@sinhrks sinhrks force-pushed the plot_api branch 2 times, most recently from 477827a to e184485 Compare July 22, 2016 12:38
@jorisvandenbossche
Copy link
Member

@sinhrks Can you rebase this? Further it is mainly docs that is still needed?

@jreback jreback removed this from the 0.19.0 milestone Aug 26, 2016
@jreback
Copy link
Contributor

jreback commented Aug 26, 2016

@sinhrks as usual, certainly can include in 0.19.0, just trying to clear the decks.

@sinhrks sinhrks force-pushed the plot_api branch 5 times, most recently from 874bc5a to b2eebed Compare September 4, 2016 03:23
@jreback
Copy link
Contributor

jreback commented Nov 16, 2016

@sinhrks can you update

@jorisvandenbossche
Copy link
Member

@sinhrks do you time to update this?

@jorisvandenbossche
Copy link
Member

@sinhrks Do you have time to update this? I still think this is a useful refactor. I can also do a rebase to get it in.

@sinhrks
Copy link
Member Author

sinhrks commented Jan 8, 2017

Once updated to the latest master. Let me add release note later.

subplot.format_coord = lambda t, y: (
"t = {0} y = {1:8f}".format(Period(ordinal=int(t), freq=freq), y))

pylab.draw_if_interactive()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason you left this file? (for some sort of compat)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or I see, you may have left all of the pandas/tseries namespace as is (and just moved things), ok then.

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks!

And sorry, I merged a PR on the converters. I had better waited to merge that one until this is merged, as you now have to rebase again (for the converters.py file)

Some other things:

  • I would personally move tests/plotting also into this new module (so plotting/tests)
  • Should we raise a deprecation warning on the old namespaces? So we can remove them sometime.

@@ -0,0 +1,20 @@
"""
Plotting api
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is a bit the style in other pandas modules as well, but can't we just put this in __init__.py ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is cleaner here actually. And it is only imported explicity (unlike __init__ which is implicity imported)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean with implicitly imported?

The difference with the other modules where we use this pattern, is that, eg in pandas.io, pandas.io.api are those functions that are imported top-level (imported in pandas/__init__.py), not what makes up the pandas.io namespace itself. Here, we don't have a collection of functions (put in api.py) which we want to have top level, but we just have the pandas.io namespace. So IMO it makes more sense and is cleaner to have it just in the init file here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what I mean is, unless you explicity reference these imports, like from pandas.plotting import .....) you would by-default actually import anything.

thoughI guess for back-compat this is needed.

IOW, the imports for things like matplotlib would only be done when the pandas.plotting namespace is used (IOW, you try to plot something), so the import is done lazily later, rather than on startup now (to register converters and such).

# flake8: noqa

try: # mpl optional
from pandas.plotting import converter as conv
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would leave out the conv (now both conv and converter in pandas.plotting namespace)

andrews_curves, bootstrap_plot,
parallel_coordinates, lag_plot,
autocorrelation_plot)
from pandas.plotting.plotting import (boxplot, scatter_plot, grouped_hist,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not the most important :-), but the plotting.plotting looks a bit silly. Maybe we can call this file base or core to make it more clear?

I am also not sure if grouped_hist, hist_frame, hist_series should be exposed in the public pandas.plotting namespace?

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some comments on the deprecation.

Wondering, should we deprecate the top-level pd.scatter_matrix as well? (pointing to pd.plotting.scatter_matrix)

def outer(t=t):

def wrapper(*args, **kwargs):
warnings.warn("pandas.tools.plotting{t} is deprecated. "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plotting{t} -> plotting.{t}

warnings.warn("pandas.tools.plotting{t} is deprecated. "
"import from the "
"pandas.plotting.{t} instead".format(t=t),
UserWarning, stacklevel=3)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make this a FutureWarning instead of UserWarning (I think FutureWarning is OK here, as most of those functions will be used by users directly)

def wrapper(*args, **kwargs):
warnings.warn("pandas.tools.plotting{t} is deprecated. "
"import from the "
"pandas.plotting.{t} instead".format(t=t),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the "from the" can be left out.

I would do either:
"pandas.tools.plotting.scatter_matrix is deprecated, import pandas.plotting.scatter_matrix instead"
or
"pandas.tools.plotting.scatter_matrix is deprecated, import from pandas.plotting instead"

@sinhrks
Copy link
Member Author

sinhrks commented Jan 21, 2017

Thx, updated based on the comments.

  • tests are moved to pandas/plotting/tests.
  • pd.scatter_matrix has been deprecated, in favor of pd.plotting.scatter_matrix.
  • added tests to check warnings are properly raised.

@sinhrks sinhrks changed the title (WIP) CLN: move plotting funcs to pd.plotting CLN: move plotting funcs to pd.plotting Jan 21, 2017
top namespace ``pandas.plotting``. All the public plotting functions should be available
from ``pandas.plotting``.

Also, ``scatter_matrix`` function imported directly under ``pandas`` namespace is also deprecated.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pandas.scatter_matrix

Deprecate .plotting
^^^^^^^^^^^^^^^^^^^

``pandas.tools.plotting`` module has been deprecated, moving directory under the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pandas.tools.plotting module has been deprecated, in favor of the top level namespace pandas.plotting

Previous script:

.. code-block:: python

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add import pandas as pd

from pandas.tools.plotting import scatter_matrix, plot_params

# deprecate tools.plotting, and scatter_matrix on the top namespace
import pandas.tools.plotting
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can actually deprecate pandas.tools.plotting with pandas.util.deprecate_module

import pandas.tools.plotting
from pandas.plotting import plot_params
# do not import deprecate to top namespace
scatter_matrix = pandas.util.decorators.deprecate(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason NOT to remove plot_parms as well from top-level?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on moving plot_params as well

@jreback
Copy link
Contributor

jreback commented Jan 21, 2017

really just some minor comments. ok with merging, then fixing after if it is easier (one test failure on locale on travis, not sure why that is).

parallel_coordinates, lag_plot,
autocorrelation_plot)
from pandas.plotting.core import (boxplot, scatter_plot, grouped_hist,
hist_frame, hist_series)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I mentioned this before, but is there use in exposing grouped_hist, hist_frame, hist_series as public functions?
There are also not documented I think

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation argument is actually also true for boxplot and scatter_plot, and for both the functionality is also available in DataFrame.plot.scatter or DataFrame.boxplot/plot.box

@jreback
Copy link
Contributor

jreback commented Mar 20, 2017

@sinhrks seems really close...can you rebase

@jorisvandenbossche
Copy link
Member

I rebased this one, but pushed to a new PR: #16005

(this PR predates from when you can push to PR's branch)

@jorisvandenbossche jorisvandenbossche added this to the No action milestone Apr 14, 2017
@@ -823,7 +823,7 @@ before plotting.
Plotting Tools
--------------

These functions can be imported from ``pandas.tools.plotting``
These functions can be imported from ``pandas.plotting``
and take a :class:`Series` or :class:`DataFrame` as an argument.

.. _visualization.scatter_matrix:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from pandas.tools.plotting import scatter_matrix
from matplotlib import cm
feature_names = ['mass', 'width', 'height', 'color_score']
X = fruits[feature_names]
y = fruits['fruit_label']
cmap = cm.get_cmap('gnuplot')
scatter = pd.scatter_matrix(X, c = y, marker = 'o', s=40, hist_kwds={'bins':15}, figsize=(9,9), cmap = cmap)
plt.suptitle('Scatter-matrix for each input variable')
plt.savefig('fruits_scatter_matrix')

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CLN: Cleanup plotting submodule
6 participants