Merge remote-tracking branch 'upstream/master' into set_accessor
h-vetinari committed Jun 19, 2018
2 parents b46611f + 5fbb683 commit 64b4bd0
Showing 165 changed files with 5,271 additions and 1,461 deletions.
1 change: 1 addition & 0 deletions Makefile
@@ -23,3 +23,4 @@ doc:
 	cd doc; \
 	python make.py clean; \
 	python make.py html
+	python make.py spellcheck
13 changes: 13 additions & 0 deletions asv_bench/benchmarks/categoricals.py
@@ -193,3 +193,16 @@ def time_categorical_series_is_monotonic_increasing(self):

     def time_categorical_series_is_monotonic_decreasing(self):
         self.s.is_monotonic_decreasing
+
+
+class Contains(object):
+
+    goal_time = 0.2
+
+    def setup(self):
+        N = 10**5
+        self.ci = tm.makeCategoricalIndex(N)
+        self.cat = self.ci.categories[0]
+
+    def time_contains(self):
+        self.cat in self.ci
21 changes: 20 additions & 1 deletion asv_bench/benchmarks/groupby.py
@@ -5,7 +5,7 @@

 import numpy as np
 from pandas import (DataFrame, Series, MultiIndex, date_range, period_range,
-                    TimeGrouper, Categorical)
+                    TimeGrouper, Categorical, Timestamp)
 import pandas.util.testing as tm

 from .pandas_vb_common import setup # noqa
@@ -385,6 +385,25 @@ def time_dtype_as_field(self, dtype, method, application):
         self.as_field_method()


+class RankWithTies(object):
+    # GH 21237
+    goal_time = 0.2
+    param_names = ['dtype', 'tie_method']
+    params = [['float64', 'float32', 'int64', 'datetime64'],
+              ['first', 'average', 'dense', 'min', 'max']]
+
+    def setup(self, dtype, tie_method):
+        N = 10**4
+        if dtype == 'datetime64':
+            data = np.array([Timestamp("2011/01/01")] * N, dtype=dtype)
+        else:
+            data = np.array([1] * N, dtype=dtype)
+        self.df = DataFrame({'values': data, 'key': ['foo'] * N})
+
+    def time_rank_ties(self, dtype, tie_method):
+        self.df.groupby('key').rank(method=tie_method)
+
+
 class Float32(object):
     # GH 13335
     goal_time = 0.2
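
For readers unfamiliar with the five `tie_method` values that `RankWithTies` sweeps, the following is a rough pure-Python sketch of how the tie-breaking rules differ. It only illustrates the ranking semantics for ascending order; the helper `rank_with_ties` is hypothetical and is not how pandas implements `groupby().rank()`.

```python
def rank_with_ties(values, method="average"):
    """Rank values ascending, 1-based (hypothetical helper, not pandas code)."""
    # Stable sort: equal values keep their order of appearance.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    dense = 0
    pos = 0
    while pos < len(order):
        # Find the run of equal values starting at sorted position `pos`.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        dense += 1
        for offset, idx in enumerate(order[pos:end + 1]):
            if method == "first":
                ranks[idx] = pos + offset + 1         # order of appearance
            elif method == "min":
                ranks[idx] = pos + 1                  # lowest rank in the tie run
            elif method == "max":
                ranks[idx] = end + 1                  # highest rank in the tie run
            elif method == "dense":
                ranks[idx] = dense                    # no gaps between runs
            elif method == "average":
                ranks[idx] = (pos + 1 + end + 1) / 2  # mean of min and max
            else:
                raise ValueError("unknown method: {}".format(method))
        pos = end + 1
    return ranks


# All values equal, as in the benchmark data: the whole column is one tie run.
print(rank_with_ties([1, 1, 1, 1], "average"))  # [2.5, 2.5, 2.5, 2.5]
```

The benchmark fills its column with a single repeated value, so every row falls into one tie run; that is the case this sketch handles in the inner `while` loop.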
1 change: 1 addition & 0 deletions ci/environment-dev.yaml
@@ -13,3 +13,4 @@ dependencies:
   - pytz
   - setuptools>=24.2.0
   - sphinx
+  - sphinxcontrib-spelling
1 change: 1 addition & 0 deletions ci/requirements_dev.txt
@@ -9,3 +9,4 @@ python-dateutil>=2.5.0
 pytz
 setuptools>=24.2.0
 sphinx
+sphinxcontrib-spelling
17 changes: 15 additions & 2 deletions doc/make.py
@@ -224,8 +224,9 @@ def _sphinx_build(self, kind):
         --------
         >>> DocBuilder(num_jobs=4)._sphinx_build('html')
         """
-        if kind not in ('html', 'latex'):
-            raise ValueError('kind must be html or latex, not {}'.format(kind))
+        if kind not in ('html', 'latex', 'spelling'):
+            raise ValueError('kind must be html, latex or '
+                             'spelling, not {}'.format(kind))

         self._run_os('sphinx-build',
                      '-j{}'.format(self.num_jobs),
@@ -304,6 +305,18 @@ def zip_html(self):
                      '-q',
                      *fnames)

+    def spellcheck(self):
+        """Spell check the documentation."""
+        self._sphinx_build('spelling')
+        output_location = os.path.join('build', 'spelling', 'output.txt')
+        with open(output_location) as output:
+            lines = output.readlines()
+            if lines:
+                raise SyntaxError(
+                    'Found misspelled words.'
+                    ' Check pandas/doc/build/spelling/output.txt'
+                    ' for more details.')
+

 def main():
     cmds = [method for method in dir(DocBuilder) if not method.startswith('_')]
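
The post-build check in `spellcheck` boils down to "fail if the report file has any lines". A minimal stand-alone sketch of that idea follows. The function name `check_spelling_report`, the temporary file names, and the choice of `RuntimeError` are assumptions made for illustration (the commit itself raises `SyntaxError`), and no Sphinx build is run here.

```python
import os
import tempfile


def check_spelling_report(path):
    """Raise if the report at `path` lists any words (hypothetical helper)."""
    with open(path) as report:
        # Unlike the method in the diff, whitespace-only lines are ignored here.
        lines = [line for line in report if line.strip()]
    if lines:
        raise RuntimeError(
            'Found {} misspelled word(s); see {}'.format(len(lines), path))
    return True


# Usage: an empty report passes, a non-empty one fails.
tmp_dir = tempfile.mkdtemp()
clean = os.path.join(tmp_dir, 'clean.txt')
dirty = os.path.join(tmp_dir, 'dirty.txt')
open(clean, 'w').close()
with open(dirty, 'w') as fh:
    fh.write('behaviour\n')  # made-up report line, not real Sphinx output

check_spelling_report(clean)
try:
    check_spelling_report(dirty)
except RuntimeError as err:
    print(err)
```

Raising a general-purpose exception rather than `SyntaxError` is a design choice of this sketch only; either way the non-zero exit is what lets a CI job notice the misspellings.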
Binary file added doc/source/_static/favicon.ico
Binary file not shown.
4 changes: 2 additions & 2 deletions doc/source/advanced.rst
@@ -342,7 +342,7 @@ As usual, **both sides** of the slicers are included as this is label indexing.
 columns=micolumns).sort_index().sort_index(axis=1)
 dfmi
-Basic multi-index slicing using slices, lists, and labels.
+Basic MultiIndex slicing using slices, lists, and labels.

 .. ipython:: python
@@ -1039,7 +1039,7 @@ On the other hand, if the index is not monotonic, then both slice bounds must be
 KeyError: 'Cannot get right slice bound for non-unique label: 3'
 :meth:`Index.is_monotonic_increasing` and :meth:`Index.is_monotonic_decreasing` only check that
-an index is weakly monotonic. To check for strict montonicity, you can combine one of those with
+an index is weakly monotonic. To check for strict monotonicity, you can combine one of those with
 :meth:`Index.is_unique`
 .. ipython:: python
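
Why combining the two checks works: a weakly increasing sequence can only fail to be strictly increasing by repeating a value, so ruling out duplicates is enough. A small pure-Python sketch of that reasoning follows; the helper names are hypothetical, and with pandas the equivalent expression would be `idx.is_monotonic_increasing and idx.is_unique`.

```python
def is_weakly_increasing(values):
    """True if each element is >= the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


def is_unique(values):
    """True if no value occurs more than once."""
    return len(set(values)) == len(values)


def is_strictly_increasing(values):
    """Weakly increasing plus no duplicates implies strictly increasing."""
    return is_weakly_increasing(values) and is_unique(values)


print(is_weakly_increasing([1, 2, 2, 3]))    # True
print(is_strictly_increasing([1, 2, 2, 3]))  # False
print(is_strictly_increasing([1, 2, 3]))     # True
```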
16 changes: 8 additions & 8 deletions doc/source/basics.rst
@@ -168,7 +168,7 @@ either match on the *index* or *columns* via the **axis** keyword:
 df_orig = df
-Furthermore you can align a level of a multi-indexed DataFrame with a Series.
+Furthermore you can align a level of a MultiIndexed DataFrame with a Series.

 .. ipython:: python
@@ -593,7 +593,7 @@ categorical columns:
 frame = pd.DataFrame({'a': ['Yes', 'Yes', 'No', 'No'], 'b': range(4)})
 frame.describe()
-This behaviour can be controlled by providing a list of types as ``include``/``exclude``
+This behavior can be controlled by providing a list of types as ``include``/``exclude``
 arguments. The special value ``all`` can also be used:

 .. ipython:: python
@@ -1034,7 +1034,7 @@ Passing a single function to ``.transform()`` with a ``Series`` will yield a sin
 Transform with multiple functions
 +++++++++++++++++++++++++++++++++

-Passing multiple functions will yield a column multi-indexed DataFrame.
+Passing multiple functions will yield a column MultiIndexed DataFrame.
 The first level will be the original frame column names; the second level
 will be the names of the transforming functions.

@@ -1060,7 +1060,7 @@ Passing a dict of functions will allow selective transforming per column.
 tsdf.transform({'A': np.abs, 'B': lambda x: x+1})
-Passing a dict of lists will generate a multi-indexed DataFrame with these
+Passing a dict of lists will generate a MultiIndexed DataFrame with these
 selective transforms.

 .. ipython:: python
@@ -1889,12 +1889,12 @@ faster than sorting the entire Series and calling ``head(n)`` on the result.
 df.nsmallest(5, ['a', 'c'])
-.. _basics.multi-index_sorting:
+.. _basics.multiindex_sorting:

-Sorting by a multi-index column
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Sorting by a MultiIndex column
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-You must be explicit about sorting when the column is a multi-index, and fully specify
+You must be explicit about sorting when the column is a MultiIndex, and fully specify
 all levels to ``by``.

 .. ipython:: python
25 changes: 19 additions & 6 deletions doc/source/conf.py
@@ -16,9 +16,11 @@
 import re
 import inspect
 import importlib
-from sphinx.ext.autosummary import _import_by_name
+import logging
 import warnings
+from sphinx.ext.autosummary import _import_by_name

+logger = logging.getLogger(__name__)

 try:
     raw_input  # Python 2
@@ -75,8 +77,19 @@
               'nbsphinx',
               ]

+try:
+    import sphinxcontrib.spelling  # noqa
+except ImportError as err:
+    logger.warn(('sphinxcontrib.spelling failed to import with error "{}". '
+                 '`spellcheck` command is not available.'.format(err)))
+else:
+    extensions.append('sphinxcontrib.spelling')
+
 exclude_patterns = ['**.ipynb_checkpoints']

+spelling_word_list_filename = ['spelling_wordlist.txt', 'names_wordlist.txt']
+spelling_ignore_pypi_package_names = True
+
 with open("index.rst") as f:
     index_rst_lines = f.readlines()

@@ -200,16 +213,16 @@
 # of the sidebar.
 # html_logo = None

-# The name of an image file (within the static path) to use as favicon of the
-# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
-# pixels large.
-# html_favicon = None
-
 # Add any paths that contain custom static files (such as style sheets) here,
 # relative to this directory. They are copied after the builtin static files,
 # so a file named "default.css" will overwrite the builtin "default.css".
 html_static_path = ['_static']

+# The name of an image file (within the static path) to use as favicon of the
+# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
+# pixels large.
+html_favicon = os.path.join(html_static_path[0], 'favicon.ico')
+
 # If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
 # using the given strftime format.
 # html_last_updated_fmt = '%b %d, %Y'
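
The `try`/`except ImportError`/`else` block added to `conf.py` earlier in this file's diff is an instance of the optional-dependency pattern: register an extension only when its module can be imported, and log a warning otherwise. A generic sketch of that pattern follows. The helper `enable_optional_extension` and the placeholder module names are hypothetical, and the sketch calls `logger.warning` because `Logger.warn` is a deprecated alias in Python 3.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def enable_optional_extension(extensions, module_name):
    """Append module_name to extensions only if it imports (hypothetical helper)."""
    try:
        importlib.import_module(module_name)
    except ImportError as err:
        logger.warning('%s failed to import with error "%s"; skipping it.',
                       module_name, err)
        return False
    else:
        # Only reached when the import succeeded.
        extensions.append(module_name)
        return True


extensions = ['sphinx.ext.autodoc']
# 'json' is a stdlib module standing in for an installed plugin.
enable_optional_extension(extensions, 'json')
# A made-up name standing in for a plugin that is not installed.
enable_optional_extension(extensions, 'no_such_sphinx_plugin_xyz')
print(extensions)  # ['sphinx.ext.autodoc', 'json']
```

Keeping the `append` in the `else` branch means a failed import leaves the extension list untouched, so the regular HTML build keeps working without the spelling dependency.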
19 changes: 19 additions & 0 deletions doc/source/contributing.rst
@@ -436,6 +436,25 @@ the documentation are also built by Travis-CI. These docs are then hosted `here
 <http://pandas-docs.github.io/pandas-docs-travis>`__, see also
 the :ref:`Continuous Integration <contributing.ci>` section.

+Spell checking documentation
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When contributing to documentation to **pandas** it's good to check if your work
+contains any spelling errors. Sphinx provides an easy way to spell check documentation
+and docstrings.
+
+Running the spell check is easy. Just navigate to your local ``pandas/doc/`` directory and run::
+
+    python make.py spellcheck
+
+The spellcheck will take a few minutes to run (between 1 to 6 minutes). Sphinx will alert you
+with warnings and misspelt words - these misspelt words will be added to a file called
+``output.txt`` and you can find it on your local directory ``pandas/doc/build/spelling/``.
+
+The Sphinx spelling extension uses an EN-US dictionary to correct words, what means that in
+some cases you might need to add a word to this dictionary. You can do so by adding the word to
+the bag-of-words file named ``spelling_wordlist.txt`` located in the folder ``pandas/doc/``.
+
 .. _contributing.code:

 Contributing to the code base
8 changes: 4 additions & 4 deletions doc/source/contributing_docstring.rst
@@ -103,7 +103,7 @@ left before or after the docstring. The text starts in the next line after the
 opening quotes. The closing quotes have their own line
 (meaning that they are not at the end of the last sentence).

-In rare occasions reST styles like bold text or itallics will be used in
+In rare occasions reST styles like bold text or italics will be used in
 docstrings, but is it common to have inline code, which is presented between
 backticks. It is considered inline code:

@@ -243,7 +243,7 @@ their use cases, if it is not too generic.
 """
 Pivot a row index to columns.
-When using a multi-index, a level can be pivoted so each value in
+When using a MultiIndex, a level can be pivoted so each value in
 the index becomes a column. This is especially useful when a subindex
 is repeated for the main index, and data is easier to visualize as a
 pivot table.
@@ -706,7 +706,7 @@ than 5, to show the example with the default values. If doing the ``mean``, we
 could use something like ``[1, 2, 3]``, so it is easy to see that the value
 returned is the mean.

-For more complex examples (groupping for example), avoid using data without
+For more complex examples (grouping for example), avoid using data without
 interpretation, like a matrix of random numbers with columns A, B, C, D...
 And instead use a meaningful example, which makes it easier to understand the
 concept. Unless required by the example, use names of animals, to keep examples
@@ -877,7 +877,7 @@ be tricky. Here are some attention points:
 the actual error only the error name is sufficient.

 * If there is a small part of the result that can vary (e.g. a hash in an object
-represenation), you can use ``...`` to represent this part.
+representation), you can use ``...`` to represent this part.

 If you want to show that ``s.plot()`` returns a matplotlib AxesSubplot object,
 this will fail the doctest ::
22 changes: 11 additions & 11 deletions doc/source/cookbook.rst
@@ -286,7 +286,7 @@ New Columns
 df = pd.DataFrame(
 {'AAA' : [1,1,1,2,2,2,3,3], 'BBB' : [2,1,3,4,5,1,2,3]}); df
-Method 1 : idxmin() to get the index of the mins
+Method 1 : idxmin() to get the index of the minimums

 .. ipython:: python
@@ -307,7 +307,7 @@ MultiIndexing

 The :ref:`multindexing <advanced.hierarchical>` docs.

-`Creating a multi-index from a labeled frame
+`Creating a MultiIndex from a labeled frame
 <http://stackoverflow.com/questions/14916358/reshaping-dataframes-in-pandas-based-on-column-labels>`__

 .. ipython:: python
@@ -330,7 +330,7 @@ The :ref:`multindexing <advanced.hierarchical>` docs.
 Arithmetic
 **********

-`Performing arithmetic with a multi-index that needs broadcasting
+`Performing arithmetic with a MultiIndex that needs broadcasting
 <http://stackoverflow.com/questions/19501510/divide-entire-pandas-multiindex-dataframe-by-dataframe-variable/19502176#19502176>`__

 .. ipython:: python
@@ -342,7 +342,7 @@ Arithmetic
 Slicing
 *******

-`Slicing a multi-index with xs
+`Slicing a MultiIndex with xs
 <http://stackoverflow.com/questions/12590131/how-to-slice-multindex-columns-in-pandas-dataframes>`__

 .. ipython:: python
@@ -363,7 +363,7 @@ To take the cross section of the 1st level and 1st axis the index:
 df.xs('six',level=1,axis=0)
-`Slicing a multi-index with xs, method #2
+`Slicing a MultiIndex with xs, method #2
 <http://stackoverflow.com/questions/14964493/multiindex-based-indexing-in-pandas>`__

 .. ipython:: python
@@ -386,13 +386,13 @@ To take the cross section of the 1st level and 1st axis the index:
 df.loc[(All,'Math'),('Exams')]
 df.loc[(All,'Math'),(All,'II')]
-`Setting portions of a multi-index with xs
+`Setting portions of a MultiIndex with xs
 <http://stackoverflow.com/questions/19319432/pandas-selecting-a-lower-level-in-a-dataframe-to-do-a-ffill>`__

 Sorting
 *******

-`Sort by specific column or an ordered list of columns, with a multi-index
+`Sort by specific column or an ordered list of columns, with a MultiIndex
 <http://stackoverflow.com/questions/14733871/mutli-index-sorting-in-pandas>`__

 .. ipython:: python
@@ -664,7 +664,7 @@ The :ref:`Pivot <reshaping.pivot>` docs.
 `Plot pandas DataFrame with year over year data
 <http://stackoverflow.com/questions/30379789/plot-pandas-data-frame-with-year-over-year-data>`__

-To create year and month crosstabulation:
+To create year and month cross tabulation:

 .. ipython:: python
@@ -677,7 +677,7 @@ To create year and month crosstabulation:
 Apply
 *****

-`Rolling Apply to Organize - Turning embedded lists into a multi-index frame
+`Rolling Apply to Organize - Turning embedded lists into a MultiIndex frame
 <http://stackoverflow.com/questions/17349981/converting-pandas-dataframe-with-categorical-values-into-binary-values>`__

 .. ipython:: python
@@ -1029,8 +1029,8 @@ Skip row between header and data
 01.01.1990 05:00;21;11;12;13
 """
-Option 1: pass rows explicitly to skiprows
-""""""""""""""""""""""""""""""""""""""""""
+Option 1: pass rows explicitly to skip rows
+"""""""""""""""""""""""""""""""""""""""""""

 .. ipython:: python