Merge remote-tracking branch 'upstream/master' into bug/categorical-indexing-1row-df

* upstream/master: (109 commits)
  stronger typing in libreduction (pandas-dev#29502)
  API: rename labels to codes (pandas-dev#29509)
  CLN: remove unnecessary type checks (pandas-dev#29517)
  implement _BaseGrouper (pandas-dev#29520)
  CLN: F-string formatting in pandas/_libs/*.pyx (pandas-dev#29527)
  Fixed more SS03 errors (pandas-dev#29540)
  consolidate dim checks (pandas-dev#29536)
  REF: separate out _get_cython_func_and_vals (pandas-dev#29537)
  remove unnecessary exception (pandas-dev#29538)
  TST:Add test to check single category col returns series with single row slice (pandas-dev#29521)
  Make color validation more forgiving (pandas-dev#29122)
  DOC: update bottleneck repo and documentation urls (pandas-dev#29516)
  TST: add test for df construction from dict with tuples (pandas-dev#29497)
  add test for pd.melt dtypes preservation (pandas-dev#29510)
  updated DataFrame.equals docstring (pandas-dev#29496)
  Resolved merge conflicts (pandas-dev#29506)
  DOC: Improved pandas/compact/__init__.py (pandas-dev#29507)
  DOC: Update performance comparison section of io docs (pandas-dev#28890)
  TST: add test for df.where() with category dtype (pandas-dev#29454)
  DOC: Fix docs on merging categoricals. (pandas-dev#28185)
  ...
keechongtan · Nov 11, 2019 · 3e847e9
2 parents 2dfa594 + 07efdd4
commit 3e847e9
Showing 206 changed files with 4,540 additions and 4,113 deletions.
diff --git a/.travis.yml b/.travis.yml
@@ -32,7 +32,7 @@ matrix:
     include:
     - dist: bionic
       # 18.04
-      python: 3.8-dev
+      python: 3.8.0
       env:
         - JOB="3.8-dev" PATTERN="(not slow and not network)"
 

diff --git a/README.md b/README.md
@@ -190,7 +190,7 @@ or for installing in [development mode](https://pip.pypa.io/en/latest/reference/
 
 
 ```sh
-python -m pip install --no-build-isolation -e .
+python -m pip install -e . --no-build-isolation --no-use-pep517
 ```
 
 If you have `make`, you can also use `make develop` to run the same command.

diff --git a/ci/azure/posix.yml b/ci/azure/posix.yml
@@ -9,17 +9,16 @@ jobs:
   strategy:
     matrix:
       ${{ if eq(parameters.name, 'macOS') }}:
-        py35_macos:
-          ENV_FILE: ci/deps/azure-macos-35.yaml
-          CONDA_PY: "35"
+        py36_macos:
+          ENV_FILE: ci/deps/azure-macos-36.yaml
+          CONDA_PY: "36"
           PATTERN: "not slow and not network"
 
       ${{ if eq(parameters.name, 'Linux') }}:
-        py35_compat:
-          ENV_FILE: ci/deps/azure-35-compat.yaml
-          CONDA_PY: "35"
+        py36_minimum_versions:
+          ENV_FILE: ci/deps/azure-36-minimum_versions.yaml
+          CONDA_PY: "36"
           PATTERN: "not slow and not network"
-
         py36_locale_slow_old_np:
           ENV_FILE: ci/deps/azure-36-locale.yaml
           CONDA_PY: "36"
@@ -45,13 +44,16 @@ jobs:
           PATTERN: "not slow and not network"
           LOCALE_OVERRIDE: "zh_CN.UTF-8"
 
-        py37_np_dev:
-          ENV_FILE: ci/deps/azure-37-numpydev.yaml
-          CONDA_PY: "37"
-          PATTERN: "not slow and not network"
-          TEST_ARGS: "-W error"
-          PANDAS_TESTING_MODE: "deprecate"
-          EXTRA_APT: "xsel"
+        # https://github.com/pandas-dev/pandas/issues/29432
+        # py37_np_dev:
+        #   ENV_FILE: ci/deps/azure-37-numpydev.yaml
+        #   CONDA_PY: "37"
+        #   PATTERN: "not slow and not network"
+        #   TEST_ARGS: "-W error"
+        #   PANDAS_TESTING_MODE: "deprecate"
+        #   EXTRA_APT: "xsel"
+        #   # TODO:
+        #   continueOnError: true
 
   steps:
     - script: |

diff --git a/ci/deps/azure-35-compat.yaml b/ci/deps/azure-36-minimum_versions.yaml
@@ -5,26 +5,23 @@ channels:
 dependencies:
   - beautifulsoup4=4.6.0
   - bottleneck=1.2.1
+  - cython>=0.29.13
   - jinja2=2.8
   - numexpr=2.6.2
   - numpy=1.13.3
   - openpyxl=2.4.8
   - pytables=3.4.2
   - python-dateutil=2.6.1
-  - python=3.5.3
+  - python=3.6.1
   - pytz=2017.2
   - scipy=0.19.0
   - xlrd=1.1.0
   - xlsxwriter=0.9.8
   - xlwt=1.2.0
   # universal
+  - html5lib=1.0.1
   - hypothesis>=3.58.0
+  - pytest=4.5.0
   - pytest-xdist
   - pytest-mock
   - pytest-azurepipelines
-  - pip
-  - pip:
-    # for python 3.5, pytest>=4.0.2, cython>=0.29.13 is not available in conda
-    - cython>=0.29.13
-    - pytest==4.5.0
-    - html5lib==1.0b2
diff --git a/ci/deps/azure-macos-35.yaml b/ci/deps/azure-macos-36.yaml
@@ -14,7 +14,7 @@ dependencies:
   - openpyxl
   - pyarrow
   - pytables
-  - python=3.5.*
+  - python=3.6.*
   - python-dateutil==2.6.1
   - pytz
   - xarray

diff --git a/doc/source/development/contributing.rst b/doc/source/development/contributing.rst
@@ -208,7 +208,7 @@ We'll now kick off a three-step process:
 
    # Build and install pandas
    python setup.py build_ext --inplace -j 4
-   python -m pip install -e . --no-build-isolation
+   python -m pip install -e . --no-build-isolation --no-use-pep517
 
 At this point you should be able to import pandas from your locally built version::
 
@@ -236,7 +236,7 @@ Creating a Python environment (pip)
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 If you aren't using conda for your development environment, follow these instructions.
-You'll need to have at least python3.5 installed on your system.
+You'll need to have at least Python 3.6.1 installed on your system.
 
 **Unix**/**Mac OS**
 
@@ -255,7 +255,7 @@ You'll need to have at least python3.5 installed on your system.
 
    # Build and install pandas
    python setup.py build_ext --inplace -j 0
-   python -m pip install -e . --no-build-isolation
+   python -m pip install -e . --no-build-isolation --no-use-pep517
 
 **Windows**
 
@@ -847,29 +847,6 @@ The limitation here is that while a human can reasonably understand that ``is_nu
 
 With custom types and inference this is not always possible so exceptions are made, but every effort should be exhausted to avoid ``cast`` before going down such paths.
 
-Syntax Requirements
-~~~~~~~~~~~~~~~~~~~
-
-Because *pandas* still supports Python 3.5, :pep:`526` does not apply and variables **must** be annotated with type comments. Specifically, this is a valid annotation within pandas:
-
-.. code-block:: python
-
-   primes = []  # type: List[int]
-
-Whereas this is **NOT** allowed:
-
-.. code-block:: python
-
-   primes: List[int] = []  # not supported in Python 3.5!
-
-Note that function signatures can always be annotated per :pep:`3107`:
-
-.. code-block:: python
-
-   def sum_of_primes(primes: List[int] = []) -> int:
-       ...
-
-
 Pandas-specific Types
 ~~~~~~~~~~~~~~~~~~~~~
 
@@ -1296,7 +1273,7 @@ environment by::
 
 or, to use a specific Python interpreter,::
 
-    asv run -e -E existing:python3.5
+    asv run -e -E existing:python3.6
 
 This will display stderr from the benchmarks, and use your local
 ``python`` that comes from your ``$PATH``.

diff --git a/doc/source/development/policies.rst b/doc/source/development/policies.rst
@@ -51,7 +51,7 @@ Pandas may change the behavior of experimental features at any time.
 Python Support
 ~~~~~~~~~~~~~~
 
-Pandas will only drop support for specific Python versions (e.g. 3.5.x, 3.6.x) in
+Pandas will only drop support for specific Python versions (e.g. 3.6.x, 3.7.x) in
 pandas **major** releases.
 
 .. _SemVer: https://semver.org
diff --git a/doc/source/getting_started/basics.rst b/doc/source/getting_started/basics.rst
@@ -753,28 +753,51 @@ on an entire ``DataFrame`` or ``Series``, row- or column-wise, or elementwise.
 Tablewise function application
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-``DataFrames`` and ``Series`` can of course just be passed into functions.
+``DataFrames`` and ``Series`` can be passed into functions.
 However, if the function needs to be called in a chain, consider using the :meth:`~DataFrame.pipe` method.
-Compare the following
 
-.. code-block:: python
+First some setup:
+
+.. ipython:: python
 
-   # f, g, and h are functions taking and returning ``DataFrames``
-   >>> f(g(h(df), arg1=1), arg2=2, arg3=3)
+    def extract_city_name(df):
+        """
+        Chicago, IL -> Chicago for city_name column
+        """
+        df['city_name'] = df['city_and_code'].str.split(",").str.get(0)
+        return df
 
-with the equivalent
+    def add_country_name(df, country_name=None):
+        """
+        Chicago -> Chicago-US for city_name column
+        """
+        col = 'city_name'
+        df['city_and_country'] = df[col] + country_name
+        return df
 
-.. code-block:: python
+    df_p = pd.DataFrame({'city_and_code': ['Chicago, IL']})
+
+
+``extract_city_name`` and ``add_country_name`` are functions taking and returning ``DataFrames``.
+
+Now compare the following:
+
+.. ipython:: python
+
+    add_country_name(extract_city_name(df_p), country_name='US')
+
+Is equivalent to:
+
+.. ipython:: python
 
-   >>> (df.pipe(h)
-   ...    .pipe(g, arg1=1)
-   ...    .pipe(f, arg2=2, arg3=3))
+    (df_p.pipe(extract_city_name)
+         .pipe(add_country_name, country_name="US"))
 
 Pandas encourages the second style, which is known as method chaining.
 ``pipe`` makes it easy to use your own or another library's functions
 in method chains, alongside pandas' methods.
 
-In the example above, the functions ``f``, ``g``, and ``h`` each expected the ``DataFrame`` as the first positional argument.
+In the example above, the functions ``extract_city_name`` and ``add_country_name`` each expected a ``DataFrame`` as the first positional argument.
 What if the function you wish to apply takes its data as, say, the second argument?
 In this case, provide ``pipe`` with a tuple of ``(callable, data_keyword)``.
 ``.pipe`` will route the ``DataFrame`` to the argument specified in the tuple.

diff --git a/doc/source/getting_started/dsintro.rst b/doc/source/getting_started/dsintro.rst
@@ -564,53 +564,6 @@ to a column created earlier in the same :meth:`~DataFrame.assign`.
 In the second expression, ``x['C']`` will refer to the newly created column,
 that's equal to ``dfa['A'] + dfa['B']``.
 
-To write code compatible with all versions of Python, split the assignment in two.
-
-.. ipython:: python
-
-   dependent = pd.DataFrame({"A": [1, 1, 1]})
-   (dependent.assign(A=lambda x: x['A'] + 1)
-             .assign(B=lambda x: x['A'] + 2))
-
-.. warning::
-
-   Dependent assignment may subtly change the behavior of your code between
-   Python 3.6 and older versions of Python.
-
-   If you wish to write code that supports versions of python before and after 3.6,
-   you'll need to take care when passing ``assign`` expressions that
-
-   * Update an existing column
-   * Refer to the newly updated column in the same ``assign``
-
-   For example, we'll update column "A" and then refer to it when creating "B".
-
-   .. code-block:: python
-
-      >>> dependent = pd.DataFrame({"A": [1, 1, 1]})
-      >>> dependent.assign(A=lambda x: x["A"] + 1, B=lambda x: x["A"] + 2)
-
-   For Python 3.5 and earlier the expression creating ``B`` refers to the
-   "old" value of ``A``, ``[1, 1, 1]``. The output is then
-
-   .. code-block:: console
-
-         A  B
-      0  2  3
-      1  2  3
-      2  2  3
-
-   For Python 3.6 and later, the expression creating ``A`` refers to the
-   "new" value of ``A``, ``[2, 2, 2]``, which results in
-
-   .. code-block:: console
-
-         A  B
-      0  2  4
-      1  2  4
-      2  2  4
-
-
 
 Indexing / selection
 ~~~~~~~~~~~~~~~~~~~~

diff --git a/doc/source/getting_started/install.rst b/doc/source/getting_started/install.rst
@@ -18,7 +18,7 @@ Instructions for installing from source,
 Python version support
 ----------------------
 
-Officially Python 3.5.3 and above, 3.6, 3.7, and 3.8.
+Officially Python 3.6.1 and above, 3.7, and 3.8.
 
 Installing pandas
 -----------------
@@ -140,7 +140,7 @@ Installing with ActivePython
 Installation instructions for
 `ActivePython <https://www.activestate.com/activepython>`__ can be found
 `here <https://www.activestate.com/activepython/downloads>`__. Versions
-2.7 and 3.5 include pandas.
+2.7, 3.5 and 3.6 include pandas.
 
 Installing using your Linux distribution's package manager.
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -218,7 +218,7 @@ Recommended dependencies
   ``numexpr`` uses multiple cores as well as smart chunking and caching to achieve large speedups.
   If installed, must be Version 2.6.2 or higher.
 
-* `bottleneck <https://github.com/kwgoodman/bottleneck>`__: for accelerating certain types of ``nan``
+* `bottleneck <https://github.com/pydata/bottleneck>`__: for accelerating certain types of ``nan``
   evaluations. ``bottleneck`` uses specialized cython routines to achieve large speedups. If installed,
   must be Version 1.2.1 or higher.
 

diff --git a/doc/source/reference/indexing.rst b/doc/source/reference/indexing.rst
@@ -93,7 +93,6 @@ Compatibility with MultiIndex
    :toctree: api/
 
    Index.set_names
-   Index.is_lexsorted_for_tuple
    Index.droplevel
 
 Missing values

diff --git a/doc/source/reference/window.rst b/doc/source/reference/window.rst
@@ -34,6 +34,8 @@ Standard moving window functions
    Rolling.quantile
    Window.mean
    Window.sum
+   Window.var
+   Window.std
 
 .. _api.functions_expanding: