ENH: add BooleanArray extension array (pandas-dev#29555)
jorisvandenbossche authored and proost committed Dec 19, 2019
1 parent 7e50219 commit abaef60
Showing 15 changed files with 1,668 additions and 1 deletion.
1 change: 1 addition & 0 deletions doc/source/getting_started/basics.rst
@@ -1950,6 +1950,7 @@ sparse :class:`SparseDtype` (none) :class:`arrays.
intervals           :class:`IntervalDtype`    :class:`Interval`  :class:`arrays.IntervalArray` :ref:`advanced.intervalindex`
nullable integer    :class:`Int64Dtype`, ...  (none)             :class:`arrays.IntegerArray`  :ref:`integer_na`
Strings             :class:`StringDtype`      :class:`str`       :class:`arrays.StringArray`   :ref:`text`
Boolean (with NA)   :class:`BooleanDtype`     :class:`bool`      :class:`arrays.BooleanArray`  :ref:`api.arrays.bool`
=================== ========================= ================== ============================= =============================

Pandas has two ways to store strings.
23 changes: 23 additions & 0 deletions doc/source/reference/arrays.rst
@@ -25,6 +25,7 @@ Nullable Integer :class:`Int64Dtype`, ... (none) :ref:`api.array
Categorical         :class:`CategoricalDtype` (none)             :ref:`api.arrays.categorical`
Sparse              :class:`SparseDtype`      (none)             :ref:`api.arrays.sparse`
Strings             :class:`StringDtype`      :class:`str`       :ref:`api.arrays.string`
Boolean (with NA)   :class:`BooleanDtype`     :class:`bool`      :ref:`api.arrays.bool`
=================== ========================= ================== =============================

Pandas and third-party libraries can extend NumPy's type system (see :ref:`extending.extension-types`).
@@ -485,6 +486,28 @@ The ``Series.str`` accessor is available for ``Series`` backed by a :class:`arra
See :ref:`api.series.str` for more.


.. _api.arrays.bool:

Boolean data with missing values
--------------------------------

The boolean dtype (with the alias ``"boolean"``) provides support for storing
boolean data (True, False values) with missing values, which is not possible
with a bool :class:`numpy.ndarray`.

.. autosummary::
   :toctree: api/
   :template: autosummary/class_without_autosummary.rst

   arrays.BooleanArray

.. autosummary::
   :toctree: api/
   :template: autosummary/class_without_autosummary.rst

   BooleanDtype


.. Dtype attributes which are manually listed in their docstrings: including
.. it here to make sure a docstring page is built for them
24 changes: 24 additions & 0 deletions doc/source/whatsnew/v1.0.0.rst
@@ -102,6 +102,30 @@ String accessor methods returning integers will return a value with :class:`Int6
We recommend explicitly using the ``string`` data type when working with strings.
See :ref:`text.types` for more.

.. _whatsnew_100.boolean:

Boolean data type with missing values support
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

We've added :class:`BooleanDtype` / :class:`~arrays.BooleanArray`, an extension
type dedicated to boolean data that can hold missing values. The default
``bool`` data type is backed by a NumPy bool array, so a column can hold only
True or False values and never a missing value. The new :class:`BooleanDtype`
can store missing values as well by tracking them in a separate mask.
(:issue:`29555`)

.. ipython:: python

   pd.Series([True, False, None], dtype=pd.BooleanDtype())

You can use the alias ``"boolean"`` as well.

.. ipython:: python

   s = pd.Series([True, False, None], dtype="boolean")
   s

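The "separate mask" idea described above can be sketched in pure Python. This is only an illustration under stated assumptions, not the pandas implementation: the ``MaskedBools`` class and its ``from_sequence`` / ``to_list`` helpers are hypothetical names, and plain lists stand in for the arrays a real implementation would use.

```python
class MaskedBools:
    """Toy container (hypothetical): boolean values plus a parallel 'is missing' mask."""

    def __init__(self, values, mask):
        if len(values) != len(mask):
            raise ValueError("values and mask must have the same length")
        self.values = list(values)
        self.mask = list(mask)

    @classmethod
    def from_sequence(cls, data):
        # A missing entry (None) gets a placeholder value and mask=True.
        mask = [item is None for item in data]
        values = [False if item is None else bool(item) for item in data]
        return cls(values, mask)

    def to_list(self):
        # Re-insert None wherever the mask marks a value as missing.
        return [None if m else v for v, m in zip(self.values, self.mask)]


arr = MaskedBools.from_sequence([True, False, None])
print(arr.values)     # [True, False, False]
print(arr.mask)       # [False, False, True]
print(arr.to_list())  # [True, False, None]
```

The value stored under a masked position is arbitrary; only the mask says whether an entry is meaningful.
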
.. _whatsnew_1000.enhancements.other:

Other enhancements
1 change: 1 addition & 0 deletions pandas/__init__.py
@@ -67,6 +67,7 @@
IntervalDtype,
DatetimeTZDtype,
StringDtype,
BooleanDtype,
# missing
isna,
isnull,
2 changes: 2 additions & 0 deletions pandas/arrays/__init__.py
@@ -4,6 +4,7 @@
See :ref:`extending.extension-types` for more.
"""
from pandas.core.arrays import (
BooleanArray,
Categorical,
DatetimeArray,
IntegerArray,
@@ -16,6 +17,7 @@
)

__all__ = [
"BooleanArray",
"Categorical",
"DatetimeArray",
"IntegerArray",
14 changes: 14 additions & 0 deletions pandas/conftest.py
@@ -293,6 +293,20 @@ def compare_operators_no_eq_ne(request):
    return request.param


@pytest.fixture(
    params=["__and__", "__rand__", "__or__", "__ror__", "__xor__", "__rxor__"]
)
def all_logical_operators(request):
    """
    Fixture for dunder names for common logical operations
    * |
    * &
    * ^
    """
    return request.param
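
A test that receives one of these dunder names typically resolves it with ``getattr`` and calls the result. The sketch below shows that lookup on plain Python ``bool`` operands instead of a pandas array, so it runs without pytest or pandas; the helper ``apply_logical`` is a hypothetical name introduced for illustration.

```python
# Same dunder names as the fixture's params.
LOGICAL_DUNDERS = ["__and__", "__rand__", "__or__", "__ror__", "__xor__", "__rxor__"]


def apply_logical(left, right, op_name):
    """Look up the dunder method on `left` by name and call it with `right`."""
    return getattr(left, op_name)(right)


# Evaluate every operator once with left=True, right=False.
results = {name: apply_logical(True, False, name) for name in LOGICAL_DUNDERS}
print(results["__and__"])  # False
print(results["__or__"])   # True
print(results["__xor__"])  # True
```

The reflected names (``__rand__``, ``__ror__``, ``__rxor__``) swap the operand order, which makes no difference for these symmetric operations on two booleans.
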


@pytest.fixture(params=[None, "gzip", "bz2", "zip", "xz"])
def compression(request):
"""
1 change: 1 addition & 0 deletions pandas/core/api.py
@@ -12,6 +12,7 @@

from pandas.core.algorithms import factorize, unique, value_counts
from pandas.core.arrays import Categorical
from pandas.core.arrays.boolean import BooleanDtype
from pandas.core.arrays.integer import (
Int8Dtype,
Int16Dtype,
1 change: 1 addition & 0 deletions pandas/core/arrays/__init__.py
@@ -4,6 +4,7 @@
ExtensionScalarOpsMixin,
try_cast_to_ea,
)
from .boolean import BooleanArray # noqa: F401
from .categorical import Categorical # noqa: F401
from .datetimes import DatetimeArray # noqa: F401
from .integer import IntegerArray, integer_array # noqa: F401
9 changes: 9 additions & 0 deletions pandas/core/arrays/base.py
@@ -1088,6 +1088,15 @@ def _add_comparison_ops(cls):
        cls.__le__ = cls._create_comparison_method(operator.le)
        cls.__ge__ = cls._create_comparison_method(operator.ge)

    @classmethod
    def _add_logical_ops(cls):
        cls.__and__ = cls._create_logical_method(operator.and_)
        cls.__rand__ = cls._create_logical_method(ops.rand_)
        cls.__or__ = cls._create_logical_method(operator.or_)
        cls.__ror__ = cls._create_logical_method(ops.ror_)
        cls.__xor__ = cls._create_logical_method(operator.xor)
        cls.__rxor__ = cls._create_logical_method(ops.rxor)
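
The pattern above, a classmethod that attaches operator dunders produced by a factory, can be sketched on a toy class. Everything here is an assumption made for illustration: ``ToyMaskedArray`` is hypothetical, the local ``rand_`` / ``ror_`` / ``rxor`` helpers only mimic the role of the ``ops`` functions referenced in the diff, and the rule "result is missing if either input is missing" is a simplification that may not match the semantics pandas uses.

```python
import operator


def rand_(left, right):
    # Reflected "and": operands swapped.
    return operator.and_(right, left)


def ror_(left, right):
    # Reflected "or": operands swapped.
    return operator.or_(right, left)


def rxor(left, right):
    # Reflected "xor": operands swapped.
    return operator.xor(right, left)


class ToyMaskedArray:
    """Hypothetical stand-in for an extension array with a missing-value mask."""

    def __init__(self, values, mask):
        self.values = list(values)
        self.mask = list(mask)

    @classmethod
    def _create_logical_method(cls, op):
        def method(self, other):
            if not isinstance(other, cls):
                return NotImplemented
            values = [op(a, b) for a, b in zip(self.values, other.values)]
            # Assumption: a result is missing if either input is missing.
            mask = [ma or mb for ma, mb in zip(self.mask, other.mask)]
            return cls(values, mask)

        return method

    @classmethod
    def _add_logical_ops(cls):
        cls.__and__ = cls._create_logical_method(operator.and_)
        cls.__rand__ = cls._create_logical_method(rand_)
        cls.__or__ = cls._create_logical_method(operator.or_)
        cls.__ror__ = cls._create_logical_method(ror_)
        cls.__xor__ = cls._create_logical_method(operator.xor)
        cls.__rxor__ = cls._create_logical_method(rxor)


# Attach the six dunders once, after the class body has been executed.
ToyMaskedArray._add_logical_ops()

left = ToyMaskedArray([True, True, False], [False, False, True])
right = ToyMaskedArray([True, False, True], [False, False, False])
result = left & right
print(result.values)  # [True, False, False]
print(result.mask)    # [False, False, True]
```

Generating the methods from one factory keeps the mask handling in a single place, so each operator differs only in the function passed to ``_create_logical_method``.
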


class ExtensionScalarOpsMixin(ExtensionOpsMixin):
"""