Add discussion "Distribution package vs. import package" #1426

Merged: 14 commits, Dec 17, 2023
110 changes: 110 additions & 0 deletions source/discussions/distribution-package-vs-import-package.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,110 @@
.. _distribution-package-vs-import-package:

=======================================
Distribution package vs. import package
=======================================

A number of different concepts are commonly referred to by the word
"package". This page clarifies the differences between two distinct but
related meanings in Python packaging, "distribution package" and "import
package".

What's a distribution package?
==============================

A distribution package is a piece of software that you can install.
Most of the time, this is synonymous with "project". When you type ``pip
install pkg``, or when you write ``dependencies = ["pkg"]`` in your
``pyproject.toml``, ``pkg`` is the name of a distribution package. When
you search or browse PyPI_, the most widely known centralized source for
installing Python libraries and tools, what you see is a list of distribution
packages. Alternatively, the term "distribution package" can be used to
refer to a specific file that contains a certain version of a project.
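
As an illustration of the first meaning, the following minimal sketch lists the
distribution packages installed in the current environment using the standard
library's ``importlib.metadata``. The helper name
``installed_distribution_names`` is hypothetical and exists only for this
example; the output depends entirely on what is installed.

```python
from importlib.metadata import distributions


def installed_distribution_names():
    """Return the sorted names of distribution packages found in this environment.

    Hypothetical helper for illustration only.
    """
    names = set()
    for dist in distributions():
        meta = dist.metadata
        # Broken or partial installs may lack metadata or a Name field.
        name = meta.get("Name") if meta is not None else None
        if name:
            names.add(name)
    return sorted(names)


for name in installed_distribution_names():
    print(name)
```

The names printed are the ones you would pass to ``pip install``, not
necessarily the ones you would write after ``import``.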

Note that in the Linux world, a "distribution package",
most commonly abbreviated as "distro package" or just "package",
is something provided by the system package manager of a `Linux distribution <distro_>`_,
which is a different meaning of the term.


What's an import package?
=========================

An import package is a Python module. Thus, when you write ``import
pkg`` or ``from pkg import func`` in your Python code, ``pkg`` is the
name of an import package. More precisely, import packages are special
Python modules that can contain submodules. For example, the ``numpy``
package contains modules like ``numpy.linalg`` and
``numpy.fft``. Usually, an import package is a directory on the file
system, containing modules as ``.py`` files and subpackages as
subdirectories.

You can use an import package as soon as you have installed a distribution
package that provides it.
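
The distinction between an import package and a plain module can be checked at
run time: a package is a module that has a ``__path__`` attribute. The sketch
below uses two standard library modules so that it runs without installing
anything; the helper name ``is_import_package`` is an assumption made for this
example.

```python
import importlib


def is_import_package(name):
    """Return True if importing ``name`` yields a package (a module with ``__path__``)."""
    module = importlib.import_module(name)
    return hasattr(module, "__path__")


print(is_import_package("email"))  # True: ``email`` contains submodules such as ``email.message``
print(is_import_package("math"))   # False: ``math`` is a single non-package module
```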


What are the links between distribution packages and import packages?
=====================================================================

Most of the time, a distribution package provides one single import
package (or non-package module), with a matching name. For example,
``pip install numpy`` lets you ``import numpy``.

However, this is only a convention. PyPI and other package indices *do not
enforce any relationship* between the name of a distribution package and the
import packages it provides. (A consequence of this is that you cannot blindly
install the PyPI package ``foo`` if you see ``import foo``; this may install an
unintended and potentially even malicious package.)

A distribution package could provide an import package with a different
name. An example of this is the popular Pillow_ library for image
processing. Its distribution package name is ``Pillow``, but it provides
the import package ``PIL``. This is for historical reasons: Pillow
started as a fork of the PIL library, thus it kept the import name
``PIL`` so that existing PIL users could switch to Pillow with little
effort. More generally, a fork of an existing library is a common reason
for differing names between the distribution package and the import
package.

On a given package index (like PyPI), distribution package names must be
unique. On the other hand, import packages have no such requirement.
Import packages with the same name can be provided by several
distribution packages. Again, forks are a common reason for this.

Conversely, a distribution package can provide several import packages,
although this is less common. An example is the attrs_ distribution
package, which provides both an ``attrs`` import package with a newer
API, and an ``attr`` import package with an older but supported API.


How do distribution package names and import package names compare?
===================================================================

Import packages should have valid Python identifiers as their name (the
:ref:`exact rules <python:identifiers>` are found in the Python
documentation) [#non-identifier-mod-name]_. In particular, they use underscores ``_`` as word
separators and are case-sensitive.
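
A candidate import package name can be checked with ``str.isidentifier()``.
The sketch below also rejects reserved keywords, which is an extra check
assumed for this example because a keyword such as ``class`` cannot appear in
an ``import`` statement; the helper name ``is_valid_import_name`` is
hypothetical.

```python
import keyword


def is_valid_import_name(name):
    """Return True if ``name`` can be used directly in an ``import`` statement."""
    return name.isidentifier() and not keyword.iskeyword(name)


print(is_valid_import_name("awesome_package"))  # True
print(is_valid_import_name("awesome-package"))  # False: hyphens are not allowed
print(is_valid_import_name("class"))            # False: reserved keyword
```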

On the other hand, distribution packages can use hyphens ``-`` or
underscores ``_``. They can also contain dots ``.``, which is sometimes
used for packaging a subpackage of a :ref:`namespace package
<packaging-namespace-packages>`. For most purposes, they are insensitive
to case and to ``-`` vs. ``_`` differences, e.g., ``pip install
Awesome_Package`` is the same as ``pip install awesome-package`` (the
precise rules are given in the :ref:`name normalization specification
<name-normalization>`).
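
The normalization mentioned above can be sketched in a few lines: runs of
``-``, ``_`` and ``.`` are collapsed into a single ``-`` and the result is
lowercased. The function name ``normalize`` is chosen for this example; refer
to the name normalization specification for the authoritative rules.

```python
import re


def normalize(name):
    """Normalize a distribution package name for comparison purposes."""
    return re.sub(r"[-_.]+", "-", name).lower()


print(normalize("Awesome_Package"))  # awesome-package
print(normalize("awesome-package"))  # awesome-package
```

Two distribution package names refer to the same project on an index when
their normalized forms are equal.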



---------------------------

.. [#non-identifier-mod-name] Although it is technically possible
   to import packages/modules that do not have a valid Python identifier as
   their name, using :doc:`importlib <python:library/importlib>`,
   this is vanishingly rare and strongly discouraged.


.. _distro: https://en.wikipedia.org/wiki/Linux_distribution
.. _PyPI: https://pypi.org
.. _Pillow: https://pypi.org/project/Pillow
.. _attrs: https://pypi.org/project/attrs
1 change: 1 addition & 0 deletions source/discussions/index.rst
@@ -12,5 +12,6 @@ specific topic. If you're just trying to get stuff done, see
pip-vs-easy-install
install-requires-vs-requirements
wheel-vs-egg
distribution-package-vs-import-package
src-layout-vs-flat-layout
setup-py-deprecated
6 changes: 4 additions & 2 deletions source/glossary.rst
@@ -67,7 +67,8 @@ Glossary
:term:`Import Package` (which is also commonly called a "package") or
another kind of distribution (e.g. a Linux distribution or the Python
language distribution), which are often referred to with the single term
"distribution".
"distribution". See :ref:`distribution-package-vs-import-package`
for a breakdown of the differences.

Egg

@@ -103,7 +104,8 @@ Glossary
An import package is more commonly referred to with the single word
"package", but this guide will use the expanded term when more clarity
is needed to prevent confusion with a :term:`Distribution Package` which
is also commonly called a "package".
is also commonly called a "package". See :ref:`distribution-package-vs-import-package`
for a breakdown of the differences.

Module

2 changes: 2 additions & 0 deletions source/guides/packaging-namespace-packages.rst
@@ -1,3 +1,5 @@
.. _packaging-namespace-packages:

============================
Packaging namespace packages
============================