pythongh-85679: Recommend encoding="utf-8" in tutorial (pythonGH-91778)
methane authored May 2, 2022
1 parent d414f7e commit 614420d
Showing 1 changed file with 18 additions and 10 deletions.
28 changes: 18 additions & 10 deletions Doc/tutorial/inputoutput.rst
@@ -279,11 +279,12 @@ Reading and Writing Files
object: file

:func:`open` returns a :term:`file object`, and is most commonly used with
- two arguments: ``open(filename, mode)``.
+ two positional arguments and one keyword argument:
+ ``open(filename, mode, encoding=None)``

::

- >>> f = open('workfile', 'w')
+ >>> f = open('workfile', 'w', encoding="utf-8")

.. XXX str(f) is <io.TextIOWrapper object at 0x82e8dc4>
@@ -300,11 +301,14 @@ writing. The *mode* argument is optional; ``'r'`` will be assumed if it's
omitted.

Normally, files are opened in :dfn:`text mode`, meaning that you read and write
- strings from and to the file, which are encoded in a specific encoding. If
- encoding is not specified, the default is platform dependent (see
- :func:`open`). ``'b'`` appended to the mode opens the file in
- :dfn:`binary mode`: now the data is read and written in the form of bytes
- objects. This mode should be used for all files that don't contain text.
+ strings from and to the file, which are encoded in a specific *encoding*.
+ If *encoding* is not specified, the default is platform dependent
+ (see :func:`open`).
+ Because UTF-8 is the modern de facto standard, ``encoding="utf-8"`` is
+ recommended unless you know that you need to use a different encoding.
+ Appending a ``'b'`` to the mode opens the file in :dfn:`binary mode`.
+ Binary mode data is read and written as :class:`bytes` objects.
+ You cannot specify *encoding* when opening a file in binary mode.
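
The difference between text mode and binary mode described in this hunk can be illustrated with a minimal sketch. The scratch path and the sample string below are assumptions made up for the illustration; they are not part of the commit.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "workfile")

# Text mode: str objects are encoded to UTF-8 on write...
with open(path, "w", encoding="utf-8") as f:
    f.write("caf\u00e9\n")

# ...and decoded from UTF-8 on read.
with open(path, encoding="utf-8") as f:
    text = f.read()

# Binary mode: the same file comes back as raw, undecoded bytes.
with open(path, "rb") as f:
    raw = f.read()

# Passing encoding together with a binary mode is rejected by open().
try:
    open(path, "rb", encoding="utf-8")
except ValueError as exc:
    binary_error = str(exc)
```

Here ``text`` is a ``str`` while ``raw`` is a ``bytes`` object in which the accented character occupies two bytes, and the last call raises ``ValueError`` before any file is opened.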

In text mode, the default when reading is to convert platform-specific line
endings (``\n`` on Unix, ``\r\n`` on Windows) to just ``\n``. When writing in
@@ -320,7 +324,7 @@ after its suite finishes, even if an exception is raised at some
point. Using :keyword:`!with` is also much shorter than writing
equivalent :keyword:`try`\ -\ :keyword:`finally` blocks::

- >>> with open('workfile') as f:
+ >>> with open('workfile', encoding="utf-8") as f:
... read_data = f.read()

>>> # We can check that the file has been automatically closed.
@@ -490,11 +494,15 @@ simply serializes the object to a :term:`text file`. So if ``f`` is a

json.dump(x, f)

- To decode the object again, if ``f`` is a :term:`text file` object which has
- been opened for reading::
+ To decode the object again, if ``f`` is a :term:`binary file` or
+ :term:`text file` object which has been opened for reading::

x = json.load(f)

+ .. note::
+    JSON files must be encoded in UTF-8. Use ``encoding="utf-8"`` when opening
+    a JSON file as a :term:`text file` for both reading and writing.
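
As a rough sketch of the round trip this hunk describes, the snippet below writes a JSON file as UTF-8 text and then loads it back through both a text file and a binary file. The path and the sample object are hypothetical, chosen only for the illustration.

```python
import json
import os
import tempfile

# Hypothetical scratch file and sample data for this illustration.
path = os.path.join(tempfile.mkdtemp(), "data.json")
x = {"name": "caf\u00e9", "values": [1, 2.5, None]}

# Write as text with an explicit UTF-8 encoding. ensure_ascii=False keeps
# the accented character as-is instead of escaping it, so the encoding
# chosen here really does determine the bytes on disk.
with open(path, "w", encoding="utf-8") as f:
    json.dump(x, f, ensure_ascii=False)

# Read back through a text file opened with the same encoding.
with open(path, encoding="utf-8") as f:
    from_text = json.load(f)

# Read back through a binary file; json.load decodes the bytes itself.
with open(path, "rb") as f:
    from_binary = json.load(f)
```

Both ``from_text`` and ``from_binary`` compare equal to ``x``. With the default ``ensure_ascii=True`` the output would contain only ASCII characters, which hides encoding mistakes on the writing side but not on the reading side.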

This simple serialization technique can handle lists and dictionaries, but
serializing arbitrary class instances in JSON requires a bit of extra effort.
The reference for the :mod:`json` module contains an explanation of this.