Issue #18840: Introduce the json module in the tutorial, and deemphasize the pickle module.

Antoine Pitrou 2013-12-05 23:46:32 +01:00
parent 9c72ebc96b
commit dd799d2e32
3 changed files with 69 additions and 33 deletions


@@ -377,47 +377,64 @@ File objects have some additional methods, such as :meth:`~file.isatty` and
Reference for a complete guide to file objects.
.. _tut-pickle:
.. _tut-json:
The :mod:`pickle` Module
------------------------
Saving structured data with :mod:`json`
---------------------------------------
.. index:: module: pickle
.. index:: module: json
Strings can easily be written to and read from a file. Numbers take a bit more
Strings can easily be written to and read from a file. Numbers take a bit more
effort, since the :meth:`read` method only returns strings, which will have to
be passed to a function like :func:`int`, which takes a string like ``'123'``
and returns its numeric value 123. However, when you want to save more complex
data types like lists, dictionaries, or class instances, things get a lot more
complicated.
and returns its numeric value 123. When you want to save more complex data
types like nested lists and dictionaries, parsing and serializing by hand
becomes complicated.
Rather than have users be constantly writing and debugging code to save
complicated data types, Python provides a standard module called :mod:`pickle`.
This is an amazing module that can take almost any Python object (even some
forms of Python code!), and convert it to a string representation; this process
is called :dfn:`pickling`. Reconstructing the object from the string
representation is called :dfn:`unpickling`. Between pickling and unpickling,
the string representing the object may have been stored in a file or data, or
Rather than making users write and debug their own code to save
complicated data types to files, Python lets you use the popular data
interchange format called `JSON (JavaScript Object Notation)
<http://json.org>`_. The standard module :mod:`json` can take Python
data hierarchies and convert them to string representations; this process is
called :dfn:`serializing`. Reconstructing the data from the string representation
is called :dfn:`deserializing`. Between serializing and deserializing, the
string representing the object may have been stored, for example in a file, or
sent over a network connection to some distant machine.
If you have an object ``x``, and a file object ``f`` that's been opened for
writing, the simplest way to pickle the object takes only one line of code::
.. note::
The JSON format is commonly used by modern applications to allow for data
exchange. Many programmers are already familiar with it, which makes
it a good choice for interoperability.
pickle.dump(x, f)
If you have an object ``x``, you can view its JSON string representation with a
simple line of code::
To unpickle the object again, if ``f`` is a file object which has been opened
for reading::
>>> json.dumps([1, 'simple', 'list'])
'[1, "simple", "list"]'
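As a minimal sketch of the full round trip, :func:`json.loads` reverses what :func:`json.dumps` does. The sample data below is made up for illustration:

```python
import json

# Hypothetical sample data: nested lists and dictionaries.
data = {"name": "example", "values": [1, 2.5, None], "nested": {"ok": True}}

text = json.dumps(data)        # serialize to a str
restored = json.loads(text)    # deserialize back into Python objects

print(text)
print(restored == data)
```

Not every Python type survives the trip unchanged. JSON has no tuple type, for instance, so a tuple comes back as a list.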
x = pickle.load(f)
A variant of the :func:`~json.dumps` function, called :func:`~json.dump`,
writes the serialized object directly to a :term:`text file`. So if ``f`` is a
:term:`text file` object opened for writing, we can do this::
(There are other variants of this, used when pickling many objects or when you
don't want to write the pickled data to a file; consult the complete
documentation for :mod:`pickle` in the Python Library Reference.)
json.dump(x, f)
:mod:`pickle` is the standard way to make Python objects which can be stored and
reused by other programs or by a future invocation of the same program; the
technical term for this is a :dfn:`persistent` object. Because :mod:`pickle` is
so widely used, many authors who write Python extensions take care to ensure
that new data types such as matrices can be properly pickled and unpickled.
To decode the object again, if ``f`` is a :term:`text file` object which has
been opened for reading::
x = json.load(f)
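Put together, the two calls might be used as in the sketch below. The file name ``data.json`` and the temporary directory are assumptions made only to keep the example self-contained:

```python
import json
import os
import tempfile

x = [1, "simple", "list"]

# Hypothetical file name inside a throwaway temporary directory.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "data.json")

    # Serialize x to a text file opened for writing.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(x, f)

    # Decode it again from the same file, opened for reading.
    with open(path, "r", encoding="utf-8") as f:
        y = json.load(f)

print(y == x)
```

Using ``with`` ensures each file is closed (and the data flushed to disk) before the next step reads it.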
This simple serialization technique can handle lists and dictionaries, but
serializing arbitrary class instances in JSON requires a bit of extra effort.
The reference for the :mod:`json` module contains an explanation of this.
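One possible shape of that extra effort, sketched under assumptions: the ``Point`` class, the helper functions and the ``"__point__"`` marker key are all hypothetical names chosen for this illustration. Only the ``default`` and ``object_hook`` parameters belong to the :mod:`json` module:

```python
import json


class Point:
    """Hypothetical class used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def encode_point(obj):
    # json.dumps calls this for any object it cannot serialize itself.
    if isinstance(obj, Point):
        return {"__point__": True, "x": obj.x, "y": obj.y}
    raise TypeError("{} is not JSON serializable".format(type(obj).__name__))


def decode_point(d):
    # json.loads calls this for every decoded JSON object (a dict).
    if d.get("__point__"):
        return Point(d["x"], d["y"])
    return d


text = json.dumps([Point(1, 2)], default=encode_point)
points = json.loads(text, object_hook=decode_point)

print(text)
print(type(points[0]).__name__, points[0].x, points[0].y)
```

The marker key is just one convention for telling encoded instances apart from ordinary dictionaries; other schemes would work as well.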
.. seealso::
:mod:`pickle` - the pickle module
Contrary to :ref:`JSON <tut-json>`, *pickle* is a protocol which allows
the serialization of arbitrarily complex Python objects. As such, it is
specific to Python and cannot be used to communicate with applications
written in other languages. It is also insecure by default:
deserializing pickle data coming from an untrusted source can execute
arbitrary code, if the data was crafted by a skilled attacker.
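The difference in scope can be seen in a small sketch. The sample dictionary is made up; it holds a set and a complex number, neither of which JSON can represent without help:

```python
import json
import pickle

# Hypothetical data containing types with no JSON equivalent.
data = {"tags": {"a", "b"}, "z": complex(1, 2)}

try:
    json.dumps(data)
    json_ok = True
except TypeError:
    json_ok = False            # json refuses sets and complex numbers

blob = pickle.dumps(data)      # bytes in a Python-specific format
restored = pickle.loads(blob)  # only unpickle data you trust

print(json_ok)
print(restored == data)
```

Here the pickled bytes were produced by the same program, so loading them is safe. The warning above applies to data received from elsewhere.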