mirror of https://github.com/python/cpython.git
synced 2025-09-28 19:25:27 +00:00
Another checkpoint -- some stuff I managed to do on the train.
parent d74a1dc501
commit 3828768f45
1 changed file with 92 additions and 79 deletions
|
@@ -129,25 +129,29 @@ Note:
|
||||||
:func:`print` function calls, so this is mostly a non-issue for
|
:func:`print` function calls, so this is mostly a non-issue for
|
||||||
larger projects.
|
larger projects.
|
||||||
|
|
||||||
Text Strings Vs. Bytes
|
Text Vs. Data Instead Of Unicode Vs. 8-bit
|
||||||
----------------------
|
------------------------------------------
|
||||||
|
|
||||||
Everything you thought you knew about binary data and Unicode has
|
Everything you thought you knew about binary data and Unicode has
|
||||||
changed. There's a longer section below; here's a summary of the
|
changed:
|
||||||
changes:
|
|
||||||
|
|
||||||
* Python 3.0 uses *strings* and *bytes* instead of *Unicode strings*
|
XXX HIRO
|
||||||
and *8-bit strings*. The difference is that any attempt to mix
|
|
||||||
strings and bytes in Python 3.0 raises a TypeError exception,
|
|
||||||
whereas if you were to mix Unicode and 8-bit strings in Python 2.x,
|
|
||||||
you would only get an exception if the 8-bit string contained
|
|
||||||
non-ASCII values. As a consequence, pretty much all code that
|
|
||||||
uses Unicode, encodings or binary data most likely has to change.
|
|
||||||
The change is for the better, as in the 2.x world there were
|
|
||||||
numerous bugs having to do with mixing encoded and unencoded text.
|
|
||||||
|
|
||||||
* You no longer need to use ``u"..."`` literals for Unicode text.
|
* Python 3.0 uses the concepts of *text* and (binary) *data* instead
|
||||||
However, you must use ``b"..."`` literals for binary data.
|
of Unicode strings and 8-bit strings. All text is Unicode; however
|
||||||
|
*encoded* Unicode is represented as binary data. The type used to
|
||||||
|
hold text is :class:`str`, the type used to hold data is
|
||||||
|
:class:`bytes`. The difference is that any attempt to mix text and
|
||||||
|
data in Python 3.0 raises a TypeError exception, whereas if you were
|
||||||
|
to mix Unicode and 8-bit strings in Python 2.x, you would only get
|
||||||
|
an exception if the 8-bit string contained non-ASCII values. As a
|
||||||
|
consequence, pretty much all code that uses Unicode, encodings or
|
||||||
|
binary data most likely has to change. The change is for the
|
||||||
|
better, as in the 2.x world there were numerous bugs having to do
|
||||||
|
with mixing encoded and unencoded text.
|
||||||
|
|
||||||
|
* You no longer use ``u"..."`` literals for Unicode text. However,
|
||||||
|
you must use ``b"..."`` literals for binary data.
|
||||||
|
|
||||||
* Files opened as text files (still the default mode for :func:`open`)
|
* Files opened as text files (still the default mode for :func:`open`)
|
||||||
always use an encoding to map between strings (in memory) and bytes
|
always use an encoding to map between strings (in memory) and bytes
|
||||||
|
@@ -167,6 +171,50 @@ changes:
|
||||||
don't have functionality enough in common to warrant a shared base
|
don't have functionality enough in common to warrant a shared base
|
||||||
class.
|
class.
|
||||||
|
|
||||||
|
* All backslashes in raw strings are interpreted literally. This
|
||||||
|
means that ``'\U'`` and ``'\u'`` escapes in raw strings are not
|
||||||
|
treated specially.
|
||||||
|
|
||||||
|
XXX Deal with dupes below
|
||||||
|
|
||||||
|
* There is only one text string type; its name is :class:`str` but its
|
||||||
|
behavior and implementation are like :class:`unicode` in 2.x.
|
||||||
|
|
||||||
|
* The :class:`basestring` superclass has been removed. The ``2to3``
|
||||||
|
tool (see below) replaces every occurrence of :class:`basestring`
|
||||||
|
with :class:`str`.
|
||||||
|
|
||||||
|
* :pep:`3137`: There is a new type, :class:`bytes`, to represent
|
||||||
|
binary data (and encoded text, which is treated as binary data until
|
||||||
|
it is decoded). The :class:`str` and :class:`bytes` types cannot be
|
||||||
|
mixed; you must always explicitly convert between them, using the
|
||||||
|
:meth:`str.encode` (str -> bytes) or :meth:`bytes.decode` (bytes ->
|
||||||
|
str) methods.
|
||||||
|
|
||||||
|
* Like :class:`str`, the :class:`bytes` type is immutable. There is a
|
||||||
|
separate *mutable* type to hold buffered binary data,
|
||||||
|
:class:`bytearray`. Nearly all APIs that accept :class:`bytes` also
|
||||||
|
accept :class:`bytearray`. The mutable API is based on
|
||||||
|
:class:`collections.MutableSequence`.
|
||||||
|
|
||||||
|
* :pep:`3138`: The :func:`repr` of a string no longer escapes
|
||||||
|
non-ASCII characters. It still escapes control characters and code
|
||||||
|
points with non-printable status in the Unicode standard, however.
|
||||||
|
|
||||||
|
* :pep:`3120`: The default source encoding is now UTF-8.
|
||||||
|
|
||||||
|
* :pep:`3131`: Non-ASCII letters are now allowed in identifiers.
|
||||||
|
(However, the standard library remains ASCII-only with the exception
|
||||||
|
of contributor names in comments.)
|
||||||
|
|
||||||
|
* :pep:`3116`: New I/O implementation. The API is nearly 100%
|
||||||
|
backwards compatible, but completely reimplemented (currently largely
|
||||||
|
in Python). Also, binary files use bytes instead of strings.
|
||||||
|
|
||||||
|
* The :mod:`StringIO` and :mod:`cStringIO` modules are gone. Instead,
|
||||||
|
import :class:`io.StringIO` or :class:`io.BytesIO`, for text and
|
||||||
|
data respectively.
|
||||||
|
|
||||||
* See also the :ref:`unicode-howto`, which was updated for Python 3.0.
|
* See also the :ref:`unicode-howto`, which was updated for Python 3.0.
|
||||||
|
|
||||||
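The text/data separation described in the list above can be sketched as follows. This is a minimal illustration assuming Python 3 semantics; the sample string and the variable names are made up for the example and do not come from the patch.

```python
import io

text = "café"                    # str: Unicode text
data = text.encode("utf-8")      # bytes: encoded text is binary data

# Mixing text and data raises TypeError instead of coercing implicitly.
try:
    text + data
except TypeError:
    mixed_ok = False
else:
    mixed_ok = True

# Explicit conversion round-trips cleanly.
round_trip = data.decode("utf-8")

# bytes is immutable; bytearray is its mutable counterpart.
buf = bytearray(data)
buf[0] = ord("C")

# io.StringIO holds text, io.BytesIO holds data.
text_stream = io.StringIO()
text_stream.write(text)
data_stream = io.BytesIO()
data_stream.write(data)
```

The only conversions between the two worlds are the explicit ``str.encode`` and ``bytes.decode`` calls, which is what makes accidental mixing visible as an error.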
Views And Iterators Instead Of Lists
|
Views And Iterators Instead Of Lists
|
||||||
|
@@ -254,8 +302,8 @@ Overview Of Syntax Changes
|
||||||
This section gives a brief overview of every *syntactic* change in
|
This section gives a brief overview of every *syntactic* change in
|
||||||
Python 3.0.
|
Python 3.0.
|
||||||
|
|
||||||
Additions
|
New Syntax
|
||||||
---------
|
----------
|
||||||
|
|
||||||
* :pep:`3107`: Function argument and return value annotations. This
|
* :pep:`3107`: Function argument and return value annotations. This
|
||||||
provides a standardized way of annotating a function's parameters
|
provides a standardized way of annotating a function's parameters
|
||||||
|
@@ -304,8 +352,8 @@ Additions
|
||||||
|
|
||||||
* Bytes literals are introduced with a leading ``b`` or ``B``.
|
* Bytes literals are introduced with a leading ``b`` or ``B``.
|
||||||
|
|
||||||
Changes
|
Changed Syntax
|
||||||
-------
|
--------------
|
||||||
|
|
||||||
* New :keyword:`raise` statement syntax: ``raise [expr [from expr]]``.
|
* New :keyword:`raise` statement syntax: ``raise [expr [from expr]]``.
|
||||||
Also note that string exceptions are no longer legal (:pep:`0352`).
|
Also note that string exceptions are no longer legal (:pep:`0352`).
|
||||||
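A short sketch of the ``raise [expr [from expr]]`` form mentioned above, assuming Python 3. ``ConfigError`` and ``parse_port`` are hypothetical names invented for this illustration.

```python
class ConfigError(Exception):
    """Hypothetical application-level error used only in this example."""


def parse_port(value):
    try:
        return int(value)
    except ValueError as exc:
        # "raise ... from ..." stores exc as __cause__ of the new exception.
        raise ConfigError("invalid port: %r" % (value,)) from exc


try:
    parse_port("eighty")
except ConfigError as err:
    cause = err.__cause__

# String exceptions are rejected: only BaseException subclasses can be raised.
try:
    raise "oops"
except TypeError:
    string_exception_rejected = True
```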
|
@@ -333,8 +381,8 @@ Changes
|
||||||
*must* now be spelled as ``...``. (Previously it could also be
|
*must* now be spelled as ``...``. (Previously it could also be
|
||||||
spelled as ``. . .``, by a mere accident of the grammar.)
|
spelled as ``. . .``, by a mere accident of the grammar.)
|
||||||
|
|
||||||
Removals
|
Removed Syntax
|
||||||
--------
|
--------------
|
||||||
|
|
||||||
* :pep:`3113`: Tuple parameter unpacking removed. You can no longer
|
* :pep:`3113`: Tuple parameter unpacking removed. You can no longer
|
||||||
write ``def foo(a, (b, c)): ...``.
|
write ``def foo(a, (b, c)): ...``.
|
||||||
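One possible way to adapt a function that relied on tuple parameter unpacking is to accept the tuple as a single parameter and unpack it in the body. The sketch below assumes Python 3; the parameter name ``b_c`` is an arbitrary choice for the example.

```python
def foo(a, b_c):
    # Unpack explicitly in the body instead of in the parameter list.
    b, c = b_c
    return a + b + c


total = foo(1, (2, 3))

# The old spelling no longer compiles.
try:
    compile("def foo(a, (b, c)): pass", "<example>", "exec")
except SyntaxError:
    old_form_rejected = True
else:
    old_form_rejected = False
```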
|
@@ -362,7 +410,6 @@ Removals
|
||||||
(:pep:`0328`)
|
(:pep:`0328`)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
Changes Already Present In Python 2.6
|
Changes Already Present In Python 2.6
|
||||||
=====================================
|
=====================================
|
||||||
|
|
||||||
|
@@ -401,8 +448,7 @@ consulted for longer descriptions.
|
||||||
|
|
||||||
* :ref:`pep-3112`. The ``b"..."`` string literal notation (and its
|
* :ref:`pep-3112`. The ``b"..."`` string literal notation (and its
|
||||||
variants like ``b'...'``, ``b"""..."""``, and ``br"..."``) now
|
variants like ``b'...'``, ``b"""..."""``, and ``br"..."``) now
|
||||||
produces a literal of type :class:`bytes`. More about
|
produces a literal of type :class:`bytes`.
|
||||||
:class:`bytes` below.
|
|
||||||
|
|
||||||
* :ref:`pep-3116`. The :mod:`io` module is now the standard way of
|
* :ref:`pep-3116`. The :mod:`io` module is now the standard way of
|
||||||
doing file I/O, and the initial values of :data:`sys.stdin`,
|
doing file I/O, and the initial values of :data:`sys.stdin`,
|
||||||
|
@@ -411,14 +457,17 @@ consulted for longer descriptions.
|
||||||
alias for :func:`io.open` and has additional keyword arguments
|
alias for :func:`io.open` and has additional keyword arguments
|
||||||
*encoding*, *errors*, *newline* and *closefd*. Also note that an
|
*encoding*, *errors*, *newline* and *closefd*. Also note that an
|
||||||
invalid *mode* argument now raises :exc:`ValueError`, not
|
invalid *mode* argument now raises :exc:`ValueError`, not
|
||||||
:exc:`IOError`.
|
:exc:`IOError`. The binary file object underlying a text file
|
||||||
|
object can be accessed as :attr:`f.buffer` (but beware that the
|
||||||
|
text object maintains a buffer of itself in order to speed up
|
||||||
|
the encoding and decoding operations).
|
||||||
|
|
||||||
* :ref:`pep-3118`. The old builtin :func:`buffer` is now really gone;
|
* :ref:`pep-3118`. The old builtin :func:`buffer` is now really gone;
|
||||||
the new builtin :func:`memoryview` provides (mostly) similar
|
the new builtin :func:`memoryview` provides (mostly) similar
|
||||||
functionality.
|
functionality.
|
||||||
|
|
||||||
* :ref:`pep-3119`. The :mod:`abc` module and the ABCs defined in the
|
* :ref:`pep-3119`. The :mod:`abc` module and the ABCs defined in the
|
||||||
:mod:`collections` module plays a slightly more prominent role in
|
:mod:`collections` module plays a somewhat more prominent role in
|
||||||
the language now, and builtin collection types like :class:`dict`
|
the language now, and builtin collection types like :class:`dict`
|
||||||
and :class:`list` conform to the :class:`collections.MutableMapping`
|
and :class:`list` conform to the :class:`collections.MutableMapping`
|
||||||
and :class:`collections.MutableSequence` ABC, respectively.
|
and :class:`collections.MutableSequence` ABC, respectively.
|
||||||
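The ABC conformance of the builtin collection types can be checked with ``isinstance``. This sketch assumes a recent Python 3, where these ABCs are imported from ``collections.abc`` rather than from ``collections`` as the text spells them.

```python
# Assumption: a recent Python 3, where the ABCs live in collections.abc.
from collections.abc import MutableMapping, MutableSequence

dict_ok = isinstance({}, MutableMapping)
list_ok = isinstance([], MutableSequence)

# An immutable builtin does not conform to the mutable ABC.
tuple_is_mutable_sequence = isinstance((), MutableSequence)
```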
|
@@ -427,11 +476,11 @@ consulted for longer descriptions.
|
||||||
notation is the only one supported, and binary literals have been
|
notation is the only one supported, and binary literals have been
|
||||||
added.
|
added.
|
||||||
|
|
||||||
* :ref:`pep-3129`. This speaks for itself.
|
* :ref:`pep-3129`.
|
||||||
|
|
||||||
* :ref:`pep-3141`. The :mod:`numbers` module is another new use of
|
* :ref:`pep-3141`. The :mod:`numbers` module is another new use of
|
||||||
ABCs, defining Python's "numeric tower". Also note the new
|
ABCs, defining Python's "numeric tower". Also note the new
|
||||||
:mod:`fractions` module.
|
:mod:`fractions` module which implements :class:`numbers.Rational`.
|
||||||
|
|
||||||
|
|
||||||
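The relationship between :mod:`fractions` and the numeric tower noted above can be illustrated roughly as follows, assuming Python 3; the values are arbitrary.

```python
import numbers
from fractions import Fraction

half = Fraction(1, 2)

# Fraction implements the Rational level of the numeric tower.
fraction_is_rational = isinstance(half, numbers.Rational)

# int is an Integral, and every Integral is also a Rational.
int_is_rational = isinstance(3, numbers.Rational)

# float is a Real but not a Rational.
float_is_rational = isinstance(0.5, numbers.Rational)

# Arithmetic on fractions stays exact.
total = half + Fraction(1, 3)
```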
Library Changes
|
Library Changes
|
||||||
|
@@ -532,58 +581,14 @@ Some other library changes (not covered by :pep:`3108`):
|
||||||
* Cleanup of the :mod:`random` module: removed the :func:`jumpahead` API.
|
* Cleanup of the :mod:`random` module: removed the :func:`jumpahead` API.
|
||||||
|
|
||||||
|
|
||||||
Strings And Bytes
|
|
||||||
=================
|
|
||||||
|
|
||||||
This section discusses the many changes in string XXX
|
|
||||||
|
|
||||||
* There is only one string type; its name is :class:`str` but its behavior and
|
|
||||||
implementation are like :class:`unicode` in 2.x.
|
|
||||||
|
|
||||||
* The :class:`basestring` superclass has been removed. The ``2to3`` tool
|
|
||||||
replaces every occurrence of :class:`basestring` with :class:`str`.
|
|
||||||
|
|
||||||
* :pep:`3137`: There is a new type, :class:`bytes`, to represent
|
|
||||||
binary data (and encoded text, which is treated as binary data until
|
|
||||||
you decide to decode it). The :class:`str` and :class:`bytes` types
|
|
||||||
cannot be mixed; you must always explicitly convert between them,
|
|
||||||
using the :meth:`str.encode` (str -> bytes) or :meth:`bytes.decode`
|
|
||||||
(bytes -> str) methods.
|
|
||||||
|
|
||||||
.. XXX add bytearray
|
|
||||||
|
|
||||||
* All backslashes in raw strings are interpreted literally. This means that
|
|
||||||
``'\U'`` and ``'\u'`` escapes in raw strings are not treated specially.
|
|
||||||
|
|
||||||
* :pep:`3138`: :func:`repr` of a string no longer escapes all
|
|
||||||
non-ASCII characters. XXX
|
|
||||||
|
|
||||||
* :pep:`3112`: Bytes literals, e.g. ``b"abc"``, create :class:`bytes`
|
|
||||||
instances.
|
|
||||||
|
|
||||||
* :pep:`3120`: UTF-8 default source encoding.
|
|
||||||
|
|
||||||
* :pep:`3131`: Non-ASCII identifiers. (However, the standard library remains
|
|
||||||
ASCII-only with the exception of contributor names in comments.)
|
|
||||||
|
|
||||||
* :pep:`3116`: New I/O Implementation. The API is nearly 100% backwards
|
|
||||||
compatible, but completely reimplemented (currently mostly in Python). Also,
|
|
||||||
binary files use bytes instead of strings.
|
|
||||||
|
|
||||||
* The :mod:`StringIO` and :mod:`cStringIO` modules are gone. Instead, import
|
|
||||||
:class:`io.StringIO` or :class:`io.BytesIO`.
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
:pep:`3101`: A New Approach To String Formatting
|
:pep:`3101`: A New Approach To String Formatting
|
||||||
================================================
|
================================================
|
||||||
|
|
||||||
* A new system for built-in string formatting operations replaces the
|
* A new system for built-in string formatting operations replaces the
|
||||||
``%`` string formatting operator. (However, the ``%`` operator is
|
``%`` string formatting operator. (However, the ``%`` operator is
|
||||||
still supported; it will be deprecated in Python 3.1 and removed
|
still supported; it will be deprecated in Python 3.1 and removed
|
||||||
from the language at some later time.)
|
from the language at some later time.) Read :pep:`3101` for the full
|
||||||
|
scoop.
|
||||||
.. XXX expand this
|
|
||||||
|
|
||||||
|
|
||||||
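As a rough illustration of the :pep:`3101` formatting system next to the ``%`` operator, assuming Python 3 (the values and field names are invented for the example):

```python
name, count = "spam", 3

# The older % operator, still supported.
old_style = "%s x %d" % (name, count)

# str.format() with positional fields.
new_style = "{0} x {1}".format(name, count)

# Named fields with format specs: right-align in 6 columns, two decimals.
by_name = "{item:>6}|{price:.2f}".format(item=name, price=1.5)
```

Both styles produce the same string for the simple case; the format-spec mini-language after the colon is where ``str.format`` goes beyond ``%``.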
:pep:`3106`: Revamping dict :meth:`dict.keys`, :meth:`dict.items` and :meth:`dict.values`
|
:pep:`3106`: Revamping dict :meth:`dict.keys`, :meth:`dict.items` and :meth:`dict.values`
|
||||||
|
@@ -632,16 +637,24 @@ Exception Stuff
|
||||||
New Class And Metaclass Stuff
|
New Class And Metaclass Stuff
|
||||||
=============================
|
=============================
|
||||||
|
|
||||||
|
XXX Move to new syntax section???
|
||||||
|
|
||||||
* Classic classes are gone.
|
* Classic classes are gone.
|
||||||
|
|
||||||
* :pep:`3115`: New Metaclass Syntax.
|
* :pep:`3115`: New Metaclass Syntax. Instead of::
|
||||||
|
|
||||||
* :pep:`3119`: Abstract Base Classes (ABCs); ``@abstractmethod`` and
|
class C:
|
||||||
``@abstractproperty`` decorators; collection ABCs.
|
__metaclass__ = M
|
||||||
|
...
|
||||||
|
|
||||||
* :pep:`3129`: Class decorators.
|
you now use::
|
||||||
|
|
||||||
* :pep:`3141`: Numeric ABCs.
|
class C(metaclass=M):
|
||||||
|
...
|
||||||
|
|
||||||
|
The module-global :data:`__metaclass__` variable is no longer supported.
|
||||||
|
(It was a crutch to make it easier to default to new-style classes
|
||||||
|
without deriving every class from :class:`object`.)
|
||||||
|
|
||||||
|
|
||||||
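A minimal sketch of the new metaclass syntax in action, assuming Python 3. The metaclass ``M`` below is a toy written for this example; its ``created_by`` attribute is purely illustrative.

```python
class M(type):
    """Toy metaclass that tags every class it creates."""

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        cls.created_by = mcls.__name__
        return cls


class C(metaclass=M):
    pass


class D:
    # In 3.x this is an ordinary class attribute; it does not select a metaclass.
    __metaclass__ = M
```

Here ``type(C)`` is ``M``, while ``D`` is still created by ``type``, which shows that the old ``__metaclass__`` spelling is silently ignored rather than rejected.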
Other Language Changes
|
Other Language Changes
|
||||||
|
|