mirror of
https://github.com/python/cpython.git
synced 2025-07-24 11:44:31 +00:00
Commit #1068: new docs for PEP 3101. Also document the old string formatting as "old", and begin documenting str/unicode unification.
parent
20594ccf07
commit
4b49131f2b
9 changed files with 513 additions and 389 deletions
|
@ -12,8 +12,8 @@ numbers representations in 100% pure Python.
|
|||
|
||||
.. note::
|
||||
|
||||
This module is unnecessary: everything here can be done using the ``%`` string
|
||||
interpolation operator described in the :ref:`string-formatting` section.
|
||||
This module is unnecessary: everything here can be done using the string
|
||||
formatting functions described in the :ref:`string-formatting` section.
|
||||
|
||||
The :mod:`fpformat` module defines the following functions and an exception:
|
||||
|
||||
|
|
|
@ -449,6 +449,22 @@ available. They are listed here in alphabetical order.
|
|||
|
||||
The float type is described in :ref:`typesnumeric`.
|
||||
|
||||
.. function:: format(value[, format_spec])
|
||||
|
||||
.. index::
|
||||
pair: str; format
|
||||
single: __format__
|
||||
|
||||
Convert a string or a number to a "formatted" representation, as controlled
|
||||
by *format_spec*. The interpretation of *format_spec* will depend on the
|
||||
type of the *value* argument; however, there is a standard formatting syntax
|
||||
that is used by most built-in types: :ref:`formatspec`.
|
||||
|
||||
.. note::
|
||||
|
||||
``format(value, format_spec)`` merely calls ``value.__format__(format_spec)``.
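A minimal sketch of the delegation described in the note. The ``Celsius`` class is hypothetical and exists only for illustration; it shows that the *format_spec* passed to :func:`format` arrives unchanged in ``__format__``.

```python
class Celsius:
    """Hypothetical value type used only for illustration."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, format_spec):
        # Delegate the numeric part to float's own formatting, then add a unit.
        return format(self.degrees, format_spec) + " C"


temperature = Celsius(21.456)
short = format(temperature, ".1f")      # dispatches to Celsius.__format__
direct = temperature.__format__(".1f")  # the equivalent direct call
print(short)
```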
|
||||
|
||||
|
||||
.. function:: frozenset([iterable])
|
||||
:noindex:
|
||||
|
||||
|
@ -990,10 +1006,9 @@ available. They are listed here in alphabetical order.
|
|||
|
||||
For more information on strings see :ref:`typesseq` which describes sequence
|
||||
functionality (strings are sequences), and also the string-specific methods
|
||||
described in the :ref:`string-methods` section. To output formatted strings
|
||||
use template strings or the ``%`` operator described in the
|
||||
:ref:`string-formatting` section. In addition see the :ref:`stringservices`
|
||||
section. See also :func:`unicode`.
|
||||
described in the :ref:`string-methods` section. To output formatted strings,
|
||||
see the :ref:`string-formatting` section. In addition see the
|
||||
:ref:`stringservices` section.
|
||||
|
||||
|
||||
.. function:: sum(iterable[, start])
|
||||
|
|
|
@ -611,8 +611,10 @@ This time, all messages with a severity of DEBUG or above were handled, and the
|
|||
format of the messages was also changed, and output went to the specified file
|
||||
rather than the console.
|
||||
|
||||
Formatting uses standard Python string formatting - see section
|
||||
:ref:`string-formatting`. The format string takes the following common
|
||||
.. XXX logging should probably be updated!
|
||||
|
||||
Formatting uses the old Python string formatting - see section
|
||||
:ref:`old-string-formatting`. The format string takes the following common
|
||||
specifiers. For a complete list of specifiers, consult the :class:`Formatter`
|
||||
documentation.
|
||||
|
||||
|
@ -1483,7 +1485,7 @@ A Formatter can be initialized with a format string which makes use of knowledge
|
|||
of the :class:`LogRecord` attributes - such as the default value mentioned above
|
||||
making use of the fact that the user's message and arguments are pre-formatted
|
||||
into a :class:`LogRecord`'s *message* attribute. This format string contains
|
||||
standard python %-style mapping keys. See section :ref:`string-formatting`
|
||||
standard python %-style mapping keys. See section :ref:`old-string-formatting`
|
||||
for more information on string formatting.
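As a rough illustration of %-style mapping keys in a :class:`Formatter` format string, the sketch below logs to an in-memory stream. The logger name ``format_demo`` and the message are assumptions made for the example.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
# The format string uses %-style mapping keys naming LogRecord attributes.
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

logger = logging.getLogger("format_demo")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False                   # keep the root logger out of the demo
logger.addHandler(handler)

# The message and its arguments are merged with the % operator as well.
logger.warning("disk %s is %d%% full", "sda1", 93)
output = stream.getvalue()
print(output, end="")
```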
|
||||
|
||||
Currently, the useful mapping keys in a :class:`LogRecord` are:
|
||||
|
|
|
@ -480,19 +480,18 @@ object) supplying the :meth:`__iter__` and :meth:`__next__` methods.
|
|||
|
||||
.. _typesseq:
|
||||
|
||||
Sequence Types --- :class:`str`, :class:`unicode`, :class:`list`, :class:`tuple`, :class:`buffer`, :class:`range`
|
||||
=================================================================================================================
|
||||
Sequence Types --- :class:`str`, :class:`bytes`, :class:`list`, :class:`tuple`, :class:`buffer`, :class:`range`
|
||||
===============================================================================================================
|
||||
|
||||
There are six sequence types: strings, Unicode strings, lists, tuples, buffers,
|
||||
and range objects.
|
||||
(For other containers see the built in :class:`dict`, :class:`list`,
|
||||
:class:`set`, and :class:`tuple` classes, and the :mod:`collections`
|
||||
module.)
|
||||
|
||||
There are six sequence types: strings, byte sequences, lists, tuples, buffers,
|
||||
and range objects. (For other containers see the built in :class:`dict`,
|
||||
:class:`list`, :class:`set`, and :class:`tuple` classes, and the
|
||||
:mod:`collections` module.)
|
||||
|
||||
.. index::
|
||||
object: sequence
|
||||
object: string
|
||||
object: bytes
|
||||
object: tuple
|
||||
object: list
|
||||
object: buffer
|
||||
|
@ -501,21 +500,32 @@ module.)
|
|||
String literals are written in single or double quotes: ``'xyzzy'``,
|
||||
``"frobozz"``. See :ref:`strings` for more about string literals. In addition
|
||||
to the functionality described here, there are also string-specific methods
|
||||
described in the :ref:`string-methods` section. Lists are constructed with
|
||||
square brackets, separating items with commas: ``[a, b, c]``. Tuples are
|
||||
constructed by the comma operator (not within square brackets), with or without
|
||||
enclosing parentheses, but an empty tuple must have the enclosing parentheses,
|
||||
such as ``a, b, c`` or ``()``. A single item tuple must have a trailing comma,
|
||||
such as ``(d,)``.
|
||||
described in the :ref:`string-methods` section. Bytes objects can be
|
||||
constructed from literals too; use a ``b`` prefix with normal string syntax:
|
||||
``b'xyzzy'``.
|
||||
|
||||
.. caveat::
|
||||
|
||||
While string objects are sequences of characters (represented by strings of
|
||||
length 1), bytes objects are sequences of *integers* (between 0 and 255),
|
||||
representing the ASCII value of single bytes. That means that for a bytes
|
||||
object *b*, ``b[0]`` will be an integer, while ``b[0:1]`` will be a bytes
|
||||
object of length 1.
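The difference between indexing and slicing a bytes object can be checked with a short sketch (the variable names are illustrative only):

```python
data = b"xyzzy"

first_item = data[0]     # indexing yields an integer in range(256)
first_slice = data[0:1]  # slicing yields a bytes object of length 1

print(first_item, first_slice)
```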
|
||||
|
||||
Lists are constructed with square brackets, separating items with commas: ``[a,
|
||||
b, c]``. Tuples are constructed by the comma operator (not within square
|
||||
brackets), with or without enclosing parentheses, but an empty tuple must have
|
||||
the enclosing parentheses, such as ``a, b, c`` or ``()``. A single item tuple
|
||||
must have a trailing comma, such as ``(d,)``.
|
||||
|
||||
Buffer objects are not directly supported by Python syntax, but can be created
|
||||
by calling the builtin function :func:`buffer`. They don't support
|
||||
concatenation or repetition.
|
||||
|
||||
Objects of type range are similar to buffers in that there is no specific syntax to
|
||||
create them, but they are created using the :func:`range` function. They don't
|
||||
support slicing, concatenation or repetition, and using ``in``, ``not in``,
|
||||
:func:`min` or :func:`max` on them is inefficient.
|
||||
Objects of type range are similar to buffers in that there is no specific syntax
|
||||
to create them, but they are created using the :func:`range` function. They
|
||||
don't support slicing, concatenation or repetition, and using ``in``, ``not
|
||||
in``, :func:`min` or :func:`max` on them is inefficient.
|
||||
|
||||
Most sequence types support the following operations. The ``in`` and ``not in``
|
||||
operations have the same priorities as the comparison operations. The ``+`` and
|
||||
|
@ -555,12 +565,11 @@ are sequences of the same type; *n*, *i* and *j* are integers:
|
|||
| ``max(s)`` | largest item of *s* | |
|
||||
+------------------+--------------------------------+----------+
|
||||
|
||||
Sequence types also support comparisons. In particular, tuples and lists
|
||||
are compared lexicographically by comparing corresponding
|
||||
elements. This means that to compare equal, every element must compare
|
||||
equal and the two sequences must be of the same type and have the same
|
||||
length. (For full details see :ref:`comparisons` in the language
|
||||
reference.)
|
||||
Sequence types also support comparisons. In particular, tuples and lists are
|
||||
compared lexicographically by comparing corresponding elements. This means that
|
||||
to compare equal, every element must compare equal and the two sequences must be
|
||||
of the same type and have the same length. (For full details see
|
||||
:ref:`comparisons` in the language reference.)
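A small sketch of the comparison rules described above, using arbitrary sample values:

```python
by_element = (1, 2, 3) < (1, 2, 4)  # decided by the first differing element
by_length = [1, 2] < [1, 2, 0]      # a proper prefix sorts before the longer list
same_type = [1, 2] == [1, 2]        # same type, same length, equal elements
cross_type = [1, 2] == (1, 2)       # a list never compares equal to a tuple

print(by_element, by_length, same_type, cross_type)
```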
|
||||
|
||||
.. index::
|
||||
triple: operations on; sequence; types
|
||||
|
@ -578,10 +587,8 @@ reference.)
|
|||
Notes:
|
||||
|
||||
(1)
|
||||
When *s* is a string or Unicode string object the ``in`` and ``not in``
|
||||
operations act like a substring test. In Python versions before 2.3, *x* had to
|
||||
be a string of length 1. In Python 2.3 and beyond, *x* may be a string of any
|
||||
length.
|
||||
When *s* is a string object, the ``in`` and ``not in`` operations act like a
|
||||
substring test.
|
||||
|
||||
(2)
|
||||
Values of *n* less than ``0`` are treated as ``0`` (which yields an empty
|
||||
|
@ -642,6 +649,8 @@ Notes:
|
|||
Formerly, string concatenation never occurred in-place.
|
||||
|
||||
|
||||
.. XXX add bytes methods
|
||||
|
||||
.. _string-methods:
|
||||
|
||||
String Methods
|
||||
|
@ -649,19 +658,15 @@ String Methods
|
|||
|
||||
.. index:: pair: string; methods
|
||||
|
||||
Below are listed the string methods which both 8-bit strings and Unicode objects
|
||||
support. In addition, Python's strings support the sequence type methods
|
||||
described in the :ref:`typesseq` section. To output formatted strings
|
||||
use template strings or the ``%`` operator described in the
|
||||
:ref:`string-formatting` section. Also, see the :mod:`re` module for
|
||||
string functions based on regular expressions.
|
||||
String objects support the methods listed below. In addition, Python's strings
|
||||
support the sequence type methods described in the :ref:`typesseq` section. To
|
||||
output formatted strings, see the :ref:`string-formatting` section. Also, see
|
||||
the :mod:`re` module for string functions based on regular expressions.
|
||||
|
||||
.. method:: str.capitalize()
|
||||
|
||||
Return a copy of the string with only its first character capitalized.
|
||||
|
||||
For 8-bit strings, this method is locale-dependent.
|
||||
|
||||
|
||||
.. method:: str.center(width[, fillchar])
|
||||
|
||||
|
@ -679,6 +684,7 @@ string functions based on regular expressions.
|
|||
slice notation.
|
||||
|
||||
|
||||
.. XXX what about str.decode???
|
||||
.. method:: str.decode([encoding[, errors]])
|
||||
|
||||
Decodes the string using the codec registered for *encoding*. *encoding*
|
||||
|
@ -737,6 +743,24 @@ string functions based on regular expressions.
|
|||
found.
|
||||
|
||||
|
||||
.. method:: str.format(format_string, *args, **kwargs)
|
||||
|
||||
Perform a string formatting operation. The *format_string* argument can
|
||||
contain literal text or replacement fields delimited by braces ``{}``. Each
|
||||
replacement field contains either the numeric index of a positional argument,
|
||||
or the name of a keyword argument. Returns a copy of *format_string* where
|
||||
each replacement field is replaced with the string value of the corresponding
|
||||
argument.
|
||||
|
||||
>>> "The sum of 1 + 2 is {0}".format(1+2)
|
||||
'The sum of 1 + 2 is 3'
|
||||
|
||||
See :ref:`formatstrings` for a description of the various formatting options
|
||||
that can be specified in format strings.
|
||||
|
||||
.. versionadded:: 3.0
|
||||
|
||||
|
||||
.. method:: str.index(sub[, start[, end]])
|
||||
|
||||
Like :meth:`find`, but raise :exc:`ValueError` when the substring is not found.
|
||||
|
@ -747,31 +771,23 @@ string functions based on regular expressions.
|
|||
Return true if all characters in the string are alphanumeric and there is at
|
||||
least one character, false otherwise.
|
||||
|
||||
For 8-bit strings, this method is locale-dependent.
|
||||
|
||||
|
||||
.. method:: str.isalpha()
|
||||
|
||||
Return true if all characters in the string are alphabetic and there is at least
|
||||
one character, false otherwise.
|
||||
|
||||
For 8-bit strings, this method is locale-dependent.
|
||||
|
||||
|
||||
.. method:: str.isdigit()
|
||||
|
||||
Return true if all characters in the string are digits and there is at least one
|
||||
character, false otherwise.
|
||||
|
||||
For 8-bit strings, this method is locale-dependent.
|
||||
|
||||
|
||||
.. method:: str.isidentifier()
|
||||
|
||||
Return true if the string is a valid identifier according to the language
|
||||
definition.
|
||||
|
||||
.. XXX link to the definition?
|
||||
definition, section :ref:`identifiers`.
|
||||
|
||||
|
||||
.. method:: str.islower()
|
||||
|
@ -779,16 +795,12 @@ string functions based on regular expressions.
|
|||
Return true if all cased characters in the string are lowercase and there is at
|
||||
least one cased character, false otherwise.
|
||||
|
||||
For 8-bit strings, this method is locale-dependent.
|
||||
|
||||
|
||||
.. method:: str.isspace()
|
||||
|
||||
Return true if there are only whitespace characters in the string and there is
|
||||
at least one character, false otherwise.
|
||||
|
||||
For 8-bit strings, this method is locale-dependent.
|
||||
|
||||
|
||||
.. method:: str.istitle()
|
||||
|
||||
|
@ -796,16 +808,12 @@ string functions based on regular expressions.
|
|||
character, for example uppercase characters may only follow uncased characters
|
||||
and lowercase characters only cased ones. Return false otherwise.
|
||||
|
||||
For 8-bit strings, this method is locale-dependent.
|
||||
|
||||
|
||||
.. method:: str.isupper()
|
||||
|
||||
Return true if all cased characters in the string are uppercase and there is at
|
||||
least one cased character, false otherwise.
|
||||
|
||||
For 8-bit strings, this method is locale-dependent.
|
||||
|
||||
|
||||
.. method:: str.join(seq)
|
||||
|
||||
|
@ -827,8 +835,6 @@ string functions based on regular expressions.
|
|||
|
||||
Return a copy of the string converted to lowercase.
|
||||
|
||||
For 8-bit strings, this method is locale-dependent.
|
||||
|
||||
|
||||
.. method:: str.lstrip([chars])
|
||||
|
||||
|
@ -984,50 +990,31 @@ string functions based on regular expressions.
|
|||
Return a copy of the string with uppercase characters converted to lowercase and
|
||||
vice versa.
|
||||
|
||||
For 8-bit strings, this method is locale-dependent.
|
||||
|
||||
|
||||
.. method:: str.title()
|
||||
|
||||
Return a titlecased version of the string: words start with uppercase
|
||||
characters, all remaining cased characters are lowercase.
|
||||
|
||||
For 8-bit strings, this method is locale-dependent.
|
||||
|
||||
.. method:: str.translate(map)
|
||||
|
||||
.. method:: str.translate(table[, deletechars])
|
||||
Return a copy of the string where all characters have been mapped through the
|
||||
*map* which must be a mapping of Unicode ordinals (integers) to Unicode
|
||||
ordinals, strings or ``None``. Unmapped characters are left
|
||||
untouched. Characters mapped to ``None`` are deleted.
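A minimal sketch of the mapping form of ``translate``, assuming a table built by hand from ordinals. Vowels map to ``None`` and are deleted, ``t`` maps to a replacement string, and every other character is left untouched.

```python
# Keys are Unicode ordinals; values are ordinals, strings or None.
table = {ord(ch): None for ch in "aeiou"}  # delete the vowels
table[ord("t")] = "T"                      # replace 't' with a string

result = "read this short text".translate(table)
print(result)
```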
|
||||
|
||||
Return a copy of the string where all characters occurring in the optional
|
||||
argument *deletechars* are removed, and the remaining characters have been
|
||||
mapped through the given translation table, which must be a string of length
|
||||
256.
|
||||
.. note::
|
||||
|
||||
You can use the :func:`maketrans` helper function in the :mod:`string` module to
|
||||
create a translation table. For string objects, set the *table* argument to
|
||||
``None`` for translations that only delete characters::
|
||||
|
||||
>>> 'read this short text'.translate(None, 'aeiou')
|
||||
'rd ths shrt txt'
|
||||
|
||||
.. versionadded:: 2.6
|
||||
Support for a ``None`` *table* argument.
|
||||
|
||||
For Unicode objects, the :meth:`translate` method does not accept the optional
|
||||
*deletechars* argument. Instead, it returns a copy of the *s* where all
|
||||
characters have been mapped through the given translation table which must be a
|
||||
mapping of Unicode ordinals to Unicode ordinals, Unicode strings or ``None``.
|
||||
Unmapped characters are left untouched. Characters mapped to ``None`` are
|
||||
deleted. Note, a more flexible approach is to create a custom character mapping
|
||||
codec using the :mod:`codecs` module (see :mod:`encodings.cp1251` for an
|
||||
example).
|
||||
A more flexible approach is to create a custom character mapping codec
|
||||
using the :mod:`codecs` module (see :mod:`encodings.cp1251` for an
|
||||
example).
|
||||
|
||||
|
||||
.. method:: str.upper()
|
||||
|
||||
Return a copy of the string converted to uppercase.
|
||||
|
||||
For 8-bit strings, this method is locale-dependent.
|
||||
|
||||
|
||||
.. method:: str.zfill(width)
|
||||
|
||||
|
@ -1037,10 +1024,10 @@ string functions based on regular expressions.
|
|||
.. versionadded:: 2.2.2
|
||||
|
||||
|
||||
.. _string-formatting:
|
||||
.. _old-string-formatting:
|
||||
|
||||
String Formatting Operations
|
||||
----------------------------
|
||||
Old String Formatting Operations
|
||||
--------------------------------
|
||||
|
||||
.. index::
|
||||
single: formatting, string (%)
|
||||
|
@ -1052,14 +1039,18 @@ String Formatting Operations
|
|||
single: % formatting
|
||||
single: % interpolation
|
||||
|
||||
String and Unicode objects have one unique built-in operation: the ``%``
|
||||
operator (modulo). This is also known as the string *formatting* or
|
||||
*interpolation* operator. Given ``format % values`` (where *format* is a string
|
||||
or Unicode object), ``%`` conversion specifications in *format* are replaced
|
||||
with zero or more elements of *values*. The effect is similar to using
|
||||
:cfunc:`sprintf` in the C language. If *format* is a Unicode object, or if any
|
||||
of the objects being converted using the ``%s`` conversion are Unicode objects,
|
||||
the result will also be a Unicode object.
|
||||
.. XXX better?
|
||||
|
||||
.. note::
|
||||
|
||||
The formatting operations described here are obsolete and may go away in future
|
||||
versions of Python. Use the new :ref:`string-formatting` in new code.
|
||||
|
||||
String objects have one unique built-in operation: the ``%`` operator (modulo).
|
||||
This is also known as the string *formatting* or *interpolation* operator.
|
||||
Given ``format % values`` (where *format* is a string), ``%`` conversion
|
||||
specifications in *format* are replaced with zero or more elements of *values*.
|
||||
The effect is similar to using :cfunc:`sprintf` in the C language.
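For illustration, a short sketch of the ``%`` operator with a tuple of values and with a mapping (the sample values are arbitrary):

```python
# Positional form: one tuple element per conversion specification.
line = "%s has %d items totalling %.2f" % ("cart", 3, 9.5)

# Mapping form: each specification names a key in the dictionary.
named = "%(name)s scored %(points)03d" % {"name": "Ada", "points": 7}

print(line)
print(named)
```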
|
||||
|
||||
If *format* requires a single argument, *values* may be a single non-tuple
|
||||
object. [#]_ Otherwise, *values* must be a tuple with exactly the number of
|
||||
|
@ -1164,7 +1155,7 @@ The conversion types are:
|
|||
| ``'r'`` | String (converts any python object using | \(5) |
|
||||
| | :func:`repr`). | |
|
||||
+------------+-----------------------------------------------------+-------+
|
||||
| ``'s'`` | String (converts any python object using | \(6) |
|
||||
| ``'s'`` | String (converts any python object using | |
|
||||
| | :func:`str`). | |
|
||||
+------------+-----------------------------------------------------+-------+
|
||||
| ``'%'`` | No argument is converted, results in a ``'%'`` | |
|
||||
|
@ -1203,9 +1194,6 @@ Notes:
|
|||
|
||||
The precision determines the maximal number of characters used.
|
||||
|
||||
(6)
|
||||
If the object or format provided is a :class:`unicode` string, the resulting
|
||||
string will also be :class:`unicode`.
|
||||
|
||||
The precision determines the maximal number of characters used.
|
||||
|
||||
|
@ -2019,6 +2007,7 @@ the particular object.
|
|||
on all file-like objects.
|
||||
|
||||
|
||||
.. XXX does this still apply?
|
||||
.. attribute:: file.encoding
|
||||
|
||||
The encoding that this file uses. When Unicode strings are written to a file,
|
||||
|
|
|
@ -8,15 +8,13 @@
|
|||
|
||||
.. index:: module: re
|
||||
|
||||
The :mod:`string` module contains a number of useful constants and
|
||||
classes, as well as some deprecated legacy functions that are also
|
||||
available as methods on strings. In addition, Python's built-in string
|
||||
classes support the sequence type methods described in the
|
||||
:ref:`typesseq` section, and also the string-specific methods described
|
||||
in the :ref:`string-methods` section. To output formatted strings use
|
||||
template strings or the ``%`` operator described in the
|
||||
:ref:`string-formatting` section. Also, see the :mod:`re` module for
|
||||
string functions based on regular expressions.
|
||||
The :mod:`string` module contains a number of useful constants and classes, as
|
||||
well as some deprecated legacy functions that are also available as methods on
|
||||
strings. In addition, Python's built-in string classes support the sequence type
|
||||
methods described in the :ref:`typesseq` section, and also the string-specific
|
||||
methods described in the :ref:`string-methods` section. To output formatted
|
||||
strings, see the :ref:`string-formatting` section. Also, see the :mod:`re`
|
||||
module for string functions based on regular expressions.
|
||||
|
||||
|
||||
String constants
|
||||
|
@ -78,6 +76,354 @@ The constants defined in this module are:
|
|||
vertical tab.
|
||||
|
||||
|
||||
.. _string-formatting:
|
||||
|
||||
String Formatting
|
||||
-----------------
|
||||
|
||||
Starting in Python 3.0, the built-in string class provides the ability to do
|
||||
complex variable substitutions and value formatting via the :meth:`str.format`
|
||||
method described in :pep:`3101`. The :class:`Formatter` class in the
|
||||
:mod:`string` module allows you to create and customize your own string
|
||||
formatting behaviors using the same implementation as the built-in
|
||||
:meth:`format` method.
|
||||
|
||||
.. class:: Formatter
|
||||
|
||||
The :class:`Formatter` class has the following public methods:
|
||||
|
||||
.. method:: format(format_string, *args, **kwargs)
|
||||
|
||||
:meth:`format` is the primary API method. It takes a format template
|
||||
string, and an arbitrary set of positional and keyword arguments.
|
||||
:meth:`format` is just a wrapper that calls :meth:`vformat`.
|
||||
|
||||
.. method:: vformat(format_string, args, kwargs)
|
||||
|
||||
This function does the actual work of formatting. It is exposed as a
|
||||
separate function for cases where you want to pass in a predefined
|
||||
dictionary of arguments, rather than unpacking and repacking the
|
||||
dictionary as individual arguments using the ``*args`` and ``**kwargs``
|
||||
syntax. :meth:`vformat` does the work of breaking up the format template
|
||||
string into character data and replacement fields. It calls the various
|
||||
methods described below.
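The relationship between :meth:`format` and :meth:`vformat` can be sketched as follows. The sample tuple and dictionary are assumptions for the example; both calls are expected to produce the same string.

```python
import string

formatter = string.Formatter()
positional = ("Lancelot",)
named = {"quest": "to seek the Holy Grail"}

# format() takes the arguments unpacked ...
via_format = formatter.format("{0}: {quest}", *positional, **named)
# ... while vformat() takes the tuple and the dictionary as they are.
via_vformat = formatter.vformat("{0}: {quest}", positional, named)

print(via_vformat)
```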
|
||||
|
||||
In addition, the :class:`Formatter` defines a number of methods that are
|
||||
intended to be replaced by subclasses:
|
||||
|
||||
.. method:: parse(format_string)
|
||||
|
||||
Loop over the format_string and return an iterable of tuples
|
||||
(*literal_text*, *field_name*, *format_spec*, *conversion*). This is used
|
||||
by :meth:`vformat` to break the string into either literal text, or
|
||||
replacement fields.
|
||||
|
||||
The values in the tuple conceptually represent a span of literal text
|
||||
followed by a single replacement field. If there is no literal text
|
||||
(which can happen if two replacement fields occur consecutively), then
|
||||
*literal_text* will be a zero-length string. If there is no replacement
|
||||
field, then the values of *field_name*, *format_spec* and *conversion*
|
||||
will be ``None``.
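A short sketch of what :meth:`parse` yields for a sample format string with two consecutive replacement fields followed by trailing literal text (the string itself is arbitrary):

```python
import string

fields = list(string.Formatter().parse("x{0}{name!r:>10}y"))

for literal_text, field_name, format_spec, conversion in fields:
    print(repr(literal_text), field_name, format_spec, conversion)
```

The second tuple has an empty *literal_text* because the two fields are adjacent, and the last tuple carries only the trailing literal text.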
|
||||
|
||||
.. method:: get_field(field_name, args, kwargs, used_args)
|
||||
|
||||
Given *field_name* as returned by :meth:`parse` (see above), convert it to
|
||||
an object to be formatted. The default version takes strings of the form
|
||||
defined in :pep:`3101`, such as "0[name]" or "label.title". It records
|
||||
which args have been used in *used_args*. *args* and *kwargs* are as
|
||||
passed in to :meth:`vformat`.
|
||||
|
||||
.. method:: get_value(key, args, kwargs)
|
||||
|
||||
Retrieve a given field value. The *key* argument will be either an
|
||||
integer or a string. If it is an integer, it represents the index of the
|
||||
positional argument in *args*; if it is a string, then it represents a
|
||||
named argument in *kwargs*.
|
||||
|
||||
The *args* parameter is set to the list of positional arguments to
|
||||
:meth:`vformat`, and the *kwargs* parameter is set to the dictionary of
|
||||
keyword arguments.
|
||||
|
||||
For compound field names, these functions are only called for the first
|
||||
component of the field name; subsequent components are handled through
|
||||
normal attribute and indexing operations.
|
||||
|
||||
So for example, the field expression '0.name' would cause
|
||||
:meth:`get_value` to be called with a *key* argument of 0. The ``name``
|
||||
attribute will be looked up after :meth:`get_value` returns by calling the
|
||||
built-in :func:`getattr` function.
|
||||
|
||||
If the index or keyword refers to an item that does not exist, then an
|
||||
:exc:`IndexError` or :exc:`KeyError` should be raised.
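As one possible use of overriding :meth:`get_value`, the sketch below substitutes a placeholder for missing keyword arguments instead of raising :exc:`KeyError`. The ``DefaultingFormatter`` class and its ``default`` parameter are hypothetical names introduced for the example.

```python
import string


class DefaultingFormatter(string.Formatter):
    """Hypothetical subclass: missing keyword arguments get a placeholder."""

    def __init__(self, default="<missing>"):
        super().__init__()
        self.default = default

    def get_value(self, key, args, kwargs):
        if isinstance(key, str):
            # Named field: fall back to the placeholder when absent.
            return kwargs.get(key, self.default)
        # Positional field: keep the default behaviour (may raise IndexError).
        return super().get_value(key, args, kwargs)


result = DefaultingFormatter().format("{0} seeks {grail}", "Arthur")
print(result)
```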
|
||||
|
||||
.. method:: check_unused_args(used_args, args, kwargs)
|
||||
|
||||
Implement checking for unused arguments if desired. The arguments to this
|
||||
function are the set of all argument keys that were actually referred to in
|
||||
the format string (integers for positional arguments, and strings for
|
||||
named arguments), and a reference to the *args* and *kwargs* that were
|
||||
passed to :meth:`vformat`. The set of unused args can be calculated from these
|
||||
parameters. :meth:`check_unused_args` is assumed to throw an exception if
|
||||
the check fails.
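A minimal sketch of a subclass that rejects unused arguments. ``StrictFormatter`` is a hypothetical name, and raising :exc:`ValueError` is a choice made for this example, not a requirement of the API.

```python
import string


class StrictFormatter(string.Formatter):
    """Hypothetical subclass that refuses arguments the template never uses."""

    def check_unused_args(self, used_args, args, kwargs):
        # used_args holds integers for positional and strings for named fields.
        unused = [index for index in range(len(args)) if index not in used_args]
        unused += [name for name in kwargs if name not in used_args]
        if unused:
            raise ValueError("unused format arguments: %r" % (unused,))


strict = StrictFormatter()
ok = strict.format("{0} and {name}", "spam", name="eggs")

try:
    strict.format("{0}", "spam", "extra")
    error = None
except ValueError as exc:
    error = str(exc)

print(ok)
print(error)
```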
|
||||
|
||||
.. method:: format_field(value, format_spec)
|
||||
|
||||
:meth:`format_field` simply calls the global :func:`format` built-in. The
|
||||
method is provided so that subclasses can override it.
|
||||
|
||||
.. method:: convert_field(value, conversion)
|
||||
|
||||
Converts the value (returned by :meth:`get_field`) given a conversion type
|
||||
(as in the tuple returned by the :meth:`parse` method.) The default
|
||||
version understands 'r' (repr) and 's' (str) conversion types.
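Overriding :meth:`convert_field` makes it possible to add conversion types of your own. The sketch below adds a hypothetical ``'u'`` conversion that upper-cases the value; the class name and the conversion letter are assumptions for illustration only.

```python
import string


class UpperFormatter(string.Formatter):
    """Hypothetical subclass adding a 'u' (upper-case) conversion."""

    def convert_field(self, value, conversion):
        if conversion == "u":
            return str(value).upper()
        # Fall back to the built-in handling of 'r', 's' and no conversion.
        return super().convert_field(value, conversion)


shout = UpperFormatter().format("{0!u} and {0!r}", "spam")
print(shout)
```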
|
||||
|
||||
.. versionadded:: 3.0
|
||||
|
||||
.. _formatstrings:
|
||||
|
||||
Format String Syntax
|
||||
--------------------
|
||||
|
||||
The :meth:`str.format` method and the :class:`Formatter` class share the same
|
||||
syntax for format strings (although in the case of :class:`Formatter`,
|
||||
subclasses can define their own format string syntax.)
|
||||
|
||||
Format strings contain "replacement fields" surrounded by curly braces ``{}``.
|
||||
Anything that is not contained in braces is considered literal text, which is
|
||||
copied unchanged to the output. If you need to include a brace character in the
|
||||
literal text, it can be escaped by doubling: ``{{`` and ``}}``.
|
||||
|
||||
The grammar for a replacement field is as follows:
|
||||
|
||||
.. productionlist:: sf
|
||||
replacement_field: "{" `field_name` ["!" `conversion`] [":" `format_spec`] "}"
|
||||
field_name: (`identifier` | `integer`) ("." `attribute_name` | "[" element_index "]")*
|
||||
attribute_name: `identifier`
|
||||
element_index: `integer`
|
||||
conversion: "r" | "s"
|
||||
format_spec: <described in the next section>
|
||||
|
||||
In less formal terms, the replacement field starts with a *field_name*, which
|
||||
can either be a number (for a positional argument), or an identifier (for
|
||||
keyword arguments). Following this is an optional *conversion* field, which is
|
||||
preceded by an exclamation point ``'!'``, and a *format_spec*, which is preceded
|
||||
by a colon ``':'``.
|
||||
|
||||
The *field_name* itself begins with either a number or a keyword. If it's a
|
||||
number, it refers to a positional argument, and if it's a keyword it refers to a
|
||||
named keyword argument. This can be followed by any number of index or
|
||||
attribute expressions. An expression of the form ``'.name'`` selects the named
|
||||
attribute using :func:`getattr`, while an expression of the form ``'[index]'``
|
||||
does an index lookup using :meth:`__getitem__`.
|
||||
|
||||
Some simple format string examples::
|
||||
|
||||
"First, thou shalt count to {0}" # References first positional argument
|
||||
"My quest is {name}" # References keyword argument 'name'
|
||||
"Weight in tons {0.weight}" # 'weight' attribute of first positional arg
|
||||
"Units destroyed: {players[0]}" # First element of keyword argument 'players'.
|
||||
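The format strings above can be exercised as in the following sketch. The argument values, including the ``rabbit`` object, are made up for the example.

```python
from types import SimpleNamespace

rabbit = SimpleNamespace(weight=0.5)  # hypothetical object with a 'weight' attribute

lines = [
    "First, thou shalt count to {0}".format(3),
    "My quest is {name}".format(name="to seek the Holy Grail"),
    "Weight in tons {0.weight}".format(rabbit),              # attribute lookup
    "Units destroyed: {players[0]}".format(players=[7, 2]),  # index lookup
]

for line in lines:
    print(line)
```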
|
||||
The *conversion* field causes a type coercion before formatting. Normally, the
|
||||
job of formatting a value is done by the :meth:`__format__` method of the value
|
||||
itself. However, in some cases it is desirable to force a type to be formatted
|
||||
as a string, overriding its own definition of formatting. By converting the
|
||||
value to a string before calling :meth:`__format__`, the normal formatting logic
|
||||
is bypassed.
|
||||
|
||||
Two conversion flags are currently supported: ``'!s'`` which calls :func:`str()`
|
||||
on the value, and ``'!r'`` which calls :func:`repr()`.
|
||||
|
||||
Some examples::
|
||||
|
||||
"Harold's a clever {0!s}" # Calls str() on the argument first
|
||||
"Bring out the holy {name!r}" # Calls repr() on the argument first
|
||||
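Applied to a sample argument (the value ``"grail"`` is arbitrary), the two conversion flags behave as sketched here:

```python
text = "grail"

as_str = "Harold's a clever {0!s}".format(text)             # str() applied first
as_repr = "Bring out the holy {name!r}".format(name=text)   # repr() applied first

print(as_str)
print(as_repr)
```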
|
||||
The *format_spec* field contains a specification of how the value should be
|
||||
presented, including such details as field width, alignment, padding, decimal
|
||||
precision and so on. Each value type can define its own "formatting
|
||||
mini-language" or interpretation of the *format_spec*.
|
||||
|
||||
Most built-in types support a common formatting mini-language, which is
|
||||
described in the next section.
|
||||
|
||||
A *format_spec* field can also include nested replacement fields within it.
|
||||
These nested replacement fields can contain only a field name; conversion flags
|
||||
and format specifications are not allowed. The replacement fields within the
|
||||
format_spec are substituted before the *format_spec* string is interpreted.
|
||||
This allows the formatting of a value to be dynamically specified.
|
||||
|
||||
For example, suppose you wanted to have a replacement field whose field width is
|
||||
determined by another variable::
|
||||
|
||||
"A man with two {0:{1}}".format("noses", 10)
|
||||
|
||||
This would first evaluate the inner replacement field, making the format string
|
||||
effectively::
|
||||
|
||||
"A man with two {0:10}"
|
||||
|
||||
Then the outer replacement field would be evaluated, producing::
|
||||
|
||||
"noses "
|
||||
|
||||
Which is substituted into the string, yielding::
|
||||
|
||||
"A man with two noses "
|
||||
|
||||
(The extra space is because we specified a field width of 10, and because left
|
||||
alignment is the default for strings.)
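The walk-through above can be checked with a short sketch that compares the nested form against the explicit width:

```python
padded = "A man with two {0:{1}}".format("noses", 10)  # width taken from argument 1
explicit = "A man with two {0:10}".format("noses")     # width written out directly

print(repr(padded))
```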
|
||||
|
||||
.. versionadded:: 3.0
|
||||
|
||||
.. _formatspec:
|
||||
|
||||
Format Specification Mini-Language
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
"Format specifications" are used within replacement fields contained within a
|
||||
format string to define how individual values are presented (see
|
||||
:ref:`formatstrings`.) They can also be passed directly to the builtin
|
||||
:func:`format` function. Each formattable type may define how the format
|
||||
specification is to be interpreted.
|
||||
|
||||
Most built-in types implement the following options for format specifications,
|
||||
although some of the formatting options are only supported by the numeric types.
|
||||
|
||||
A general convention is that an empty format string (``""``) produces the same
|
||||
result as if you had called :func:`str` on the value.
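A quick way to see this convention is to compare the two calls directly. The sample values below are arbitrary choices for illustration:

```python
# For each sample value, an empty format spec should match str().
samples = [42, 3.5, "text", None, [1, 2]]
matches = [format(value, "") == str(value) for value in samples]

print(matches)
```

Types that define their own ``__format__`` are free to break this convention, so treat it as a guideline rather than a guarantee.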
|
||||
|
||||
The general form of a *standard format specifier* is:
|
||||
|
||||
.. productionlist:: sf
|
||||
format_spec: [[`fill`]`align`][`sign`][0][`width`][.`precision`][`type`]
|
||||
fill: <a character other than '}'>
|
||||
align: "<" | ">" | "=" | "^"
|
||||
sign: "+" | "-" | " "
|
||||
width: `integer`
|
||||
precision: `integer`
|
||||
type: "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "x" | "X" | "%"
|
||||
|
||||
The *fill* character can be any character other than '}' (which signifies the
|
||||
end of the field). The presence of a fill character is signaled by the *next*
|
||||
character, which must be one of the alignment options. If the second character
|
||||
of *format_spec* is not a valid alignment option, then it is assumed that both
|
||||
the fill character and the alignment option are absent.
|
||||
|
||||
The meaning of the various alignment options is as follows:
|
||||
|
||||
+---------+----------------------------------------------------------+
|
||||
| Option | Meaning |
|
||||
+=========+==========================================================+
|
||||
| ``'<'`` | Forces the field to be left-aligned within the available |
|
||||
| | space (this is the default for strings). |
|
||||
+---------+----------------------------------------------------------+
|
||||
| ``'>'`` | Forces the field to be right-aligned within the |
|
||||
| | available space. |
|
||||
+---------+----------------------------------------------------------+
|
||||
| ``'='`` | Forces the padding to be placed after the sign (if any) |
|
||||
| | but before the digits. This is used for printing fields |
|
||||
| | in the form '+000000120'. This alignment option is only |
|
||||
| | valid for numeric types. |
|
||||
+---------+----------------------------------------------------------+
|
||||
| ``'^'`` | Forces the field to be centered within the available |
|
||||
| | space. |
|
||||
+---------+----------------------------------------------------------+
|
||||
|
||||
Note that unless a minimum field width is defined, the field width will always
|
||||
be the same size as the data to fill it, so that the alignment option has no
|
||||
meaning in this case.
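As a minimal sketch of the alignment options, the following calls pass a spec directly to :func:`format`. The widths and the ``*`` fill character are arbitrary choices:

```python
word = "abc"

left = format(word, "<7")        # pad on the right
right = format(word, ">7")       # pad on the left
centered = format(word, "*^7")   # fill with '*' on both sides
signed = format(120, "0=+10")    # padding goes between sign and digits

# Without a width the alignment option has nothing to do.
unpadded = format(word, ">")

print(repr(left), repr(right), repr(centered), repr(signed), repr(unpadded))
```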
|
||||
|
||||
The *sign* option is only valid for number types, and can be one of the
|
||||
following:
|
||||
|
||||
+---------+----------------------------------------------------------+
|
||||
| Option | Meaning |
|
||||
+=========+==========================================================+
|
||||
| ``'+'`` | indicates that a sign should be used for both |
|
||||
| | positive as well as negative numbers. |
|
||||
+---------+----------------------------------------------------------+
|
||||
| ``'-'`` | indicates that a sign should be used only for negative |
|
||||
| | numbers (this is the default behavior). |
|
||||
+---------+----------------------------------------------------------+
|
||||
| space | indicates that a leading space should be used on |
|
||||
| | positive numbers, and a minus sign on negative numbers. |
|
||||
+---------+----------------------------------------------------------+
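The three sign options can be compared side by side. This sketch formats one positive and one negative integer with each option:

```python
numbers = (5, -5)

plus = [format(n, "+") for n in numbers]    # sign on both
minus = [format(n, "-") for n in numbers]   # sign on negatives only (default)
space = [format(n, " ") for n in numbers]   # leading space on positives

print(plus, minus, space)
```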
|
||||
|
||||
*width* is a decimal integer defining the minimum field width. If not
|
||||
specified, then the field width will be determined by the content.
|
||||
|
||||
If the *width* field is preceded by a zero (``'0'``) character, this enables
|
||||
zero-padding. This is equivalent to an *alignment* type of ``'='`` and a *fill*
|
||||
character of ``'0'``.
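The equivalence between the ``0`` shorthand and an explicit fill and alignment can be sketched as follows, with an arbitrary width of 6:

```python
# Shorthand: a zero before the width enables zero-padding.
shorthand = format(-42, "06")

# Explicit form: fill character '0' with '=' alignment.
explicit = format(-42, "0=6")

print(shorthand, explicit)
```

In both cases the zeros are placed after the minus sign rather than before it.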
|
||||
|
||||
The *precision* is a decimal number indicating how many digits should be
|
||||
displayed after the decimal point for a floating point value. For non-number
|
||||
types the field indicates the maximum field size - in other words, how many
|
||||
characters will be used from the field content. The *precision* is not allowed
for integer presentation types.
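The two meanings of *precision* can be illustrated with one float and one string. The values are arbitrary:

```python
# For floats, precision is the number of digits after the decimal point.
rounded = format(3.14159, ".2f")

# For strings, precision is the maximum number of characters taken.
truncated = format("formatting", ".6")

# Width and precision combine: truncate to 3 characters, then pad to 8.
padded = format("formatting", "8.3")

print(repr(rounded), repr(truncated), repr(padded))
```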
|
||||
|
||||
Finally, the *type* determines how the data should be presented.
|
||||
|
||||
The available integer presentation types are:
|
||||
|
||||
+---------+----------------------------------------------------------+
|
||||
| Type | Meaning |
|
||||
+=========+==========================================================+
|
||||
| ``'b'`` | Binary. Outputs the number in base 2. |
|
||||
+---------+----------------------------------------------------------+
|
||||
| ``'c'`` | Character. Converts the integer to the corresponding |
|
||||
| | Unicode character before printing. |
|
||||
+---------+----------------------------------------------------------+
|
||||
| ``'d'`` | Decimal Integer. Outputs the number in base 10. |
|
||||
+---------+----------------------------------------------------------+
|
||||
| ``'o'`` | Octal format. Outputs the number in base 8. |
|
||||
+---------+----------------------------------------------------------+
|
||||
| ``'x'`` | Hex format. Outputs the number in base 16, using lower- |
|
||||
| | case letters for the digits above 9. |
|
||||
+---------+----------------------------------------------------------+
|
||||
| ``'X'`` | Hex format. Outputs the number in base 16, using upper- |
|
||||
| | case letters for the digits above 9. |
|
||||
+---------+----------------------------------------------------------+
|
||||
| None | the same as ``'d'`` |
|
||||
+---------+----------------------------------------------------------+
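A short sketch of the integer presentation types, using 255 and 65 as arbitrary sample values:

```python
number = 255

# Format the same integer with each base-related type code.
by_type = {code: format(number, code) for code in "bdoxX"}

# 'c' converts the integer to the character with that code point.
as_char = format(65, "c")

# With no type code, integers are formatted as with 'd'.
default = format(number, "")

print(by_type, as_char, default)
```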
|
||||
|
||||
The available presentation types for floating point and decimal values are:
|
||||
|
||||
+---------+----------------------------------------------------------+
|
||||
| Type | Meaning |
|
||||
+=========+==========================================================+
|
||||
| ``'e'`` | Exponent notation. Prints the number in scientific |
|
||||
| | notation using the letter 'e' to indicate the exponent. |
|
||||
+---------+----------------------------------------------------------+
|
||||
| ``'E'`` | Exponent notation. Same as ``'e'`` except it uses an |
|
||||
| | upper case 'E' as the separator character. |
|
||||
+---------+----------------------------------------------------------+
|
||||
| ``'f'`` | Fixed point. Displays the number as a fixed-point |
|
||||
| | number. |
|
||||
+---------+----------------------------------------------------------+
|
||||
| ``'F'`` | Fixed point. Same as ``'f'``. |
|
||||
+---------+----------------------------------------------------------+
|
||||
| ``'g'`` | General format. This prints the number as a fixed-point |
|
||||
| | number, unless the number is too large, in which case |
|
||||
| | it switches to ``'e'`` exponent notation. |
|
||||
+---------+----------------------------------------------------------+
|
||||
| ``'G'`` | General format. Same as ``'g'`` except switches to |
|
||||
| | ``'E'`` if the number gets too large. |
|
||||
+---------+----------------------------------------------------------+
|
||||
| ``'n'`` | Number. This is the same as ``'g'``, except that it uses |
|
||||
| | the current locale setting to insert the appropriate |
|
||||
| | number separator characters. |
|
||||
+---------+----------------------------------------------------------+
|
||||
| ``'%'`` | Percentage. Multiplies the number by 100 and displays |
|
||||
| | in fixed (``'f'``) format, followed by a percent sign. |
|
||||
+---------+----------------------------------------------------------+
|
||||
| None | similar to ``'g'``, except that it prints at least one |
|
||||
| | digit after the decimal point. |
|
||||
+---------+----------------------------------------------------------+
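The floating point presentation types can be compared on a single value. This sketch leaves out ``'n'``, whose output depends on the current locale:

```python
value = 1234.5

exponent = format(value, "e")       # scientific notation
exponent_upper = format(value, "E")
fixed = format(value, "f")          # fixed point, default precision 6
general = format(value, "g")        # fixed point for moderate magnitudes
general_large = format(1e20, "g")   # switches to exponent notation
percent = format(0.25, ".1%")       # multiplied by 100, then '%'
default = format(1.0, "")           # at least one digit after the point

print(exponent, exponent_upper, fixed, general, general_large, percent, default)
```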
|
||||
|
||||
|
||||
.. _template-strings:
|
||||
|
||||
Template strings
|
||||
----------------
|
||||
|
||||
|
@ -208,6 +554,7 @@ They are not available as string methods.
|
|||
leading and trailing whitespace.
|
||||
|
||||
|
||||
.. XXX is obsolete with unicode.translate
|
||||
.. function:: maketrans(from, to)
|
||||
|
||||
Return a translation table suitable for passing to :func:`translate`, that will
|
||||
|
@ -219,250 +566,3 @@ They are not available as string methods.
|
|||
Don't use strings derived from :const:`lowercase` and :const:`uppercase` as
|
||||
arguments; in some locales, these don't have the same length. For case
|
||||
conversions, always use :func:`lower` and :func:`upper`.
|
||||
|
||||
|
||||
Deprecated string functions
|
||||
---------------------------
|
||||
|
||||
The following functions are also defined as methods of string and
|
||||
Unicode objects; see section :ref:`string-methods` for more information on
|
||||
those. You should consider these functions as deprecated, although they will
|
||||
not be removed until Python 3.0. The functions defined in this module are:
|
||||
|
||||
|
||||
.. function:: atof(s)
|
||||
|
||||
.. deprecated:: 2.0
|
||||
Use the :func:`float` built-in function.
|
||||
|
||||
.. index:: builtin: float
|
||||
|
||||
Convert a string to a floating point number. The string must have the standard
|
||||
syntax for a floating point literal in Python, optionally preceded by a sign
|
||||
(``+`` or ``-``). Note that this behaves identically to the built-in function
|
||||
:func:`float` when passed a string.
|
||||
|
||||
.. note::
|
||||
|
||||
.. index::
|
||||
single: NaN
|
||||
single: Infinity
|
||||
|
||||
When passing in a string, values for NaN and Infinity may be returned, depending
|
||||
on the underlying C library. The specific set of strings accepted which cause
|
||||
these values to be returned depends entirely on the C library and is known to
|
||||
vary.
|
||||
|
||||
|
||||
.. function:: atoi(s[, base])
|
||||
|
||||
.. deprecated:: 2.0
|
||||
Use the :func:`int` built-in function.
|
||||
|
||||
.. index:: builtin: eval
|
||||
|
||||
Convert string *s* to an integer in the given *base*. The string must consist
|
||||
of one or more digits, optionally preceded by a sign (``+`` or ``-``). The
|
||||
*base* defaults to 10. If it is 0, a default base is chosen depending on the
|
||||
leading characters of the string (after stripping the sign): ``0x`` or ``0X``
|
||||
means 16, ``0`` means 8, anything else means 10. If *base* is 16, a leading
|
||||
``0x`` or ``0X`` is always accepted, though not required. This behaves
|
||||
identically to the built-in function :func:`int` when passed a string. (Also
|
||||
note: for a more flexible interpretation of numeric literals, use the built-in
|
||||
function :func:`eval`.)
|
||||
|
||||
|
||||
.. function:: atol(s[, base])
|
||||
|
||||
.. deprecated:: 2.0
|
||||
Use the :func:`long` built-in function.
|
||||
|
||||
.. index:: builtin: long
|
||||
|
||||
Convert string *s* to a long integer in the given *base*. The string must
|
||||
consist of one or more digits, optionally preceded by a sign (``+`` or ``-``).
|
||||
The *base* argument has the same meaning as for :func:`atoi`. A trailing ``l``
|
||||
or ``L`` is not allowed, except if the base is 0. Note that when invoked
|
||||
without *base* or with *base* set to 10, this behaves identically to the built-in
|
||||
function :func:`long` when passed a string.
|
||||
|
||||
|
||||
.. function:: capitalize(word)
|
||||
|
||||
Return a copy of *word* with only its first character capitalized.
|
||||
|
||||
|
||||
.. function:: expandtabs(s[, tabsize])
|
||||
|
||||
Expand tabs in a string replacing them by one or more spaces, depending on the
|
||||
current column and the given tab size. The column number is reset to zero after
|
||||
each newline occurring in the string. This doesn't understand other non-printing
|
||||
characters or escape sequences. The tab size defaults to 8.
|
||||
|
||||
|
||||
.. function:: find(s, sub[, start[,end]])
|
||||
|
||||
Return the lowest index in *s* where the substring *sub* is found such that
|
||||
*sub* is wholly contained in ``s[start:end]``. Return ``-1`` on failure.
|
||||
Defaults for *start* and *end* and interpretation of negative values are the same
|
||||
as for slices.
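The search behavior can be sketched with the equivalent string methods, on the assumption that they mirror the module-level functions described here:

```python
text = "hello world"

first = text.find("o")            # lowest index
last = text.rfind("o")            # highest index
windowed = text.find("o", 5, 8)   # search only within text[5:8]
missing = text.find("z")          # -1 signals failure

# index() is like find() but raises ValueError on failure.
try:
    text.index("z")
    raised = False
except ValueError:
    raised = True

print(first, last, windowed, missing, raised)
```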
|
||||
|
||||
|
||||
.. function:: rfind(s, sub[, start[, end]])
|
||||
|
||||
Like :func:`find` but find the highest index.
|
||||
|
||||
|
||||
.. function:: index(s, sub[, start[, end]])
|
||||
|
||||
Like :func:`find` but raise :exc:`ValueError` when the substring is not found.
|
||||
|
||||
|
||||
.. function:: rindex(s, sub[, start[, end]])
|
||||
|
||||
Like :func:`rfind` but raise :exc:`ValueError` when the substring is not found.
|
||||
|
||||
|
||||
.. function:: count(s, sub[, start[, end]])
|
||||
|
||||
Return the number of (non-overlapping) occurrences of substring *sub* in string
|
||||
``s[start:end]``. Defaults for *start* and *end* and interpretation of negative
|
||||
values are the same as for slices.
|
||||
|
||||
|
||||
.. function:: lower(s)
|
||||
|
||||
Return a copy of *s*, but with upper case letters converted to lower case.
|
||||
|
||||
|
||||
.. function:: split(s[, sep[, maxsplit]])
|
||||
|
||||
Return a list of the words of the string *s*. If the optional second argument
|
||||
*sep* is absent or ``None``, the words are separated by arbitrary strings of
|
||||
whitespace characters (space, tab, newline, return, formfeed). If the second
|
||||
argument *sep* is present and not ``None``, it specifies a string to be used as
|
||||
the word separator. The returned list will then have one more item than the
|
||||
number of non-overlapping occurrences of the separator in the string. The
|
||||
optional third argument *maxsplit* defaults to 0. If it is nonzero, at most
|
||||
*maxsplit* number of splits occur, and the remainder of the string is returned
|
||||
as the final element of the list (thus, the list will have at most
|
||||
``maxsplit+1`` elements).
|
||||
|
||||
The behavior of split on an empty string depends on the value of *sep*. If *sep*
|
||||
is not specified, or specified as ``None``, the result will be an empty list.
|
||||
If *sep* is specified as any string, the result will be a list containing one
|
||||
element which is an empty string.
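The empty-string cases are easy to confuse, so here is a sketch using the ``split`` string method, which is assumed to behave like the function described above:

```python
# No separator: runs of whitespace separate words, empty input gives [].
no_sep_empty = "".split()
no_sep_words = "a b  c".split()

# Explicit separator: empty input gives one empty string.
with_sep_empty = "".split(",")
with_sep_fields = "a,b,,c".split(",")

# Limiting the number of splits leaves the remainder in the last element.
limited = "a,b,c".split(",", 1)

print(no_sep_empty, no_sep_words, with_sep_empty, with_sep_fields, limited)
```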
|
||||
|
||||
|
||||
.. function:: rsplit(s[, sep[, maxsplit]])
|
||||
|
||||
Return a list of the words of the string *s*, scanning *s* from the end. To all
|
||||
intents and purposes, the resulting list of words is the same as returned by
|
||||
:func:`split`, except when the optional third argument *maxsplit* is explicitly
|
||||
specified and nonzero. When *maxsplit* is nonzero, at most *maxsplit* number of
|
||||
splits -- the *rightmost* ones -- occur, and the remainder of the string is
|
||||
returned as the first element of the list (thus, the list will have at most
|
||||
``maxsplit+1`` elements).
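The difference from :func:`split` only shows up when the number of splits is limited. A sketch using the string methods, assumed to match the functions described here:

```python
path = "a,b,c"

from_left = path.split(",", 1)    # split at the leftmost separator
from_right = path.rsplit(",", 1)  # split at the rightmost separator

# Without a limit both give the same list.
unlimited_same = path.split(",") == path.rsplit(",")

print(from_left, from_right, unlimited_same)
```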
|
||||
|
||||
.. versionadded:: 2.4
|
||||
|
||||
|
||||
.. function:: splitfields(s[, sep[, maxsplit]])
|
||||
|
||||
This function behaves identically to :func:`split`. (In the past, :func:`split`
|
||||
was only used with one argument, while :func:`splitfields` was only used with
|
||||
two arguments.)
|
||||
|
||||
|
||||
.. function:: join(words[, sep])
|
||||
|
||||
Concatenate a list or tuple of words with intervening occurrences of *sep*.
|
||||
The default value for *sep* is a single space character. It is always true that
|
||||
``string.join(string.split(s, sep), sep)`` equals *s*.
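The round-trip identity can be sketched with the string methods, which are assumed to correspond to the functions above. It relies on an explicit separator, since whitespace splitting collapses runs of spaces:

```python
s = "a,b,,c"
sep = ","

# Splitting and re-joining on the same explicit separator restores s.
round_trip = sep.join(s.split(sep))

# With whitespace splitting the original spacing is not recoverable.
spaced = "a  b"
collapsed = " ".join(spaced.split())

print(repr(round_trip), repr(collapsed))
```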
|
||||
|
||||
|
||||
.. function:: joinfields(words[, sep])
|
||||
|
||||
This function behaves identically to :func:`join`. (In the past, :func:`join`
|
||||
was only used with one argument, while :func:`joinfields` was only used with two
|
||||
arguments.) Note that there is no :meth:`joinfields` method on string objects;
|
||||
use the :meth:`join` method instead.
|
||||
|
||||
|
||||
.. function:: lstrip(s[, chars])
|
||||
|
||||
Return a copy of the string with leading characters removed. If *chars* is
|
||||
omitted or ``None``, whitespace characters are removed. If given and not
|
||||
``None``, *chars* must be a string; the characters in the string will be
|
||||
stripped from the beginning of the string this method is called on.
|
||||
|
||||
.. versionchanged:: 2.2.3
|
||||
The *chars* parameter was added. The *chars* parameter cannot be passed in
|
||||
earlier 2.2 versions.
|
||||
|
||||
|
||||
.. function:: rstrip(s[, chars])
|
||||
|
||||
Return a copy of the string with trailing characters removed. If *chars* is
|
||||
omitted or ``None``, whitespace characters are removed. If given and not
|
||||
``None``, *chars* must be a string; the characters in the string will be
|
||||
stripped from the end of the string this method is called on.
|
||||
|
||||
.. versionchanged:: 2.2.3
|
||||
The *chars* parameter was added. The *chars* parameter cannot be passed in
|
||||
earlier 2.2 versions.
|
||||
|
||||
|
||||
.. function:: strip(s[, chars])
|
||||
|
||||
Return a copy of the string with leading and trailing characters removed. If
|
||||
*chars* is omitted or ``None``, whitespace characters are removed. If given and
|
||||
not ``None``, *chars* must be a string; the characters in the string will be
|
||||
stripped from both ends of the string this method is called on.
|
||||
|
||||
.. versionchanged:: 2.2.3
|
||||
The *chars* parameter was added. The *chars* parameter cannot be passed in
|
||||
earlier 2.2 versions.
|
||||
|
||||
|
||||
.. function:: swapcase(s)
|
||||
|
||||
Return a copy of *s*, but with lower case letters converted to upper case and
|
||||
vice versa.
|
||||
|
||||
|
||||
.. function:: translate(s, table[, deletechars])
|
||||
|
||||
Delete all characters from *s* that are in *deletechars* (if present), and then
|
||||
translate the characters using *table*, which must be a 256-character string
|
||||
giving the translation for each character value, indexed by its ordinal. If
|
||||
*table* is ``None``, then only the character deletion step is performed.
|
||||
|
||||
|
||||
.. function:: upper(s)
|
||||
|
||||
Return a copy of *s*, but with lower case letters converted to upper case.
|
||||
|
||||
|
||||
.. function:: ljust(s, width)
|
||||
rjust(s, width)
|
||||
center(s, width)
|
||||
|
||||
These functions respectively left-justify, right-justify and center a string in
|
||||
a field of given width. They return a string that is at least *width*
|
||||
characters wide, created by padding the string *s* with spaces until the given
|
||||
width on the right, left or both sides. The string is never truncated.
|
||||
|
||||
|
||||
.. function:: zfill(s, width)
|
||||
|
||||
Pad a numeric string on the left with zero digits until the given width is
|
||||
reached. Strings starting with a sign are handled correctly.
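Sign handling is the part worth checking. This sketch uses the ``zfill`` string method, assumed to behave like the function above:

```python
plain = "42".zfill(5)
negative = "-42".zfill(5)     # zeros go after the sign
positive = "+3.5".zfill(6)
too_long = "12345".zfill(3)   # never truncated

print(plain, negative, positive, too_long)
```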
|
||||
|
||||
|
||||
.. function:: replace(str, old, new[, maxreplace])
|
||||
|
||||
Return a copy of string *str* with all occurrences of substring *old* replaced
|
||||
by *new*. If the optional argument *maxreplace* is given, the first
|
||||
*maxreplace* occurrences are replaced.
|
||||
|
||||
|
|
|
@ -8,12 +8,11 @@ String Services
|
|||
The modules described in this chapter provide a wide range of string
|
||||
manipulation operations.
|
||||
|
||||
In addition, Python's built-in string classes support the sequence type
|
||||
methods described in the :ref:`typesseq` section, and also the
|
||||
string-specific methods described in the :ref:`string-methods` section.
|
||||
To output formatted strings use template strings or the ``%`` operator
|
||||
described in the :ref:`string-formatting` section. Also, see the
|
||||
:mod:`re` module for string functions based on regular expressions.
|
||||
In addition, Python's built-in string classes support the sequence type methods
|
||||
described in the :ref:`typesseq` section, and also the string-specific methods
|
||||
described in the :ref:`string-methods` section. To output formatted strings,
|
||||
see the :ref:`string-formatting` section. Also, see the :mod:`re` module for
|
||||
string functions based on regular expressions.
|
||||
|
||||
|
||||
.. toctree::
|
||||
|
|
|
@ -1279,15 +1279,36 @@ Basic customization
|
|||
|
||||
.. index::
|
||||
builtin: str
|
||||
statement: print
|
||||
builtin: print
|
||||
|
||||
Called by the :func:`str` built-in function and by the :keyword:`print`
|
||||
statement to compute the "informal" string representation of an object. This
|
||||
Called by the :func:`str` built-in function and by the :func:`print`
|
||||
function to compute the "informal" string representation of an object. This
|
||||
differs from :meth:`__repr__` in that it does not have to be a valid Python
|
||||
expression: a more convenient or concise representation may be used instead.
|
||||
The return value must be a string object.
|
||||
|
||||
|
||||
.. method:: object.__format__(self, format_spec)
|
||||
|
||||
.. index::
|
||||
pair: string; conversion
|
||||
builtin: str
|
||||
builtin: print
|
||||
|
||||
Called by the :func:`format` built-in function (and by extension, the
|
||||
:meth:`format` method of class :class:`str`) to produce a "formatted"
|
||||
string representation of an object. The ``format_spec`` argument is
|
||||
a string that contains a description of the formatting options desired.
|
||||
The interpretation of the ``format_spec`` argument is up to the type
|
||||
implementing :meth:`__format__`, however most classes will either
|
||||
delegate formatting to one of the built-in types, or use a similar
|
||||
formatting option syntax.
|
||||
|
||||
See :ref:`formatspec` for a description of the standard formatting syntax.
|
||||
|
||||
The return value must be a string object.
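As a minimal sketch of delegating to a built-in type, the hypothetical ``Celsius`` class below hands its *format_spec* to :class:`float` and appends a unit. The class name and the suffix are illustrative assumptions, not part of any library:

```python
class Celsius:
    """Hypothetical temperature wrapper used only for illustration."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, format_spec):
        # Delegate the spec to float formatting, then add a unit suffix.
        return format(self.degrees, format_spec) + " C"


reading = Celsius(21.456)

via_builtin = format(reading, ".1f")
via_method = "{0:.1f}".format(reading)
no_spec = format(reading, "")

print(via_builtin, via_method, no_spec)
```

Delegating this way means the class accepts the standard mini-language without having to parse it.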
|
||||
|
||||
|
||||
.. method:: object.__lt__(self, other)
|
||||
object.__le__(self, other)
|
||||
object.__eq__(self, other)
|
||||
|
|
|
@ -5,12 +5,10 @@
|
|||
Expressions
|
||||
***********
|
||||
|
||||
.. index:: single: expression
|
||||
.. index:: expression, BNF
|
||||
|
||||
This chapter explains the meaning of the elements of expressions in Python.
|
||||
|
||||
.. index:: single: BNF
|
||||
|
||||
**Syntax Notes:** In this and the following chapters, extended BNF notation will
|
||||
be used to describe syntax, not lexical analysis. When (one alternative of) a
|
||||
syntax rule has the form
|
||||
|
@ -18,8 +16,6 @@ syntax rule has the form
|
|||
.. productionlist:: *
|
||||
name: `othername`
|
||||
|
||||
.. index:: single: syntax
|
||||
|
||||
and no semantics are given, the semantics of this form of ``name`` are the same
|
||||
as for ``othername``.
|
||||
|
||||
|
@ -852,9 +848,9 @@ identities hold approximately where ``x/y`` is replaced by ``floor(x/y)`` or
|
|||
``floor(x/y) - 1`` [#]_.
|
||||
|
||||
In addition to performing the modulo operation on numbers, the ``%`` operator is
|
||||
also overloaded by string and unicode objects to perform string formatting (also
|
||||
also overloaded by string objects to perform string formatting (also
|
||||
known as interpolation). The syntax for string formatting is described in the
|
||||
Python Library Reference, section :ref:`string-formatting`.
|
||||
Python Library Reference, section :ref:`old-string-formatting`.
|
||||
|
||||
The floor division operator, the modulo operator, and the :func:`divmod`
|
||||
function are not defined for complex numbers. Instead, convert to a
|
||||
|
@ -985,9 +981,12 @@ Comparison of objects of the same type depends on the type:
|
|||
|
||||
* Numbers are compared arithmetically.
|
||||
|
||||
* Bytes objects are compared lexicographically using the numeric values of
|
||||
their elements.
|
||||
|
||||
* Strings are compared lexicographically using the numeric equivalents (the
|
||||
result of the built-in function :func:`ord`) of their characters. Unicode and
|
||||
8-bit strings are fully interoperable in this behavior. [#]_
|
||||
result of the built-in function :func:`ord`) of their characters. [#]_
|
||||
String and bytes objects can't be compared!
|
||||
|
||||
* Tuples and lists are compared lexicographically using comparison of
|
||||
corresponding elements. This means that to compare equal, each element must
|
||||
|
@ -1020,11 +1019,10 @@ particular, dictionaries support membership testing as a nicer way of spelling
|
|||
For the list and tuple types, ``x in y`` is true if and only if there exists an
|
||||
index *i* such that ``x == y[i]`` is true.
|
||||
|
||||
For the Unicode and string types, ``x in y`` is true if and only if *x* is a
|
||||
substring of *y*. An equivalent test is ``y.find(x) != -1``. Note, *x* and *y*
|
||||
need not be the same type; consequently, ``u'ab' in 'abc'`` will return
|
||||
``True``. Empty strings are always considered to be a substring of any other
|
||||
string, so ``"" in "abc"`` will return ``True``.
|
||||
For the string and bytes types, ``x in y`` is true if and only if *x* is a
|
||||
substring of *y*. An equivalent test is ``y.find(x) != -1``. Empty strings are
|
||||
always considered to be a substring of any other string, so ``"" in "abc"`` will
|
||||
return ``True``.
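The equivalence with ``find`` and the empty-string case can be sketched directly. The sample values are arbitrary:

```python
haystack = "abc"

found = "bc" in haystack
same_as_find = found == (haystack.find("bc") != -1)

# The empty string is a substring of every string.
empty_found = "" in haystack

# Bytes objects support the same test with bytes operands.
bytes_found = b"bc" in b"abc"

print(found, same_as_find, empty_found, bytes_found)
```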
|
||||
|
||||
.. versionchanged:: 2.3
|
||||
Previously, *x* was required to be a string of length ``1``.
|
||||
|
@ -1272,7 +1270,7 @@ groups from right to left).
|
|||
cases, Python returns the latter result, in order to preserve that
|
||||
``divmod(x,y)[0] * y + x % y`` be very close to ``x``.
|
||||
|
||||
.. [#] While comparisons between unicode strings make sense at the byte
|
||||
.. [#] While comparisons between strings make sense at the byte
|
||||
level, they may be counter-intuitive to users. For example, the
|
||||
strings ``u"\u00C7"`` and ``u"\u0327\u0043"`` compare differently,
|
||||
even though they both represent the same unicode character (LATIN
|
||||
|
|
|
@ -399,8 +399,8 @@ The built-in function :func:`len` returns the length of a string::
|
|||
basic transformations and searching.
|
||||
|
||||
:ref:`string-formatting`
|
||||
The formatting operations invoked when strings are the
|
||||
left operand of the ``%`` operator are described in more detail here.
|
||||
The formatting operations invoked by the :meth:`format` string method are
|
||||
described in more detail here.
|
||||
|
||||
|
||||
.. _tut-unicodestrings:
|
||||
|
|