mirror of
https://github.com/python/cpython.git
synced 2025-10-24 15:36:26 +00:00
.. _tut-informal:

**********************************
An Informal Introduction to Python
**********************************

In the following examples, input and output are distinguished by the presence or
absence of prompts (``>>>`` and ``...``): to repeat the example, you must type
everything after the prompt, when the prompt appears; lines that do not begin
with a prompt are output from the interpreter. Note that a secondary prompt on a
line by itself in an example means you must type a blank line; this is used to
end a multi-line command.

Many of the examples in this manual, even those entered at the interactive
prompt, include comments. Comments in Python start with the hash character,
``#``, and extend to the end of the physical line. A comment may appear at the
start of a line or following whitespace or code, but not within a string
literal. A hash character within a string literal is just a hash character.
Since comments are to clarify code and are not interpreted by Python, they may
be omitted when typing in examples.

Some examples::

   # this is the first comment
   SPAM = 1                 # and this is the second comment
                            # ... and now a third!
   STRING = "# This is not a comment."


.. _tut-calculator:

Using Python as a Calculator
============================

Let's try some simple Python commands. Start the interpreter and wait for the
primary prompt, ``>>>``. (It shouldn't take long.)


.. _tut-numbers:

Numbers
-------

The interpreter acts as a simple calculator: you can type an expression at it
and it will write the value. Expression syntax is straightforward: the
operators ``+``, ``-``, ``*`` and ``/`` work just like in most other languages
(for example, Pascal or C); parentheses can be used for grouping. For example::

   >>> 2+2
   4
   >>> # This is a comment
   ... 2+2
   4
   >>> 2+2  # and a comment on the same line as code
   4
   >>> (50-5*6)/4
   5.0
   >>> 8/5  # Fractions aren't lost when dividing integers
   1.6

Note: You might not see exactly the same result; floating point results can
differ from one machine to another. We will say more later about controlling
the appearance of floating point output. See also :ref:`tut-fp-issues` for a
full discussion of some of the subtleties of floating point numbers and their
representations.

To divide integers and get an integer result, use the floor division operator
``//``, which rounds the quotient down (toward negative infinity) rather than
truncating it toward zero::

   >>> # Integer division returns the floor:
   ... 7//3
   2
   >>> 7//-3
   -3

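The rounding direction only matters once a negative operand is involved. As a
small sketch (the ``%`` remainder operator and the built-in ``divmod()`` are not
covered in this section and are brought in here purely for illustration), the
quotient and remainder always recombine to the original dividend:

```python
# Floor division rounds toward negative infinity, not toward zero.
q = 7 // -3           # floor of -2.33..., which is -3
r = 7 % -3            # remainder; it takes the sign of the divisor
print(q, r)           # -3 -2

# Quotient and remainder recombine to the dividend.
recombined = q * -3 + r
print(recombined)     # 7

# divmod() returns both values at once.
pair = divmod(7, -3)
print(pair)           # (-3, -2)
```
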
The equal sign (``=``) is used to assign a value to a variable. Afterwards, no
result is displayed before the next interactive prompt::

   >>> width = 20
   >>> height = 5*9
   >>> width * height
   900

A value can be assigned to several variables simultaneously::

   >>> x = y = z = 0  # Zero x, y and z
   >>> x
   0
   >>> y
   0
   >>> z
   0

Variables must be "defined" (assigned a value) before they can be used, or an
error will occur::

   >>> # try to access an undefined variable
   ... n
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   NameError: name 'n' is not defined

There is full support for floating point; operators with mixed type operands
convert the integer operand to floating point::

   >>> 3 * 3.75 / 1.5
   7.5
   >>> 7.0 / 2
   3.5

Complex numbers are also supported; imaginary numbers are written with a suffix
of ``j`` or ``J``. Complex numbers with a nonzero real component are written as
``(real+imagj)``, or can be created with the ``complex(real, imag)`` function.
::

   >>> 1j * 1J
   (-1+0j)
   >>> 1j * complex(0, 1)
   (-1+0j)
   >>> 3+1j*3
   (3+3j)
   >>> (3+1j)*3
   (9+3j)
   >>> (1+2j)/(1+1j)
   (1.5+0.5j)

Complex numbers are always represented as two floating point numbers, the real
and imaginary part. To extract these parts from a complex number *z*, use
``z.real`` and ``z.imag``. ::

   >>> a=1.5+0.5j
   >>> a.real
   1.5
   >>> a.imag
   0.5

The conversion functions to floating point and integer (:func:`float`,
:func:`int`) don't work for complex numbers --- there is not one correct way to
convert a complex number to a real number. Use ``abs(z)`` to get its magnitude
(as a float) or ``z.real`` to get its real part::

   >>> a=3.0+4.0j
   >>> float(a)
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   TypeError: can't convert complex to float; use abs(z)
   >>> a.real
   3.0
   >>> a.imag
   4.0
   >>> abs(a)  # sqrt(a.real**2 + a.imag**2)
   5.0

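The comment on the ``abs(a)`` line can be checked directly. The following
sketch assumes the standard ``math`` module, which is not introduced in this
section, and compares ``abs(z)`` with the magnitude computed from the two parts:

```python
import math

z = 3.0 + 4.0j

magnitude = abs(z)                            # built-in magnitude of a complex number
by_hand = math.sqrt(z.real**2 + z.imag**2)    # the same value from the two parts
print(magnitude, by_hand)                     # 5.0 5.0

# z.real is an ordinary float, so it can be converted further.
whole = int(z.real)
print(whole)                                  # 3
```
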
In interactive mode, the last printed expression is assigned to the variable
``_``. This means that when you are using Python as a desk calculator, it is
somewhat easier to continue calculations, for example::

   >>> tax = 12.5 / 100
   >>> price = 100.50
   >>> price * tax
   12.5625
   >>> price + _
   113.0625
   >>> round(_, 2)
   113.06

This variable should be treated as read-only by the user. Don't explicitly
assign a value to it --- you would create an independent local variable with the
same name masking the built-in variable with its magic behavior.


.. _tut-strings:

Strings
-------

Besides numbers, Python can also manipulate strings, which can be expressed in
several ways. They can be enclosed in single quotes or double quotes::

   >>> 'spam eggs'
   'spam eggs'
   >>> 'doesn\'t'
   "doesn't"
   >>> "doesn't"
   "doesn't"
   >>> '"Yes," he said.'
   '"Yes," he said.'
   >>> "\"Yes,\" he said."
   '"Yes," he said.'
   >>> '"Isn\'t," she said.'
   '"Isn\'t," she said.'

The interpreter prints the result of string operations in the same way as they
are typed for input: inside quotes, and with quotes and other funny characters
escaped by backslashes, to show the precise value. The string is enclosed in
double quotes if the string contains a single quote and no double quotes, else
it's enclosed in single quotes. The :func:`print` function produces a more
readable output for such input strings.

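The difference between the two forms can be seen outside the interactive prompt
as well. In this sketch, the built-in ``repr()`` (not mentioned above) is
assumed to stand in for what the interpreter echoes, while ``print()`` shows the
plain text:

```python
s = '"Isn\'t," she said.'

echoed = repr(s)   # quoted and escaped, as the interpreter would show it
print(echoed)      # '"Isn\'t," she said.'
print(s)           # "Isn't," she said.

# Only a single quote inside: repr() switches to double quotes.
t = 'doesn\'t'
print(repr(t))     # "doesn't"
```
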
String literals can span multiple lines in several ways. Continuation lines can
be used, with a backslash as the last character on the line indicating that the
next line is a logical continuation of the line::

   hello = "This is a rather long string containing\n\
   several lines of text just as you would do in C.\n\
       Note that whitespace at the beginning of the line is\
    significant."

   print(hello)

Note that newlines still need to be embedded in the string using ``\n`` -- the
newline following the trailing backslash is discarded. This example would print
the following:

.. code-block:: text

   This is a rather long string containing
   several lines of text just as you would do in C.
       Note that whitespace at the beginning of the line is significant.

Or, strings can be surrounded in a pair of matching triple-quotes: ``"""`` or
``'''``. End of lines do not need to be escaped when using triple-quotes, but
they will be included in the string. So the following uses one escape to
avoid an unwanted initial blank line. ::

   print("""\
   Usage: thingy [OPTIONS]
        -h                        Display this usage message
        -H hostname               Hostname to connect to
   """)

produces the following output:

.. code-block:: text

   Usage: thingy [OPTIONS]
        -h                        Display this usage message
        -H hostname               Hostname to connect to

If we make the string literal a "raw" string, ``\n`` sequences are not converted
to newlines, but the backslash at the end of the line, and the newline character
in the source, are both included in the string as data. Thus, the example::

   hello = r"This is a rather long string containing\n\
   several lines of text much as you would do in C."

   print(hello)

would print:

.. code-block:: text

   This is a rather long string containing\n\
   several lines of text much as you would do in C.

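A quick way to see what a raw string keeps is to compare lengths. This sketch
uses a made-up Windows-style path purely for illustration:

```python
# The path below is hypothetical; it is chosen because it contains \n and \t.
plain = "C:\new\table"     # \n becomes a newline, \t becomes a tab
raw = r"C:\new\table"      # both backslashes stay in the string as data

print(len(plain))          # 10
print(len(raw))            # 12
print(raw)                 # C:\new\table

# A raw string is just another spelling of the same str type.
same = raw == "C:\\new\\table"
print(same)                # True
```
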
Strings can be concatenated (glued together) with the ``+`` operator, and
repeated with ``*``::

   >>> word = 'Help' + 'A'
   >>> word
   'HelpA'
   >>> '<' + word*5 + '>'
   '<HelpAHelpAHelpAHelpAHelpA>'

Two string literals next to each other are automatically concatenated; the first
line above could also have been written ``word = 'Help' 'A'``; this only works
with two literals, not with arbitrary string expressions::

   >>> 'str' 'ing'             #  <-  This is ok
   'string'
   >>> 'str'.strip() + 'ing'   #  <-  This is ok
   'string'
   >>> 'str'.strip() 'ing'     #  <-  This is invalid
     File "<stdin>", line 1
       'str'.strip() 'ing'
                         ^
   SyntaxError: invalid syntax

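Literal concatenation is mostly useful for breaking a long string across source
lines. A minimal sketch, relying on the fact that adjacent literals may sit on
separate lines inside parentheses:

```python
# Adjacent literals are joined into one string; no + is needed.
text = ('Put several strings within parentheses '
        'to have them joined together.')
print(text)

# Mixing a literal with an expression still needs an explicit +.
prefix = 'str'.strip()
joined = prefix + 'ing'
print(joined)      # string
```
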
Strings can be subscripted (indexed); like in C, the first character of a string
has subscript (index) 0. There is no separate character type; a character is
simply a string of size one. As in the Icon programming language, substrings
can be specified with the *slice notation*: two indices separated by a colon.
::

   >>> word[4]
   'A'
   >>> word[0:2]
   'He'
   >>> word[2:4]
   'lp'

Slice indices have useful defaults; an omitted first index defaults to zero, an
omitted second index defaults to the size of the string being sliced. ::

   >>> word[:2]    # The first two characters
   'He'
   >>> word[2:]    # Everything except the first two characters
   'lpA'

Unlike C strings, Python strings cannot be changed. Assigning to an indexed
position or to a slice of the string results in an error::

   >>> word[0] = 'x'
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   TypeError: 'str' object does not support item assignment
   >>> word[:1] = 'Splat'
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   TypeError: 'str' object does not support item assignment

However, creating a new string with the combined content is easy and efficient::

   >>> 'x' + word[1:]
   'xelpA'
   >>> 'Splat' + word[4]
   'SplatA'

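To make the contrast concrete, the sketch below attempts the forbidden
assignment and then builds a new string instead. The ``try``/``except``
statement is not covered in this section and is used here only to keep the
example running:

```python
word = 'HelpA'

try:
    word[0] = 'x'              # strings do not allow item assignment
except TypeError as exc:
    message = str(exc)
    print('TypeError:', message)

# Build a new string; the original is left untouched.
changed = 'x' + word[1:]
print(changed)                 # xelpA
print(word)                    # HelpA
```
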
Here's a useful invariant of slice operations: ``s[:i] + s[i:]`` equals ``s``.
::

   >>> word[:2] + word[2:]
   'HelpA'
   >>> word[:3] + word[3:]
   'HelpA'

Degenerate slice indices are handled gracefully: an index that is too large is
replaced by the string size, an upper bound smaller than the lower bound returns
an empty string. ::

   >>> word[1:100]
   'elpA'
   >>> word[10:]
   ''
   >>> word[2:1]
   ''

Indices may be negative numbers, to start counting from the right. For example::

   >>> word[-1]     # The last character
   'A'
   >>> word[-2]     # The last-but-one character
   'p'
   >>> word[-2:]    # The last two characters
   'pA'
   >>> word[:-2]    # Everything except the last two characters
   'Hel'

But note that -0 is really the same as 0, so it does not count from the right!
::

   >>> word[-0]     # (since -0 equals 0)
   'H'

Out-of-range negative slice indices are truncated, but don't try this for
single-element (non-slice) indices::

   >>> word[-100:]
   'HelpA'
   >>> word[-10]    # error
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   IndexError: string index out of range

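One consequence is that a one-character slice can serve as a forgiving
alternative to indexing when the position may be out of range. A small sketch,
for non-negative positions only:

```python
word = 'HelpA'

clamped = word[-100:]     # out-of-range start is truncated
print(clamped)            # HelpA

one_char = word[4:5]      # same result as word[4]
print(one_char)           # A

missing = word[10:11]     # past the end: empty string, no IndexError
print(repr(missing))      # ''
```

Negative positions need more care: ``word[-1:0]`` is empty, because the upper
bound 0 lies to the left of the lower bound.
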
One way to remember how slices work is to think of the indices as pointing
*between* characters, with the left edge of the first character numbered 0.
Then the right edge of the last character of a string of *n* characters has
index *n*, for example::

    +---+---+---+---+---+
    | H | e | l | p | A |
    +---+---+---+---+---+
    0   1   2   3   4   5
   -5  -4  -3  -2  -1

The first row of numbers gives the position of the indices 0...5 in the string;
the second row gives the corresponding negative indices. The slice from *i* to
*j* consists of all characters between the edges labeled *i* and *j*,
respectively.

For non-negative indices, the length of a slice is the difference of the
indices, if both are within bounds. For example, the length of ``word[1:3]`` is
2.

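The edge-numbering picture can be checked in code. In this sketch, a negative
index ``-k`` is taken to name the same edge as ``len(word) - k``:

```python
word = 'HelpA'
n = len(word)                       # 5

piece = word[1:3]                   # characters between edges 1 and 3
print(piece, len(piece))            # el 2

# A negative index -k and the index n - k name the same edge.
tail_matches = word[-2:] == word[n - 2:]
head_matches = word[:-2] == word[:n - 2]
print(tail_matches, head_matches)   # True True
```
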
The built-in function :func:`len` returns the length of a string::

   >>> s = 'supercalifragilisticexpialidocious'
   >>> len(s)
   34


.. seealso::

   :ref:`typesseq`
      Strings are examples of *sequence types*, and support the common
      operations supported by such types.

   :ref:`string-methods`
      Strings support a large number of methods for
      basic transformations and searching.

   :ref:`string-formatting`
      Information about string formatting with :meth:`str.format` is described
      here.

   :ref:`old-string-formatting`
      The old formatting operations invoked when strings are the left operand
      of the ``%`` operator are described in more detail here.


.. _tut-unicodestrings:

About Unicode
-------------

.. sectionauthor:: Marc-Andre Lemburg <mal@lemburg.com>


Starting with Python 3.0 all strings support Unicode (see
http://www.unicode.org/).

Unicode has the advantage of providing one ordinal for every character in every
script used in modern and ancient texts. Previously, there were only 256
possible ordinals for script characters. Texts were typically bound to a code
page which mapped the ordinals to script characters. This led to a great deal of
confusion, especially with respect to internationalization (usually written as
``i18n`` --- ``'i'`` + 18 characters + ``'n'``) of software. Unicode solves
these problems by defining one code page for all scripts.

If you want to include special characters in a string,
you can do so by using the Python *Unicode-Escape* encoding. The following
example shows how::

   >>> 'Hello\u0020World !'
   'Hello World !'

The escape sequence ``\u0020`` inserts the Unicode character with the ordinal
value 0x0020 (the space character) at the given position.

Other characters are interpreted by using their respective ordinal values
directly as Unicode ordinals. If you have literal strings in the standard
Latin-1 encoding that is used in many Western countries, you will find it
convenient that the lower 256 characters of Unicode are the same as the 256
characters of Latin-1.

Apart from these standard encodings, Python provides a whole set of other ways
of creating Unicode strings on the basis of a known encoding.

To convert a string into a sequence of bytes using a specific encoding,
string objects provide an :meth:`~str.encode` method that takes one argument, the
name of the encoding. Lowercase names for encodings are preferred. ::

   >>> "Äpfel".encode('utf-8')
   b'\xc3\x84pfel'

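Encoding can be reversed with the ``decode()`` method of bytes objects, which is
not covered above and is brought in here for illustration. The sketch also shows
that the same text produces different bytes under different encodings:

```python
text = '\u00c4pfel'                 # the string "Äpfel", written with an escape

utf8 = text.encode('utf-8')
latin1 = text.encode('latin-1')
print(utf8)                         # b'\xc3\x84pfel'
print(latin1)                       # b'\xc4pfel'
print(len(text), len(utf8), len(latin1))   # 5 6 5

# Decoding with the matching encoding restores the original string.
restored = utf8.decode('utf-8')
print(restored == text)             # True
```
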
.. _tut-lists:

Lists
-----

Python knows a number of *compound* data types, used to group together other
values. The most versatile is the *list*, which can be written as a list of
comma-separated values (items) between square brackets. List items need not all
have the same type. ::

   >>> a = ['spam', 'eggs', 100, 1234]
   >>> a
   ['spam', 'eggs', 100, 1234]

Like string indices, list indices start at 0, and lists can be sliced,
concatenated and so on::

   >>> a[0]
   'spam'
   >>> a[3]
   1234
   >>> a[-2]
   100
   >>> a[1:-1]
   ['eggs', 100]
   >>> a[:2] + ['bacon', 2*2]
   ['spam', 'eggs', 'bacon', 4]
   >>> 3*a[:3] + ['Boo!']
   ['spam', 'eggs', 100, 'spam', 'eggs', 100, 'spam', 'eggs', 100, 'Boo!']

All slice operations return a new list containing the requested elements. This
means that the following slice returns a shallow copy of the list *a*::

   >>> a[:]
   ['spam', 'eggs', 100, 1234]

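A short sketch of what the copy provides: the new list compares equal to the
original but is a distinct object, so changing one leaves the other alone. The
``is`` operator and the ``append()`` method used here are not introduced at this
point in the section:

```python
a = ['spam', 'eggs', 100, 1234]
b = a[:]                  # copy of the whole list

print(b == a)             # True: same contents
print(b is a)             # False: different list objects

b[0] = 'ham'
b.append('bacon')
print(a)                  # ['spam', 'eggs', 100, 1234]
print(b)                  # ['ham', 'eggs', 100, 1234, 'bacon']
```
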
Unlike strings, which are *immutable*, lists are mutable: individual elements
of a list can be changed::

   >>> a
   ['spam', 'eggs', 100, 1234]
   >>> a[2] = a[2] + 23
   >>> a
   ['spam', 'eggs', 123, 1234]

Assignment to slices is also possible, and this can even change the size of the
list or clear it entirely::

   >>> # Replace some items:
   ... a[0:2] = [1, 12]
   >>> a
   [1, 12, 123, 1234]
   >>> # Remove some:
   ... a[0:2] = []
   >>> a
   [123, 1234]
   >>> # Insert some:
   ... a[1:1] = ['bletch', 'xyzzy']
   >>> a
   [123, 'bletch', 'xyzzy', 1234]
   >>> # Insert (a copy of) itself at the beginning
   >>> a[:0] = a
   >>> a
   [123, 'bletch', 'xyzzy', 1234, 123, 'bletch', 'xyzzy', 1234]
   >>> # Clear the list: replace all items with an empty list
   >>> a[:] = []
   >>> a
   []

The built-in function :func:`len` also applies to lists::

   >>> a = ['a', 'b', 'c', 'd']
   >>> len(a)
   4

It is possible to nest lists (create lists containing other lists), for
example::

   >>> q = [2, 3]
   >>> p = [1, q, 4]
   >>> len(p)
   3
   >>> p[1]
   [2, 3]
   >>> p[1][0]
   2

You can add something to the end of the list::

   >>> p[1].append('xtra')
   >>> p
   [1, [2, 3, 'xtra'], 4]
   >>> q
   [2, 3, 'xtra']

Note that in the last example, ``p[1]`` and ``q`` really refer to the same
object! We'll come back to *object semantics* later.


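The sharing can be confirmed with the ``is`` operator, and avoided by nesting a
copy instead. A minimal sketch (the name ``p_independent`` is made up for this
example):

```python
q = [2, 3]
p = [1, q, 4]                     # p[1] is the very same object as q
p_independent = [1, q[:], 4]      # nests a copy of q instead

print(p[1] is q)                  # True
print(p_independent[1] is q)      # False

q.append('xtra')
print(p)                          # [1, [2, 3, 'xtra'], 4]
print(p_independent)              # [1, [2, 3], 4]
```
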
.. _tut-firststeps:

First Steps Towards Programming
===============================

Of course, we can use Python for more complicated tasks than adding two and two
together. For instance, we can write an initial sub-sequence of the *Fibonacci*
series as follows::

   >>> # Fibonacci series:
   ... # the sum of two elements defines the next
   ... a, b = 0, 1
   >>> while b < 10:
   ...     print(b)
   ...     a, b = b, a+b
   ...
   1
   1
   2
   3
   5
   8

This example introduces several new features.

* The first line contains a *multiple assignment*: the variables ``a`` and ``b``
  simultaneously get the new values 0 and 1. On the last line this is used again,
  demonstrating that the expressions on the right-hand side are all evaluated
  first before any of the assignments take place. The right-hand side expressions
  are evaluated from the left to the right.

* The :keyword:`while` loop executes as long as the condition (here: ``b < 10``)
  remains true. In Python, like in C, any non-zero integer value is true; zero is
  false. The condition may also be a string or list value, in fact any sequence;
  anything with a non-zero length is true, empty sequences are false. The test
  used in the example is a simple comparison. The standard comparison operators
  are written the same as in C: ``<`` (less than), ``>`` (greater than), ``==``
  (equal to), ``<=`` (less than or equal to), ``>=`` (greater than or equal to)
  and ``!=`` (not equal to).

* The *body* of the loop is *indented*: indentation is Python's way of grouping
  statements. Python does not (yet!) provide an intelligent input line editing
  facility, so you have to type a tab or space(s) for each indented line. In
  practice you will prepare more complicated input for Python with a text editor;
  most text editors have an auto-indent facility. When a compound statement is
  entered interactively, it must be followed by a blank line to indicate
  completion (since the parser cannot guess when you have typed the last line).
  Note that each line within a basic block must be indented by the same amount.

* The :func:`print` function writes the value of the expression(s) it is
  given. It differs from just writing the expression you want to write (as we did
  earlier in the calculator examples) in the way it handles multiple
  expressions, floating point quantities,
  and strings. Strings are printed without quotes, and a space is inserted
  between items, so you can format things nicely, like this::

     >>> i = 256*256
     >>> print('The value of i is', i)
     The value of i is 65536

  The keyword *end* can be used to avoid the newline after the output, or end
  the output with a different string::

     >>> a, b = 0, 1
     >>> while b < 1000:
     ...     print(b, end=',')
     ...     a, b = b, a+b
     ...
     1,1,2,3,5,8,13,21,34,55,89,144,233,377,610,987,

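The same loop can be wrapped in a function that collects the values instead of
printing them. This is a sketch only: ``def`` is not covered in this chapter,
and the name ``fib_below`` is made up for illustration. It also uses the *sep*
keyword of ``print()``, a sibling of *end*:

```python
def fib_below(limit):
    """Return the Fibonacci numbers smaller than limit, in order."""
    result = []
    a, b = 0, 1
    while b < limit:          # same kind of condition as the loops above
        result.append(b)
        a, b = b, a + b       # the right-hand side is evaluated first
    return result

numbers = fib_below(1000)
print(*numbers, sep=',')      # no trailing comma, unlike end=','
print(len(numbers))           # 16
```

Returning a list keeps the computation separate from the output format, so the
caller decides how, or whether, to print the values.
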