\documentstyle[twoside,11pt,myformat]{report}

\title{Python Tutorial}

\input{boilerplate}

\begin{document}

\pagenumbering{roman}

\maketitle

\input{copyright}

\begin{abstract}

\noindent
Python is a simple, yet powerful programming language that bridges the
gap between C and shell programming, and is thus ideally suited for
``throw-away programming''
and rapid prototyping.  Its syntax is put
together from constructs borrowed from a variety of other languages;
most prominent are influences from ABC, C, Modula-3 and Icon.

The Python interpreter is easily extended with new functions and data
types implemented in C.  Python is also suitable as an extension
language for highly customizable C applications such as editors or
window managers.

Python is available for various operating systems, amongst which
several flavors of {\UNIX}, Amoeba, the Apple Macintosh O.S.,
and MS-DOS.

This tutorial introduces the reader informally to the basic concepts
and features of the Python language and system.  It helps to have a
Python interpreter handy for hands-on experience, but as the examples
are self-contained, the tutorial can be read off-line as well.

For a description of standard objects and modules, see the {\em Python
Library Reference} document.  The {\em Python Reference Manual} gives
a more formal definition of the language.

\end{abstract}

\pagebreak
{
\parskip = 0mm
\tableofcontents
}

\pagebreak

\pagenumbering{arabic}


\chapter{Whetting Your Appetite}

If you ever wrote a large shell script, you probably know this
feeling: you'd love to add yet another feature, but it's already so
slow, and so big, and so complicated; or the feature involves a system
call or other function that is only accessible from C \ldots  Usually
the problem at hand isn't serious enough to warrant rewriting the
script in C; perhaps because the problem requires variable-length
strings or other data types (like sorted lists of file names) that are
easy in the shell but lots of work to implement in C; or perhaps just
because you're not sufficiently familiar with C.

In such cases, Python may be just the language for you.  Python is
simple to use, but it is a real programming language, offering much
more structure and support for large programs than the shell has.  On
the other hand, it also offers much more error checking than C, and,
being a {\em very-high-level language}, it has high-level data types
built in, such as flexible arrays and dictionaries that would cost you
days to implement efficiently in C.  Because of its more general data
types Python is applicable to a much larger problem domain than {\em
Awk} or even {\em Perl}, yet many things are at least as easy in
Python as in those languages.

Python allows you to split up your program into modules that can be
reused in other Python programs.  It comes with a large collection of
standard modules that you can use as the basis of your programs --- or
as examples to start learning to program in Python.  There are also
built-in modules that provide things like file I/O, system calls,
sockets, and even a generic interface to window systems (STDWIN).

Python is an interpreted language, which can save you considerable time
during program development because no compilation or linking is
necessary.  The interpreter can be used interactively, which makes it
easy to experiment with features of the language, to write throw-away
programs, or to test functions during bottom-up program development.
It is also a handy desk calculator.

Python allows writing very compact and readable programs.  Programs
written in Python are typically much shorter than equivalent C
programs, for several reasons:
\begin{itemize}
\item
the high-level data types allow you to express complex operations in a
single statement;
\item
statement grouping is done by indentation instead of begin/end
brackets;
\item
no variable or argument declarations are necessary.
\end{itemize}

Python is {\em extensible}: if you know how to program in C it is easy
to add a new built-in function or module to the interpreter, either to
perform critical operations at maximum speed, or to link Python
programs to libraries that may only be available in binary form (such
as a vendor-specific graphics library).  Once you are really hooked,
you can link the Python interpreter into an application written in C
and use it as an extension or command language for that application.

By the way, the language is named after the BBC show ``Monty
Python's Flying Circus'' and has nothing to do with nasty reptiles...

\section{Where From Here}

Now that you are all excited about Python, you'll want to examine it
in some more detail.  Since the best way to learn a language is
to use it, you are invited to do so here.

In the next chapter, the mechanics of using the interpreter are
explained.  This is rather mundane information, but essential for
trying out the examples shown later.

The rest of the tutorial introduces various features of the Python
language and system through examples, beginning with simple
expressions, statements and data types, through functions and modules,
and finally touching upon advanced concepts like exceptions
and user-defined classes.

When you're through with the tutorial (or just getting bored), you
should read the Library Reference, which gives complete (though terse)
reference material about built-in and standard types, functions and
modules that can save you a lot of time when writing Python programs.


\chapter{Using the Python Interpreter}

\section{Invoking the Interpreter}

The Python interpreter is usually installed as {\tt /usr/local/bin/python}
on those machines where it is available; putting {\tt /usr/local/bin} in
your {\UNIX} shell's search path makes it possible to start it by
typing the command

\bcode\begin{verbatim}
python
\end{verbatim}\ecode
%
to the shell.  Since the choice of the directory where the interpreter
lives is an installation option, other places are possible; check with
your local Python guru or system administrator.  (E.g., {\tt
/usr/local/python} is a popular alternative location.)

The interpreter operates somewhat like the {\UNIX} shell: when called
with standard input connected to a tty device, it reads and executes
commands interactively; when called with a file name argument or with
a file as standard input, it reads and executes a {\em script} from
that file.

A third way of starting the interpreter is
``{\tt python -c command [arg] ...}'', which
executes the statement(s) in {\tt command}, analogous to the shell's
{\tt -c} option.  Since Python statements often contain spaces or other
characters that are special to the shell, it is best to quote {\tt
command} in its entirety with double quotes.

Note that there is a difference between ``{\tt python file}'' and
``{\tt python $<$file}''.  In the latter case, input requests from the
program, such as calls to {\tt input()} and {\tt raw_input()}, are
satisfied from {\em file}.  Since this file has already been read
until the end by the parser before the program starts executing, the
program will encounter EOF immediately.  In the former case (which is
usually what you want) they are satisfied from whatever file or device
is connected to standard input of the Python interpreter.

When a script file is used, it is sometimes useful to be able to run
the script and enter interactive mode afterwards.  This can be done by
passing {\tt -i} before the script.  (This does not work if the script
is read from standard input, for the same reason as explained in the
previous paragraph.)

\subsection{Argument Passing}

When known to the interpreter, the script name and additional
arguments thereafter are passed to the script in the variable {\tt
sys.argv}, which is a list of strings.  Its length is at least one;
when no script and no arguments are given, {\tt sys.argv[0]} is an
empty string.  When the script name is given as {\tt '-'} (meaning
standard input), {\tt sys.argv[0]} is set to {\tt '-'}.  When {\tt -c
command} is used, {\tt sys.argv[0]} is set to {\tt '-c'}.  Options
found after {\tt -c command} are not consumed by the Python
interpreter's option processing but left in {\tt sys.argv} for the
command to handle.
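
As a rough illustration of these rules, the following sketch separates
the first element of an argument list from the rest, the way a script
might when it starts up.  It is script text rather than an interactive
session, and the helper name \verb/split_arguments/ is made up for this
example.

\bcode\begin{verbatim}
import sys

def split_arguments(argv):
    # argv[0] is the script name ('', '-' or '-c' in the special
    # cases described above); the rest belong to the script itself.
    return argv[0], argv[1:]

script, arguments = split_arguments(sys.argv)
print('script name: ' + repr(script))
print('arguments:   ' + repr(arguments))
\end{verbatim}\ecode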

\subsection{Interactive Mode}

When commands are read from a tty, the interpreter is said to be in
{\em interactive\ mode}.  In this mode it prompts for the next command
with the {\em primary\ prompt}, usually three greater-than signs ({\tt
>>>}); for continuation lines it prompts with the {\em secondary\
prompt}, by default three dots ({\tt ...}).  Typing an EOF (Control-D)
at the primary prompt causes the interpreter to exit with a zero exit
status.

The interpreter prints a welcome message stating its version number
and a copyright notice before printing the first prompt, e.g.:

\bcode\begin{verbatim}
python
Python 1.1 (Oct  6 1994)
Copyright 1991-1994 Stichting Mathematisch Centrum, Amsterdam
>>>
\end{verbatim}\ecode

\section{The Interpreter and its Environment}

\subsection{Error Handling}

When an error occurs, the interpreter prints an error
message and a stack trace.  In interactive mode, it then returns to
the primary prompt; when input came from a file, it exits with a
nonzero exit status after printing
the stack trace.  (Exceptions handled by an {\tt except} clause in a
{\tt try} statement are not errors in this context.)  Some errors are
unconditionally fatal and cause an exit with a nonzero exit status; this
applies to internal inconsistencies and some cases of running out of
memory.  All error messages are written to the standard error stream;
normal output from the executed commands is written to standard
output.

Typing the interrupt character (usually Control-C or DEL) to the
primary or secondary prompt cancels the input and returns to the
primary prompt.%
\footnote{
        A problem with the GNU Readline package may prevent this.
}
Typing an interrupt while a command is executing raises the {\tt
KeyboardInterrupt} exception, which may be handled by a {\tt try}
statement.
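
A minimal sketch of such a handler follows.  Since an example cannot
press the interrupt key by itself, the hypothetical function
\verb/simulated_interrupt/ raises the exception directly; the other
names are likewise made up for illustration.

\bcode\begin{verbatim}
def run_guarded(task):
    # Run task() and report whether an interrupt stopped it early.
    try:
        task()
    except KeyboardInterrupt:
        return 'interrupted'
    return 'finished'

def simulated_interrupt():
    # Stands in for the user typing the interrupt character.
    raise KeyboardInterrupt

def quiet_task():
    pass

print(run_guarded(simulated_interrupt))
print(run_guarded(quiet_task))
\end{verbatim}\ecode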

\subsection{The Module Search Path}

When a module named {\tt spam} is imported, the interpreter searches
for a file named {\tt spam.py} in the list of directories specified by
the environment variable {\tt PYTHONPATH}.  This variable has the same
syntax as the {\UNIX} shell variable {\tt PATH}, i.e., a list of
colon-separated directory names.  When {\tt PYTHONPATH} is not set, or
when the file is not found there, the search continues in an
installation-dependent default path, usually
{\tt .:/usr/local/lib/python}.

Actually, modules are searched for in the list of directories given by
the variable {\tt sys.path}, which is initialized from {\tt PYTHONPATH}
and the installation-dependent default.  This allows Python programs that
know what they're doing to modify or replace the module search path.
See the section on Standard Modules later.
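
For instance, a program could put a private directory in front of the
search path before importing from it.  This is only a sketch: the
directory name is hypothetical and need not exist for the list
manipulation itself to work.

\bcode\begin{verbatim}
import sys

# Hypothetical directory holding private modules.
extra_directory = '/tmp/my-python-modules'

# Put it in front so it is searched before the default locations.
if extra_directory not in sys.path:
    sys.path.insert(0, extra_directory)

print(sys.path[0])
\end{verbatim}\ecode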

\subsection{``Compiled'' Python files}

As an important speed-up of the start-up time for short programs that
use a lot of standard modules, if a file called {\tt spam.pyc} exists
in the directory where {\tt spam.py} is found, this is assumed to
contain an already-``compiled'' version of the module {\tt spam}.  The
modification time of the version of {\tt spam.py} used to create {\tt
spam.pyc} is recorded in {\tt spam.pyc}, and the {\tt .pyc} file is
ignored if that recorded time does not match the modification time of
the current {\tt spam.py}.

Whenever {\tt spam.py} is successfully compiled, an attempt is made to
write the compiled version to {\tt spam.pyc}.  It is not an error if
this attempt fails; if for any reason the file is not written
completely, the resulting {\tt spam.pyc} file will be recognized as
invalid and thus ignored later.

\subsection{Executable Python scripts}

On BSD'ish {\UNIX} systems, Python scripts can be made directly
executable, like shell scripts, by putting the line

\bcode\begin{verbatim}
#! /usr/local/bin/python
\end{verbatim}\ecode
%
(assuming that's the name of the interpreter) at the beginning of the
script and giving the file an executable mode.  The {\tt \#!} must be
the first two characters of the file.

\subsection{The Interactive Startup File}

When you use Python interactively, it is frequently handy to have some
standard commands executed every time the interpreter is started.  You
can do this by setting an environment variable named {\tt
PYTHONSTARTUP} to the name of a file containing your start-up
commands.  This is similar to the {\tt .profile} feature of the {\UNIX}
shells.

This file is only read in interactive sessions, not when Python reads
commands from a script, and not when {\tt /dev/tty} is given as the
explicit source of commands (which otherwise behaves like an
interactive session).  It is executed in the same name space where
interactive commands are executed, so that objects that it defines or
imports can be used without qualification in the interactive session.
You can also change the prompts {\tt sys.ps1} and {\tt sys.ps2} in
this file.

If you want to read an additional start-up file from the current
directory, you can program this in the global start-up file, e.g.
\verb\execfile('.pythonrc')\.  If you want to use the startup file
in a script, you must write this explicitly in the script, e.g.
\verb\import os;\ \verb\execfile(os.environ['PYTHONSTARTUP'])\.
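
A start-up file that changes the prompts might look like the sketch
below.  The prompt strings are arbitrary choices made for this example;
any strings will do.

\bcode\begin{verbatim}
# Hypothetical contents of the file named by PYTHONSTARTUP.
import sys

sys.ps1 = 'py> '     # primary prompt
sys.ps2 = '  > '     # secondary prompt, same width as the primary one
\end{verbatim}\ecode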

\section{Interactive Input Editing and History Substitution}

Some versions of the Python interpreter support editing of the current
input line and history substitution, similar to facilities found in
the Korn shell and the GNU Bash shell.  This is implemented using the
{\em GNU\ Readline} library, which supports Emacs-style and vi-style
editing.  This library has its own documentation which I won't
duplicate here; however, the basics are easily explained.

Perhaps the quickest check to see whether command line editing is
supported is typing Control-P to the first Python prompt you get.  If
it beeps, you have command line editing.  If nothing appears to
happen, or if \verb/^P/ is echoed, you can skip the rest of this
section.

\subsection{Line Editing}

If supported, input line editing is active whenever the interpreter
prints a primary or secondary prompt.  The current line can be edited
using the conventional Emacs control characters.  The most important
of these are: C-A (Control-A) moves the cursor to the beginning of the
line, C-E to the end, C-B moves it one position to the left, C-F to
the right.  Backspace erases the character to the left of the cursor,
C-D the character to its right.  C-K kills (erases) the rest of the
line to the right of the cursor, C-Y yanks back the last killed
string.  C-underscore undoes the last change you made; it can be
repeated for cumulative effect.

\subsection{History Substitution}

History substitution works as follows.  All non-empty input lines
issued are saved in a history buffer, and when a new prompt is given
you are positioned on a new line at the bottom of this buffer.  C-P
moves one line up (back) in the history buffer, C-N moves one down.
Any line in the history buffer can be edited; an asterisk appears in
front of the prompt to mark a line as modified.  Pressing the Return
key passes the current line to the interpreter.  C-R starts an
incremental reverse search; C-S starts a forward search.

\subsection{Key Bindings}

The key bindings and some other parameters of the Readline library can
be customized by placing commands in an initialization file called
{\tt \$HOME/.inputrc}.  Key bindings have the form

\bcode\begin{verbatim}
key-name: function-name
\end{verbatim}\ecode
%
or

\bcode\begin{verbatim}
"string": function-name
\end{verbatim}\ecode
%
and options can be set with

\bcode\begin{verbatim}
set option-name value
\end{verbatim}\ecode
%
For example:

\bcode\begin{verbatim}
# I prefer vi-style editing:
set editing-mode vi
# Edit using a single line:
set horizontal-scroll-mode On
# Rebind some keys:
Meta-h: backward-kill-word
"\C-u": universal-argument
"\C-x\C-r": re-read-init-file
\end{verbatim}\ecode
%
Note that the default binding for TAB in Python is to insert a TAB
instead of Readline's default filename completion function.  If you
insist, you can override this by putting

\bcode\begin{verbatim}
TAB: complete
\end{verbatim}\ecode
%
in your {\tt \$HOME/.inputrc}.  (Of course, this makes it hard to type
indented continuation lines...)

\subsection{Commentary}

This facility is an enormous step forward compared to previous
versions of the interpreter; however, some wishes are left: It would
be nice if the proper indentation were suggested on continuation lines
(the parser knows if an indent token is required next).  The
completion mechanism might use the interpreter's symbol table.  A
command to check (or even suggest) matching parentheses, quotes etc.
would also be useful.


\chapter{An Informal Introduction to Python}

In the following examples, input and output are distinguished by the
presence or absence of prompts ({\tt >>>} and {\tt ...}): to repeat
the example, you must type everything after the prompt, when the
prompt appears; lines that do not begin with a prompt are output from
the interpreter.%
\footnote{
        I'd prefer to use different fonts to distinguish input
        from output, but the amount of LaTeX hacking that would require
        is currently beyond my ability.
}
Note that a secondary prompt on a line by itself in an example means
you must type a blank line; this is used to end a multi-line command.

\section{Using Python as a Calculator}

Let's try some simple Python commands.  Start the interpreter and wait
for the primary prompt, {\tt >>>}.  (It shouldn't take long.)

\subsection{Numbers}

The interpreter acts as a simple calculator: you can type an
expression at it and it will write the value.  Expression syntax is
straightforward: the operators {\tt +}, {\tt -}, {\tt *} and {\tt /}
work just like in most other languages (e.g., Pascal or C); parentheses
can be used for grouping.  For example:

\bcode\begin{verbatim}
>>> 2+2
4
>>> # This is a comment
... 2+2
4
>>> 2+2  # and a comment on the same line as code
4
>>> (50-5*6)/4
5
>>> # Integer division returns the floor:
... 7/3
2
>>> 7/-3
-3
>>> 
\end{verbatim}\ecode
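%
The floor rule can be cross-checked with the built-in function
{\tt divmod()}, which returns the quotient and the remainder together.
The following is script text rather than an interactive session, it
assumes {\tt divmod()} is available in your interpreter, and the helper
name \verb/check_floor/ is made up for this sketch.

\bcode\begin{verbatim}
def check_floor(a, b):
    # divmod() rounds the quotient toward minus infinity, so the
    # remainder takes the sign of the divisor b.
    quotient, remainder = divmod(a, b)
    assert quotient * b + remainder == a
    return quotient, remainder

print(check_floor(7, 3))     # (2, 1)
print(check_floor(7, -3))    # (-3, -2)
\end{verbatim}\ecode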
%
Like in C, the equal sign ({\tt =}) is used to assign a value to a
variable.  The value of an assignment is not written:

\bcode\begin{verbatim}
>>> width = 20
>>> height = 5*9
>>> width * height
900
>>> 
\end{verbatim}\ecode
%
A value can be assigned to several variables simultaneously:

\bcode\begin{verbatim}
>>> x = y = z = 0  # Zero x, y and z
>>> x
0
>>> y
0
>>> z
0
>>> 
\end{verbatim}\ecode
%
There is full support for floating point; operators with mixed type
operands convert the integer operand to floating point:

\bcode\begin{verbatim}
>>> 4 * 2.5 / 3.3
3.0303030303
>>> 7.0 / 2
3.5
>>> 
\end{verbatim}\ecode

\subsection{Strings}

Besides numbers, Python can also manipulate strings, enclosed in
single quotes or double quotes:

\bcode\begin{verbatim}
>>> 'spam eggs'
'spam eggs'
>>> 'doesn\'t'
"doesn't"
>>> "doesn't"
"doesn't"
>>> '"Yes," he said.'
'"Yes," he said.'
>>> "\"Yes,\" he said."
'"Yes," he said.'
>>> '"Isn\'t," she said.'
'"Isn\'t," she said.'
>>> 
\end{verbatim}\ecode
%
Strings are written the same way as they are typed for input: inside
quotes and with quotes and other funny characters escaped by backslashes,
to show the precise value.  The string is enclosed in double quotes if
the string contains a single quote and no double quotes, else it's
enclosed in single quotes.  (The {\tt print} statement, described later,
can be used to write strings without quotes or escapes.)
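
The quoting rule can also be observed from a script.  The sketch below
assumes that the built-in function {\tt repr()} returns the same quoted
text that the interactive interpreter echoes for a string; the variable
names are arbitrary.

\bcode\begin{verbatim}
samples = ["doesn't", '"Yes," he said.', '"Isn\'t," she said.']
for text in samples:
    # repr() gives the quoted, escaped form of each string.
    print(repr(text))
\end{verbatim}\ecode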

Strings can be concatenated (glued together) with the {\tt +}
operator, and repeated with {\tt *}:

\bcode\begin{verbatim}
>>> word = 'Help' + 'A'
>>> word
'HelpA'
>>> '<' + word*5 + '>'
'<HelpAHelpAHelpAHelpAHelpA>'
>>> 
\end{verbatim}\ecode
%
Strings can be subscripted (indexed); like in C, the first character of
a string has subscript (index) 0.

There is no separate character type; a character is simply a string of
size one.  Like in Icon, substrings can be specified with the {\em
slice} notation: two indices separated by a colon.

\bcode\begin{verbatim}
>>> word[4]
'A'
>>> word[0:2]
'He'
>>> word[2:4]
'lp'
>>> 
\end{verbatim}\ecode
%
Slice indices have useful defaults; an omitted first index defaults to
zero, an omitted second index defaults to the size of the string being
sliced.

\bcode\begin{verbatim}
>>> word[:2]    # The first two characters
'He'
>>> word[2:]    # All but the first two characters
'lpA'
>>>
\end{verbatim}\ecode
%
Here's a useful invariant of slice operations: \verb\s[:i] + s[i:]\
equals \verb\s\.

\bcode\begin{verbatim}
>>> word[:2] + word[2:]
'HelpA'
>>> word[:3] + word[3:]
'HelpA'
>>> 
\end{verbatim}\ecode
%
Degenerate slice indices are handled gracefully: an index that is too
large is replaced by the string size, an upper bound smaller than the
lower bound returns an empty string.

\bcode\begin{verbatim}
>>> word[1:100]
'elpA'
>>> word[10:]
''
>>> word[2:1]
''
>>> 
\end{verbatim}\ecode
%
Indices may be negative numbers, to start counting from the right.
For example:

\bcode\begin{verbatim}
>>> word[-1]     # The last character
'A'
>>> word[-2]     # The last-but-one character
'p'
>>> word[-2:]    # The last two characters
'pA'
>>> word[:-2]    # All but the last two characters
'Hel'
>>>
\end{verbatim}\ecode
%
But note that -0 is really the same as 0, so it does not count from
the right!

\bcode\begin{verbatim}
>>> word[-0]     # (since -0 equals 0)
'H'
>>>
\end{verbatim}\ecode
%
Out-of-range negative slice indices are truncated, but don't try this
for single-element (non-slice) indices:

\bcode\begin{verbatim}
>>> word[-100:]
'HelpA'
>>> word[-10]    # error
Traceback (innermost last):
  File "<stdin>", line 1
IndexError: string index out of range
>>> 
\end{verbatim}\ecode
%
The best way to remember how slices work is to think of the indices as
pointing {\em between} characters, with the left edge of the first
character numbered 0.  Then the right edge of the last character of a
string of {\tt n} characters has index {\tt n}, for example:

\bcode\begin{verbatim}
 +---+---+---+---+---+ 
 | H | e | l | p | A |
 +---+---+---+---+---+ 
 0   1   2   3   4   5 
-5  -4  -3  -2  -1
\end{verbatim}\ecode
%
The first row of numbers gives the position of the indices 0...5 in
the string; the second row gives the corresponding negative indices.
The slice from \verb\i\ to \verb\j\ consists of all characters between
the edges labeled \verb\i\ and \verb\j\, respectively.

For nonnegative indices, the length of a slice is the difference of
the indices, if both are within bounds, e.g., the length of
\verb\word[1:3]\ is 2.
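
The edge picture can be turned into a small function that predicts the
length of any slice, including the negative and out-of-range cases
described earlier.  This is a sketch of the rules as stated in this
section, not the interpreter's own code; the name \verb/slice_length/
is made up for the example.

\bcode\begin{verbatim}
def slice_length(n, i, j):
    # Length of s[i:j] for a string s of n characters.
    # Negative indices count from the right edge ...
    if i < 0:
        i = i + n
    if j < 0:
        j = j + n
    # ... and out-of-range indices are moved to the nearest edge.
    i = max(0, min(i, n))
    j = max(0, min(j, n))
    return max(0, j - i)

word = 'HelpA'
for i in range(-8, 9):
    for j in range(-8, 9):
        assert len(word[i:j]) == slice_length(len(word), i, j)
    # The invariant s[:i] + s[i:] == s holds for every i.
    assert word[:i] + word[i:] == word
print(slice_length(len(word), 1, 3))    # 2
\end{verbatim}\ecode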

The built-in function {\tt len()} returns the length of a string:

\bcode\begin{verbatim}
>>> s = 'supercalifragilisticexpialidocious'
>>> len(s)
34
>>> 
\end{verbatim}\ecode

\subsection{Lists}

Python knows a number of {\em compound} data types, used to group
together other values.  The most versatile is the {\em list}, which
can be written as a list of comma-separated values (items) between
square brackets.  List items need not all have the same type.

\bcode\begin{verbatim}
>>> a = ['spam', 'eggs', 100, 1234]
>>> a
['spam', 'eggs', 100, 1234]
>>> 
\end{verbatim}\ecode
%
Like string indices, list indices start at 0, and lists can be sliced,
concatenated and so on:

\bcode\begin{verbatim}
>>> a[0]
'spam'
>>> a[3]
1234
>>> a[-2]
100
>>> a[1:-1]
['eggs', 100]
>>> a[:2] + ['bacon', 2*2]
['spam', 'eggs', 'bacon', 4]
>>> 3*a[:3] + ['Boe!']
['spam', 'eggs', 100, 'spam', 'eggs', 100, 'spam', 'eggs', 100, 'Boe!']
>>> 
\end{verbatim}\ecode
%
Unlike strings, which are {\em immutable}, lists allow individual
elements to be changed:

\bcode\begin{verbatim}
>>> a
['spam', 'eggs', 100, 1234]
>>> a[2] = a[2] + 23
>>> a
['spam', 'eggs', 123, 1234]
>>>
\end{verbatim}\ecode
%
Assignment to slices is also possible, and this can even change the size
of the list:

\bcode\begin{verbatim}
>>> # Replace some items:
... a[0:2] = [1, 12]
>>> a
[1, 12, 123, 1234]
>>> # Remove some:
... a[0:2] = []
>>> a
[123, 1234]
>>> # Insert some:
... a[1:1] = ['bletch', 'xyzzy']
>>> a
[123, 'bletch', 'xyzzy', 1234]
>>> a[:0] = a     # Insert (a copy of) itself at the beginning
>>> a
[123, 'bletch', 'xyzzy', 1234, 123, 'bletch', 'xyzzy', 1234]
>>> 
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
The built-in function {\tt len()} also applies to lists:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> len(a)
 | 
						|
8
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
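Combining slice assignment with {\tt len()} shows how the size of a list
follows the size of the replacement.  This is only a sketch with made-up
values; the names {\tt before}, {\tt shrunk} and {\tt grown} are
illustrative.

```python
a = [1, 12, 123, 1234]
before = len(a)             # 4

a[1:3] = ['x']              # two items replaced by one: the list shrinks
shrunk = len(a)             # 3

a[1:1] = ['p', 'q', 'r']    # empty slice: nothing replaced, three items inserted
grown = len(a)              # 6

a[:] = []                   # the slice covering everything, replaced by nothing
```

After the last statement {\tt a} is the empty list, yet it is still the
same list object as before.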
 | 
						|
%
 | 
						|
It is possible to nest lists (create lists containing other lists),
 | 
						|
for example:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> q = [2, 3]
 | 
						|
>>> p = [1, q, 4]
 | 
						|
>>> len(p)
 | 
						|
3
 | 
						|
>>> p[1]
 | 
						|
[2, 3]
 | 
						|
>>> p[1][0]
 | 
						|
2
 | 
						|
>>> p[1].append('xtra')     # See section 5.1
 | 
						|
>>> p
 | 
						|
[1, [2, 3, 'xtra'], 4]
 | 
						|
>>> q
 | 
						|
[2, 3, 'xtra']
 | 
						|
>>>
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
Note that in the last example, {\tt p[1]} and {\tt q} really refer to
 | 
						|
the same object!  We'll come back to {\em object semantics} later.
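The difference between sharing a list and copying it can be sketched as
follows.  The list {\tt r} and the names {\tt shared} and {\tt copied}
are illustrative additions; the copy is made with the slice notation
{\tt q[:]}.

```python
q = [2, 3]
p = [1, q, 4]        # p[1] refers to the very same list object as q
r = [1, q[:], 4]     # q[:] is a slice copy, so r[1] is a separate list

q.append('xtra')     # changes the one list that q and p[1] share

shared = p[1]        # [2, 3, 'xtra']
copied = r[1]        # still [2, 3]
```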
 | 
						|
 | 
						|
\section{First Steps Towards Programming}
 | 
						|
 | 
						|
Of course, we can use Python for more complicated tasks than adding
 | 
						|
two and two together.  For instance, we can write an initial
 | 
						|
subsequence of the {\em Fibonacci} series as follows:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> # Fibonacci series:
 | 
						|
... # the sum of two elements defines the next
 | 
						|
... a, b = 0, 1
 | 
						|
>>> while b < 10:
 | 
						|
...       print b
 | 
						|
...       a, b = b, a+b
 | 
						|
... 
 | 
						|
1
 | 
						|
1
 | 
						|
2
 | 
						|
3
 | 
						|
5
 | 
						|
8
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
This example introduces several new features.
 | 
						|
 | 
						|
\begin{itemize}
 | 
						|
 | 
						|
\item
 | 
						|
The first line contains a {\em multiple assignment}: the variables
 | 
						|
{\tt a} and {\tt b} simultaneously get the new values 0 and 1.  On the
 | 
						|
last line this is used again, demonstrating that the expressions on
 | 
						|
the right-hand side are all evaluated first before any of the
 | 
						|
assignments take place.
 | 
						|
 | 
						|
\item
 | 
						|
The {\tt while} loop executes as long as the condition (here: {\tt b <
 | 
						|
10}) remains true.  In Python, as in C, any non-zero integer value is
 | 
						|
true; zero is false.  The condition may also be a string or list value,
 | 
						|
in fact any sequence; anything with a non-zero length is true, empty
 | 
						|
sequences are false.  The test used in the example is a simple
 | 
						|
comparison.  The standard comparison operators are written the same as
 | 
						|
in C: {\tt <}, {\tt >}, {\tt ==}, {\tt <=}, {\tt >=} and {\tt !=}.
 | 
						|
 | 
						|
\item
 | 
						|
The {\em body} of the loop is {\em indented}: indentation is Python's
 | 
						|
way of grouping statements.  Python does not (yet!) provide an
 | 
						|
intelligent input line editing facility, so you have to type a tab or
 | 
						|
space(s) for each indented line.  In practice you will prepare more
 | 
						|
complicated input for Python with a text editor; most text editors have
 | 
						|
an auto-indent facility.  When a compound statement is entered
 | 
						|
interactively, it must be followed by a blank line to indicate
 | 
						|
completion (since the parser cannot guess when you have typed the last
 | 
						|
line).
 | 
						|
 | 
						|
\item
 | 
						|
The {\tt print} statement writes the value of the expression(s) it is
 | 
						|
given.  It differs from just writing the expression you want to write
 | 
						|
(as we did earlier in the calculator examples) in the way it handles
 | 
						|
multiple expressions and strings.  Strings are printed without quotes,
 | 
						|
and a space is inserted between items, so you can format things nicely,
 | 
						|
like this:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> i = 256*256
 | 
						|
>>> print 'The value of i is', i
 | 
						|
The value of i is 65536
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
A trailing comma avoids the newline after the output:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> a, b = 0, 1
 | 
						|
>>> while b < 1000:
 | 
						|
...     print b,
 | 
						|
...     a, b = b, a+b
 | 
						|
... 
 | 
						|
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
Note that the interpreter inserts a newline before it prints the next
 | 
						|
prompt if the last line was not completed.
 | 
						|
 | 
						|
\end{itemize}
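To see why it matters that the right-hand side of a multiple assignment
is evaluated before any assignment takes place, compare the simultaneous
form with two separate assignments.  This is a small sketch; the
starting values 3 and 5 and the names {\tt simultaneous} and
{\tt sequential} are arbitrary.

```python
# Simultaneous form: both right-hand expressions use the OLD a and b.
a, b = 3, 5
a, b = b, a+b
simultaneous = [a, b]    # [5, 8]

# Two separate assignments: the second one already sees the NEW a.
a, b = 3, 5
a = b
b = a+b
sequential = [a, b]      # [5, 10]
```

Only the simultaneous form computes the next pair of the Fibonacci
series correctly.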
 | 
						|
 | 
						|
 | 
						|
\chapter{More Control Flow Tools}
 | 
						|
 | 
						|
Besides the {\tt while} statement just introduced, Python knows the
 | 
						|
usual control flow statements known from other languages, with some
 | 
						|
twists.
 | 
						|
 | 
						|
\section{If Statements}
 | 
						|
 | 
						|
Perhaps the best-known statement type is the {\tt if} statement.
 | 
						|
For example:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> if x < 0:
 | 
						|
...      x = 0
 | 
						|
...      print 'Negative changed to zero'
 | 
						|
... elif x == 0:
 | 
						|
...      print 'Zero'
 | 
						|
... elif x == 1:
 | 
						|
...      print 'Single'
 | 
						|
... else:
 | 
						|
...      print 'More'
 | 
						|
... 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
There can be zero or more {\tt elif} parts, and the {\tt else} part is
 | 
						|
optional.  The keyword `{\tt elif}' is short for `{\tt else if}', and is
 | 
						|
useful to avoid excessive indentation.  An {\tt if...elif...elif...}
 | 
						|
sequence is a substitute for the {\em switch} or {\em case} statements
 | 
						|
found in other languages.
 | 
						|
 | 
						|
\section{For Statements}
 | 
						|
 | 
						|
The {\tt for} statement in Python differs a bit from what you may be
 | 
						|
used to in C or Pascal.  Rather than always iterating over an
 | 
						|
arithmetic progression of numbers (as in Pascal), or leaving the user
completely free in the iteration test and step (as in C), Python's {\tt
 | 
						|
for} statement iterates over the items of any sequence (e.g., a list
 | 
						|
or a string), in the order that they appear in the sequence.  For
 | 
						|
example (no pun intended):
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> # Measure some strings:
 | 
						|
... a = ['cat', 'window', 'defenestrate']
 | 
						|
>>> for x in a:
 | 
						|
...     print x, len(x)
 | 
						|
... 
 | 
						|
cat 3
 | 
						|
window 6
 | 
						|
defenestrate 12
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
It is not safe to modify the sequence being iterated over in the loop
 | 
						|
(this can only happen for mutable sequence types, i.e., lists).  If
 | 
						|
you need to modify the list you are iterating over, e.g., duplicate
 | 
						|
selected items, you must iterate over a copy.  The slice notation
 | 
						|
makes this particularly convenient:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> for x in a[:]: # make a slice copy of the entire list
 | 
						|
...    if len(x) > 6: a.insert(0, x)
 | 
						|
... 
 | 
						|
>>> a
 | 
						|
['defenestrate', 'cat', 'window', 'defenestrate']
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
 | 
						|
\section{The {\tt range()} Function}
 | 
						|
 | 
						|
If you do need to iterate over a sequence of numbers, the built-in
 | 
						|
function {\tt range()} comes in handy.  It generates lists containing
 | 
						|
arithmetic progressions, e.g.:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> range(10)
 | 
						|
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
The given end point is never part of the generated list; {\tt range(10)}
 | 
						|
generates a list of 10 values, exactly the legal indices for items of a
 | 
						|
sequence of length 10.  It is possible to let the range start at another
 | 
						|
number, or to specify a different increment (even negative):
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> range(5, 10)
 | 
						|
[5, 6, 7, 8, 9]
 | 
						|
>>> range(0, 10, 3)
 | 
						|
[0, 3, 6, 9]
 | 
						|
>>> range(-10, -100, -30)
 | 
						|
[-10, -40, -70]
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
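What {\tt range()} produces can be spelled out with a {\tt while} loop.
The following is only a sketch of the resulting values, not of how
{\tt range()} is implemented; the names {\tt up} and {\tt down} are
illustrative.

```python
# The values of range(0, 10, 3): count upwards while below the end point.
up = []
i = 0
while i < 10:            # the end point itself is never included
    up.append(i)
    i = i + 3

# The values of range(-10, -100, -30): with a negative increment the
# test is reversed, since we now count downwards.
down = []
i = -10
while i > -100:
    down.append(i)
    i = i - 30
```

The lists {\tt up} and {\tt down} end up as {\tt [0, 3, 6, 9]} and
{\tt [-10, -40, -70]}, matching the output shown above.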
 | 
						|
%
 | 
						|
To iterate over the indices of a sequence, combine {\tt range()} and
 | 
						|
{\tt len()} as follows:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> a = ['Mary', 'had', 'a', 'little', 'lamb']
 | 
						|
>>> for i in range(len(a)):
 | 
						|
...     print i, a[i]
 | 
						|
... 
 | 
						|
0 Mary
 | 
						|
1 had
 | 
						|
2 a
 | 
						|
3 little
 | 
						|
4 lamb
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
 | 
						|
\section{Break and Continue Statements, and Else Clauses on Loops}
 | 
						|
 | 
						|
The {\tt break} statement, as in C, breaks out of the smallest
 | 
						|
enclosing {\tt for} or {\tt while} loop.
 | 
						|
 | 
						|
The {\tt continue} statement, also borrowed from C, continues with the
 | 
						|
next iteration of the loop.
 | 
						|
 | 
						|
Loop statements may have an {\tt else} clause; it is executed when the
 | 
						|
loop terminates through exhaustion of the list (with {\tt for}) or when
 | 
						|
the condition becomes false (with {\tt while}), but not when the loop is
 | 
						|
terminated by a {\tt break} statement.  This is exemplified by the
 | 
						|
following loop, which searches for prime numbers:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> for n in range(2, 10):
 | 
						|
...     for x in range(2, n):
 | 
						|
...         if n % x == 0:
 | 
						|
...            print n, 'equals', x, '*', n/x
 | 
						|
...            break
 | 
						|
...     else:
 | 
						|
...          print n, 'is a prime number'
 | 
						|
... 
 | 
						|
2 is a prime number
 | 
						|
3 is a prime number
 | 
						|
4 equals 2 * 2
 | 
						|
5 is a prime number
 | 
						|
6 equals 2 * 3
 | 
						|
7 is a prime number
 | 
						|
8 equals 2 * 4
 | 
						|
9 equals 3 * 3
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
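The same pattern is handy for any search loop: {\tt break} as soon as
the item is found, and let the {\tt else} clause handle the case where
it was not.  A minimal sketch, with made-up word lists and an
illustrative {\tt results} list instead of printed output:

```python
names = ['cat', 'window', 'defenestrate']
results = []

for target in ['window', 'door']:
    for name in names:
        if name == target:
            results.append(target + ' found')
            break                     # leaves the inner loop; its else is skipped
    else:
        # reached only when the inner loop ran to exhaustion
        results.append(target + ' missing')
```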
 | 
						|
 | 
						|
\section{Pass Statements}
 | 
						|
 | 
						|
The {\tt pass} statement does nothing.
 | 
						|
It can be used when a statement is required syntactically but the
 | 
						|
program requires no action.
 | 
						|
For example:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> while 1:
 | 
						|
...       pass # Busy-wait for keyboard interrupt
 | 
						|
... 
 | 
						|
\end{verbatim}\ecode
 | 
						|
 | 
						|
\section{Defining Functions}
 | 
						|
 | 
						|
We can create a function that writes the Fibonacci series to an
 | 
						|
arbitrary boundary:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> def fib(n):    # write Fibonacci series up to n
 | 
						|
...     a, b = 0, 1
 | 
						|
...     while b < n:
 | 
						|
...           print b,
 | 
						|
...           a, b = b, a+b
 | 
						|
... 
 | 
						|
>>> # Now call the function we just defined:
 | 
						|
... fib(2000)
 | 
						|
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
The keyword {\tt def} introduces a function {\em definition}.  It must
 | 
						|
be followed by the function name and the parenthesized list of formal
 | 
						|
parameters.  The statements that form the body of the function start at
 | 
						|
the next line, indented by a tab stop.
 | 
						|
 | 
						|
The {\em execution} of a function introduces a new symbol table used
for the local variables of the function.  More precisely, all variable
assignments in a function store the value in the local symbol table,
whereas variable references first look in the local symbol table, then
in the global symbol table, and then in the table of built-in names.
Thus, global variables cannot be directly assigned a value within a
function (unless named in a {\tt global} statement), although
 | 
						|
they may be referenced.
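These lookup rules can be sketched with three small functions.  The
variable {\tt counter}, the function names and the result names are all
made up for this illustration; the functions use the {\tt return}
statement to hand back a value.

```python
counter = 10

def readonly():
    return counter + 1      # reference: found in the global symbol table

def assignlocal():
    counter = 99            # assignment: creates a new local variable
    return counter

def assignglobal():
    global counter          # in this function, counter means the global
    counter = counter + 1
    return counter

r1 = readonly()             # 11
r2 = assignlocal()          # 99
afterlocal = counter        # still 10: the global was not touched
r3 = assignglobal()         # 11
afterglobal = counter       # 11: the global itself was changed
```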
 | 
						|
 | 
						|
The actual parameters (arguments) to a function call are introduced in
 | 
						|
the local symbol table of the called function when it is called; thus,
 | 
						|
arguments are passed using {\em call\ by\ value}.%
 | 
						|
\footnote{
 | 
						|
         Actually, {\em call  by  object reference} would be a better
 | 
						|
         description, since if a mutable object is passed, the caller
 | 
						|
         will see any changes the callee makes to it (e.g., items
 | 
						|
         inserted into a list).
 | 
						|
}
 | 
						|
When a function calls another function, a new local symbol table is
 | 
						|
created for that call.
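The distinction made in the footnote can be sketched as follows.  The
function names {\tt additem} and {\tt rebind} and the list
{\tt basket} are hypothetical: changing the object that was passed is
visible to the caller, while assigning to the parameter name only
changes the callee's local symbol table.

```python
def additem(items):
    items.append('new')      # changes the list object the caller passed in

def rebind(items):
    items = ['replaced']     # only gives the local name a new value

basket = ['spam']
additem(basket)              # basket is now ['spam', 'new']
rebind(basket)               # basket is unchanged by this call
```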
 | 
						|
 | 
						|
A function definition introduces the function name in the current
symbol table.  The value of the function name has a type that is
recognized by the interpreter as a user-defined function.  This value
can be assigned to another name which can then also be used as a
function.  This serves as a general renaming mechanism:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> fib
 | 
						|
<function object at 10042ed0>
 | 
						|
>>> f = fib
 | 
						|
>>> f(100)
 | 
						|
1 1 2 3 5 8 13 21 34 55 89
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
You might object that {\tt fib} is not a function but a procedure.  In
 | 
						|
Python, as in C, procedures are just functions that don't return a
 | 
						|
value.  In fact, technically speaking, procedures do return a value,
 | 
						|
albeit a rather boring one.  This value is called {\tt None} (it's a
 | 
						|
built-in name).  Writing the value {\tt None} is normally suppressed by
 | 
						|
the interpreter if it would be the only value written.  You can see it
 | 
						|
if you really want to:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> print fib(0)
 | 
						|
None
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
It is simple to write a function that returns a list of the numbers of
 | 
						|
the Fibonacci series, instead of printing it:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> def fib2(n): # return Fibonacci series up to n
 | 
						|
...     result = []
 | 
						|
...     a, b = 0, 1
 | 
						|
...     while b < n:
 | 
						|
...           result.append(b)    # see below
 | 
						|
...           a, b = b, a+b
 | 
						|
...     return result
 | 
						|
... 
 | 
						|
>>> f100 = fib2(100)    # call it
 | 
						|
>>> f100                # write the result
 | 
						|
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
This example, as usual, demonstrates some new Python features:
 | 
						|
 | 
						|
\begin{itemize}
 | 
						|
 | 
						|
\item
 | 
						|
The {\tt return} statement returns with a value from a function.  {\tt
 | 
						|
return} without an expression argument is used to return from the middle
 | 
						|
of a procedure (falling off the end also returns from a procedure), in
 | 
						|
which case the {\tt None} value is returned.
 | 
						|
 | 
						|
\item
 | 
						|
The statement {\tt result.append(b)} calls a {\em method} of the list
 | 
						|
object {\tt result}.  A method is a function that `belongs' to an
 | 
						|
object and is named {\tt obj.methodname}, where {\tt obj} is some
 | 
						|
object (this may be an expression), and {\tt methodname} is the name
 | 
						|
of a method that is defined by the object's type.  Different types
 | 
						|
define different methods.  Methods of different types may have the
 | 
						|
same name without causing ambiguity.  (It is possible to define your
 | 
						|
own object types and methods, using {\em classes}, as discussed later
 | 
						|
in this tutorial.)
 | 
						|
The method {\tt append} shown in the example is defined for
 | 
						|
list objects; it adds a new element at the end of the list.  In this
 | 
						|
example
 | 
						|
it is equivalent to {\tt result = result + [b]}, but more efficient.
 | 
						|
 | 
						|
\end{itemize}
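The two forms are equivalent in {\tt fib2} only because no other name
refers to the list.  The following sketch, with an illustrative second
name {\tt alias}, shows where they differ: {\tt append} changes the
list in place, while {\tt +} builds a new list.

```python
result = [1, 1, 2]
alias = result               # a second name for the same list object

result.append(3)             # in place: alias sees the new element
afterappend = alias[:]       # [1, 1, 2, 3]

result = result + [5]        # builds a NEW list and assigns it to result
afterconcat = alias[:]       # alias still refers to the old list
```

Building a new list each time also means copying all the old items,
which is one reason {\tt append} is the more efficient choice in a
loop.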
 | 
						|
 | 
						|
 | 
						|
\chapter{Odds and Ends}
 | 
						|
 | 
						|
This chapter describes some things you've learned about already in
 | 
						|
more detail, and adds some new things as well.
 | 
						|
 | 
						|
\section{More on Lists}
 | 
						|
 | 
						|
The list data type has some more methods.  Here are all of the methods
 | 
						|
of list objects:
 | 
						|
 | 
						|
\begin{description}
 | 
						|
 | 
						|
\item[{\tt insert(i, x)}]
 | 
						|
Insert an item at a given position.  The first argument is the index of
 | 
						|
the element before which to insert, so {\tt a.insert(0, x)} inserts at
 | 
						|
the front of the list, and {\tt a.insert(len(a), x)} is equivalent to
 | 
						|
{\tt a.append(x)}.
 | 
						|
 | 
						|
\item[{\tt append(x)}]
 | 
						|
Equivalent to {\tt a.insert(len(a), x)}.
 | 
						|
 | 
						|
\item[{\tt index(x)}]
 | 
						|
Return the index in the list of the first item whose value is {\tt x}.
 | 
						|
It is an error if there is no such item.
 | 
						|
 | 
						|
\item[{\tt remove(x)}]
 | 
						|
Remove the first item from the list whose value is {\tt x}.
 | 
						|
It is an error if there is no such item.
 | 
						|
 | 
						|
\item[{\tt sort()}]
 | 
						|
Sort the items of the list, in place.
 | 
						|
 | 
						|
\item[{\tt reverse()}]
 | 
						|
Reverse the elements of the list, in place.
 | 
						|
 | 
						|
\item[{\tt count(x)}]
 | 
						|
Return the number of times {\tt x} appears in the list.
 | 
						|
 | 
						|
\end{description}
 | 
						|
 | 
						|
An example that uses all list methods:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> a = [66.6, 333, 333, 1, 1234.5]
 | 
						|
>>> print a.count(333), a.count(66.6), a.count('x')
 | 
						|
2 1 0
 | 
						|
>>> a.insert(2, -1)
 | 
						|
>>> a.append(333)
 | 
						|
>>> a
 | 
						|
[66.6, 333, -1, 333, 1, 1234.5, 333]
 | 
						|
>>> a.index(333)
 | 
						|
1
 | 
						|
>>> a.remove(333)
 | 
						|
>>> a
 | 
						|
[66.6, -1, 333, 1, 1234.5, 333]
 | 
						|
>>> a.reverse()
 | 
						|
>>> a
 | 
						|
[333, 1234.5, 1, 333, -1, 66.6]
 | 
						|
>>> a.sort()
 | 
						|
>>> a
 | 
						|
[-1, 1, 66.6, 333, 333, 1234.5]
 | 
						|
>>>
 | 
						|
\end{verbatim}\ecode
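Since {\tt index} and {\tt remove} are errors when there is no matching
item, one way to guard against this, using only the methods listed
above, is to test with {\tt count} first.  This is a sketch; the list
{\tt removed} is an illustrative addition, and the extra call to
{\tt count} costs one more pass over the list.

```python
a = [66.6, 333, 333, 1, 1234.5]
removed = []

for x in [333, 'x']:
    if a.count(x) > 0:       # only remove values that are actually present
        a.remove(x)          # removes the first matching item only
        removed.append(x)
```

Afterwards {\tt a} still contains one {\tt 333}, and the absent value
{\tt 'x'} was skipped without an error.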
 | 
						|
 | 
						|
\section{The {\tt del} statement}
 | 
						|
 | 
						|
There is a way to remove an item from a list given its index instead
 | 
						|
of its value: the {\tt del} statement.  This can also be used to
 | 
						|
remove slices from a list (which we did earlier by assignment of an
 | 
						|
empty list to the slice).  For example:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> a
 | 
						|
[-1, 1, 66.6, 333, 333, 1234.5]
 | 
						|
>>> del a[0]
 | 
						|
>>> a
 | 
						|
[1, 66.6, 333, 333, 1234.5]
 | 
						|
>>> del a[2:4]
 | 
						|
>>> a
 | 
						|
[1, 66.6, 1234.5]
 | 
						|
>>>
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
{\tt del} can also be used to delete entire variables:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> del a
 | 
						|
>>>
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
Referencing the name {\tt a} hereafter is an error (at least until
 | 
						|
another value is assigned to it).  We'll find other uses for {\tt del}
 | 
						|
later.
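For lists, {\tt del} and assignment of an empty list to a slice have
the same effect.  A minimal sketch comparing the two on a list and an
independent copy of it (the name {\tt b} is illustrative):

```python
a = [-1, 1, 66.6, 333, 333, 1234.5]
b = a[:]             # an independent copy to compare against

del a[2:4]           # remove items 2 and 3 with del
b[2:4] = []          # the same removal by slice assignment

del a[0]             # remove a single item by index
b[0:1] = []          # the same, written as a one-item slice
```

Both lists end up as {\tt [1, 333, 1234.5]}.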
 | 
						|
 | 
						|
\section{Tuples and Sequences}
 | 
						|
 | 
						|
We saw that lists and strings have many common properties, e.g.,
 | 
						|
indexing and slicing operations.  They are two examples of {\em
 | 
						|
sequence} data types.  Since Python is an evolving language, other
 | 
						|
sequence data types may be added.  There is also another standard
 | 
						|
sequence data type: the {\em tuple}.
 | 
						|
 | 
						|
A tuple consists of a number of values separated by commas, for
 | 
						|
instance:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> t = 12345, 54321, 'hello!'
 | 
						|
>>> t[0]
 | 
						|
12345
 | 
						|
>>> t
 | 
						|
(12345, 54321, 'hello!')
 | 
						|
>>> # Tuples may be nested:
 | 
						|
... u = t, (1, 2, 3, 4, 5)
 | 
						|
>>> u
 | 
						|
((12345, 54321, 'hello!'), (1, 2, 3, 4, 5))
 | 
						|
>>>
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
As you see, on output tuples are always enclosed in parentheses, so
 | 
						|
that nested tuples are interpreted correctly; they may be input with
 | 
						|
or without surrounding parentheses, although often parentheses are
 | 
						|
necessary anyway (if the tuple is part of a larger expression).
 | 
						|
 | 
						|
Tuples have many uses, e.g., (x, y) coordinate pairs, employee records
 | 
						|
from a database, etc.  Tuples, like strings, are immutable: it is not
 | 
						|
possible to assign to the individual items of a tuple (you can
 | 
						|
simulate much of the same effect with slicing and concatenation,
 | 
						|
though).
 | 
						|
 | 
						|
A special problem is the construction of tuples containing 0 or 1
 | 
						|
items: the syntax has some extra quirks to accommodate these.  Empty
 | 
						|
tuples are constructed by an empty pair of parentheses; a tuple with
 | 
						|
one item is constructed by following a value with a comma
 | 
						|
(it is not sufficient to enclose a single value in parentheses).
 | 
						|
Ugly, but effective.  For example:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> empty = ()
 | 
						|
>>> singleton = 'hello',    # <-- note trailing comma
 | 
						|
>>> len(empty)
 | 
						|
0
 | 
						|
>>> len(singleton)
 | 
						|
1
 | 
						|
>>> singleton
 | 
						|
('hello',)
 | 
						|
>>>
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
The statement {\tt t = 12345, 54321, 'hello!'} is an example of {\em
 | 
						|
tuple packing}: the values {\tt 12345}, {\tt 54321} and {\tt 'hello!'}
 | 
						|
are packed together in a tuple.  The reverse operation is also
 | 
						|
possible, e.g.:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> x, y, z = t
 | 
						|
>>>
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
This is called, appropriately enough, {\em tuple unpacking}.  Tuple
 | 
						|
unpacking requires that the list of variables on the left has the same
 | 
						|
number of elements as the length of the tuple.  Note that multiple
 | 
						|
assignment is really just a combination of tuple packing and tuple
 | 
						|
unpacking!
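A common use of this combination is swapping two variables without a
temporary.  The sketch below also repeats the point about one-item
tuples; the names {\tt pair} and {\tt plain} are illustrative.

```python
t = 12345, 54321, 'hello!'   # tuple packing
x, y, z = t                  # tuple unpacking: three names, three items

x, y = y, x                  # swap: the right side is packed first
pair = x, y                  # (54321, 12345)

plain = ('hello')            # parentheses alone: this is still a string
singleton = 'hello',         # the trailing comma makes it a tuple
```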
 | 
						|
 | 
						|
Occasionally, the corresponding operation on lists is useful: {\em list
 | 
						|
unpacking}.  This is supported by enclosing the list of variables in
 | 
						|
square brackets:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> a = ['spam', 'eggs', 100, 1234]
 | 
						|
>>> [a1, a2, a3, a4] = a
 | 
						|
>>>
 | 
						|
\end{verbatim}\ecode
 | 
						|
 | 
						|
\section{Dictionaries}
 | 
						|
 | 
						|
Another useful data type built into Python is the {\em dictionary}.
 | 
						|
Dictionaries are sometimes found in other languages as ``associative
 | 
						|
memories'' or ``associative arrays''.  Unlike sequences, which are
 | 
						|
indexed by a range of numbers, dictionaries are indexed by {\em keys},
 | 
						|
which are strings (the use of non-string values as keys
 | 
						|
is supported, but beyond the scope of this tutorial).
 | 
						|
It is best to think of a dictionary as an unordered set of
 | 
						|
{\em key:value} pairs, with the requirement that the keys are unique
 | 
						|
(within one dictionary).
 | 
						|
A pair of braces creates an empty dictionary: \verb/{}/.
 | 
						|
Placing a comma-separated list of key:value pairs within the
 | 
						|
braces adds initial key:value pairs to the dictionary; this is also the
 | 
						|
way dictionaries are written on output.
 | 
						|
 | 
						|
The main operations on a dictionary are storing a value with some key
and extracting the value given the key.  It is also possible to delete
a key:value pair with {\tt del}.
If you store using a key that is already in use, the old value
associated with that key is forgotten.  It is an error to extract a
value using a non-existent key.
 | 
						|
 | 
						|
The {\tt keys()} method of a dictionary object returns a list of all the
 | 
						|
keys used in the dictionary, in arbitrary order (if you want it sorted,
 | 
						|
just apply the {\tt sort()} method to the list of keys).  To check
 | 
						|
whether a single key is in the dictionary, use the \verb/has_key()/
 | 
						|
method of the dictionary.
 | 
						|
 | 
						|
Here is a small example using a dictionary:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> tel = {'jack': 4098, 'sape': 4139}
 | 
						|
>>> tel['guido'] = 4127
 | 
						|
>>> tel
 | 
						|
{'sape': 4139, 'guido': 4127, 'jack': 4098}
 | 
						|
>>> tel['jack']
 | 
						|
4098
 | 
						|
>>> del tel['sape']
 | 
						|
>>> tel['irv'] = 4127
 | 
						|
>>> tel
 | 
						|
{'guido': 4127, 'irv': 4127, 'jack': 4098}
 | 
						|
>>> tel.keys()
 | 
						|
['guido', 'irv', 'jack']
 | 
						|
>>> tel.has_key('guido')
 | 
						|
1
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
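Storing with a key that is already in use is not shown in the session
above.  A small sketch, with the illustrative names {\tt old} and
{\tt new}:

```python
tel = {'jack': 4098, 'sape': 4139}

old = tel['jack']        # 4098
tel['jack'] = 4099       # key already in use: the old value is forgotten
new = tel['jack']        # 4099

tel['guido'] = 4127      # a key not yet in use: a new pair is added
del tel['sape']          # delete a key:value pair
```

The dictionary now holds exactly the pairs for {\tt 'jack'} and
{\tt 'guido'}.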
 | 
						|
 | 
						|
\section{More on Conditions}
 | 
						|
 | 
						|
The conditions used in {\tt while} and {\tt if} statements above can
 | 
						|
contain other operators besides comparisons.
 | 
						|
 | 
						|
The comparison operators {\tt in} and {\tt not in} check whether a value
 | 
						|
occurs (does not occur) in a sequence.  The operators {\tt is} and {\tt
 | 
						|
is not} compare whether two objects are really the same object; this
 | 
						|
only matters for mutable objects like lists.  All comparison operators
 | 
						|
have the same priority, which is lower than that of all numerical
 | 
						|
operators.
 | 
						|
 | 
						|
Comparisons can be chained: e.g., {\tt a < b == c} tests whether {\tt a}
 | 
						|
is less than {\tt b} and moreover {\tt b} equals {\tt c}.
 | 
						|
 | 
						|
Comparisons may be combined by the Boolean operators {\tt and} and {\tt
 | 
						|
or}, and the outcome of a comparison (or of any other Boolean
 | 
						|
expression) may be negated with {\tt not}.  These all have lower
 | 
						|
priorities than comparison operators again; between them, {\tt not} has
 | 
						|
the highest priority, and {\tt or} the lowest, so that
 | 
						|
{\tt A and not B or C} is equivalent to {\tt (A and (not B)) or C}.  Of
 | 
						|
course, parentheses can be used to express the desired composition.
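The priorities can be made concrete with one set of values for which
the grouping matters.  This is a sketch; the values, and the names
{\tt implicit}, {\tt explicit}, {\tt regrouped}, {\tt chained} and
{\tt grouped}, are chosen for illustration.

```python
A, B, C = 0, 0, 1

implicit = A and not B or C          # grouped as (A and (not B)) or C
explicit = (A and (not B)) or C      # the same grouping written out
regrouped = A and not (B or C)       # a different grouping, for contrast

a, b, c = 1, 2, 2
chained = a < b == c                 # true: a < b and moreover b == c
grouped = (a < b) == c               # false: compares the outcome of a < b with c
```

Here {\tt implicit} and {\tt explicit} are both true, while
{\tt regrouped} is false.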
 | 
						|
 | 
						|
The Boolean operators {\tt and} and {\tt or} are so-called {\em
 | 
						|
shortcut} operators: their arguments are evaluated from left to right,
 | 
						|
and evaluation stops as soon as the outcome is determined.  E.g., if
 | 
						|
{\tt A} and {\tt C} are true but {\tt B} is false, {\tt A and B and C}
 | 
						|
does not evaluate the expression {\tt C}.  In general, the return value of a
 | 
						|
shortcut operator, when used as a general value and not as a Boolean, is
 | 
						|
the last evaluated argument.
 | 
						|
 | 
						|
It is possible to assign the result of a comparison or other Boolean
 | 
						|
expression to a variable.  For example,
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
 | 
						|
>>> non_null = string1 or string2 or string3
 | 
						|
>>> non_null
 | 
						|
'Trondheim'
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
Note that in Python, unlike C, assignment cannot occur inside expressions.
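That evaluation really stops early can be observed by recording which
arguments get evaluated.  In this sketch the helper function
{\tt check} and the list {\tt calls} are hypothetical, introduced only
to make the evaluation order visible.

```python
calls = []

def check(name, value):
    calls.append(name)       # record that this argument was evaluated
    return value

outcome = check('A', 1) and check('B', 0) and check('C', 1)
# calls is now ['A', 'B']: C was never evaluated.
# outcome is 0, the last argument that was evaluated.

default = '' or [] or 'fallback'     # the first true argument is returned
```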
 | 
						|
 | 
						|
\section{Comparing Sequences and Other Types}
 | 
						|
 | 
						|
Sequence objects may be compared to other objects with the same
 | 
						|
sequence type.  The comparison uses {\em lexicographical} ordering:
 | 
						|
first the first two items are compared, and if they differ this
 | 
						|
determines the outcome of the comparison; if they are equal, the next
 | 
						|
two items are compared, and so on, until either sequence is exhausted.
 | 
						|
If two items to be compared are themselves sequences of the same type,
 | 
						|
the lexicographical comparison is carried out recursively.  If all
 | 
						|
items of two sequences compare equal, the sequences are considered
 | 
						|
equal.  If one sequence is an initial subsequence of the other, the
 | 
						|
shorter sequence is the smaller one.  Lexicographical ordering for
 | 
						|
strings uses the ASCII ordering for individual characters.  Some
 | 
						|
examples of comparisons between sequences with the same types:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
(1, 2, 3)              < (1, 2, 4)
 | 
						|
[1, 2, 3]              < [1, 2, 4]
 | 
						|
'ABC' < 'C' < 'Pascal' < 'Python'
 | 
						|
(1, 2, 3, 4)           < (1, 2, 4)
 | 
						|
(1, 2)                 < (1, 2, -1)
 | 
						|
(1, 2, 3)             == (1.0, 2.0, 3.0)
 | 
						|
(1, 2, ('aa', 'ab'))   < (1, 2, ('abc', 'a'), 4)
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
Note that comparing objects of different types is legal.  The outcome
 | 
						|
is deterministic but arbitrary: the types are ordered by their name.
 | 
						|
Thus, a list is always smaller than a string, a string is always
 | 
						|
smaller than a tuple, etc.  Mixed numeric types are compared according
 | 
						|
to their numeric value, so 0 equals 0.0, etc.%
 | 
						|
\footnote{
 | 
						|
        The rules for comparing objects of different types should
 | 
						|
        not be relied upon; they may change in a future version of
 | 
						|
        the language.
 | 
						|
}
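The comparisons between sequences of the same type listed above can be
checked by collecting their outcomes in a list.  This is a sketch; the
names {\tt checks} and {\tt failed} are illustrative, and comparisons
between different types are deliberately left out, since their outcome
should not be relied upon.

```python
checks = [
    (1, 2, 3) < (1, 2, 4),                # the first differing item decides
    [1, 2, 3] < [1, 2, 4],
    'ABC' < 'C' < 'Pascal' < 'Python',    # chained string comparisons
    (1, 2, 3, 4) < (1, 2, 4),             # 3 < 4 decides; length is not consulted
    (1, 2) < (1, 2, -1),                  # an initial subsequence is smaller
    (1, 2, 3) == (1.0, 2.0, 3.0),         # numeric items compare by value
    (1, 2, ('aa', 'ab')) < (1, 2, ('abc', 'a'), 4),   # nested: recursive
]
failed = checks.count(0)                  # number of comparisons that were false
```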
 | 
						|
 | 
						|
 | 
						|
\chapter{Modules}
 | 
						|
 | 
						|
If you quit from the Python interpreter and enter it again, the
 | 
						|
definitions you have made (functions and variables) are lost.
 | 
						|
Therefore, if you want to write a somewhat longer program, you are
 | 
						|
better off using a text editor to prepare the input for the interpreter
 | 
						|
and running it with that file as input instead.  This is known as creating a
 | 
						|
{\em script}.  As your program gets longer, you may want to split it
 | 
						|
into several files for easier maintenance.  You may also want to use a
 | 
						|
handy function that you've written in several programs without copying
 | 
						|
its definition into each program.
 | 
						|
 | 
						|
To support this, Python has a way to put definitions in a file and use
 | 
						|
them in a script or in an interactive instance of the interpreter.
 | 
						|
Such a file is called a {\em module}; definitions from a module can be
 | 
						|
{\em imported} into other modules or into the {\em main} module (the
 | 
						|
collection of variables that you have access to in a script
 | 
						|
executed at the top level
 | 
						|
and in calculator mode).
 | 
						|
 | 
						|
A module is a file containing Python definitions and statements.  The
 | 
						|
file name is the module name with the suffix {\tt .py} appended.  Within
 | 
						|
a module, the module's name (as a string) is available as the value of
 | 
						|
the global variable {\tt __name__}.  For instance, use your favorite text
 | 
						|
editor to create a file called {\tt fibo.py} in the current directory
 | 
						|
with the following contents:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
# Fibonacci numbers module
 | 
						|
 | 
						|
def fib(n):    # write Fibonacci series up to n
 | 
						|
    a, b = 0, 1
 | 
						|
    while b < n:
 | 
						|
        print b,
        a, b = b, a+b
 | 
						|
 | 
						|
def fib2(n): # return Fibonacci series up to n
 | 
						|
    result = []
 | 
						|
    a, b = 0, 1
 | 
						|
    while b < n:
 | 
						|
        result.append(b)
        a, b = b, a+b
 | 
						|
    return result
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
Now enter the Python interpreter and import this module with the
 | 
						|
following command:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> import fibo
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
This does not enter the names of the functions defined in
 | 
						|
{\tt fibo}
 | 
						|
directly in the current symbol table; it only enters the module name
 | 
						|
{\tt fibo}
 | 
						|
there.
 | 
						|
Using the module name you can access the functions:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> fibo.fib(1000)
 | 
						|
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
 | 
						|
>>> fibo.fib2(100)
 | 
						|
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
 | 
						|
>>> fibo.__name__
 | 
						|
'fibo'
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
If you intend to use a function often you can assign it to a local name:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> fib = fibo.fib
 | 
						|
>>> fib(500)
 | 
						|
1 1 2 3 5 8 13 21 34 55 89 144 233 377
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
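
The whole cycle of writing a module file and importing it can also be
tried from a single script.  The following is a minimal sketch for a
current Python 3 interpreter; the module name {\tt fibo_demo} and the
temporary directory are assumptions made for the illustration, and
{\tt importlib.import_module} is used here in place of a plain
{\tt import} statement only so that the file can be created first.

```python
import importlib
import os
import sys
import tempfile

# Source text of a small module; "fibo_demo" is a made-up name.
SOURCE = '''
def fib2(n):
    """Return the Fibonacci series up to n as a list."""
    result = []
    a, b = 0, 1
    while b < n:
        result.append(b)
        a, b = b, a + b
    return result
'''

# Write the module file into a fresh temporary directory.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, 'fibo_demo.py'), 'w') as f:
    f.write(SOURCE)

# Make the directory visible to the import machinery, then import.
sys.path.insert(0, module_dir)
importlib.invalidate_caches()
fibo_demo = importlib.import_module('fibo_demo')

print(fibo_demo.__name__)
print(fibo_demo.fib2(100))

fib2 = fibo_demo.fib2          # a shorter local name for the function
print(fib2(10))
```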
 | 
						|
 | 
						|
\section{More on Modules}
 | 
						|
 | 
						|
A module can contain executable statements as well as function
 | 
						|
definitions.
 | 
						|
These statements are intended to initialize the module.
 | 
						|
They are executed only the
 | 
						|
{\em first}
 | 
						|
time the module is imported somewhere.%
 | 
						|
\footnote{
 | 
						|
        In fact function definitions are also `statements' that are
 | 
						|
        `executed'; the execution enters the function name in the
 | 
						|
        module's global symbol table.
 | 
						|
}
 | 
						|
 | 
						|
Each module has its own private symbol table, which is used as the
 | 
						|
global symbol table by all functions defined in the module.
 | 
						|
Thus, the author of a module can use global variables in the module
 | 
						|
without worrying about accidental clashes with a user's global
 | 
						|
variables.
 | 
						|
On the other hand, if you know what you are doing you can touch a
 | 
						|
module's global variables with the same notation used to refer to its
 | 
						|
functions,
 | 
						|
{\tt modname.itemname}.
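
A short experiment illustrates both points: the module body runs only on
the first import, and the module's globals are separate from those of the
importer.  This is a sketch for a current Python 3 interpreter; the
module name {\tt counter_demo} and its contents are made up for the
illustration.

```python
import importlib
import os
import sys
import tempfile

SOURCE = '''
counter = 0                 # a global variable private to this module

def bump():
    global counter          # means the module's counter, not the importer's
    counter = counter + 1
    return counter
'''

module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, 'counter_demo.py'), 'w') as f:
    f.write(SOURCE)
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

counter = 100                                      # the importer's own name
first = importlib.import_module('counter_demo')
first.bump()
first.bump()
second = importlib.import_module('counter_demo')   # body is not run again

print(counter)            # the importer's variable is untouched
print(second.counter)     # the module's variable, reached as modname.itemname
print(first is second)    # both names refer to the one module object
```

If the module body had been executed a second time, {\tt counter} in the
module would have been reset to zero.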
 | 
						|
 | 
						|
Modules can import other modules.
 | 
						|
It is customary but not required to place all
 | 
						|
{\tt import}
 | 
						|
statements at the beginning of a module (or script, for that matter).
 | 
						|
The imported module names are placed in the importing module's global
 | 
						|
symbol table.
 | 
						|
 | 
						|
There is a variant of the
 | 
						|
{\tt import}
 | 
						|
statement that imports names from a module directly into the importing
 | 
						|
module's symbol table.
 | 
						|
For example:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> from fibo import fib, fib2
 | 
						|
>>> fib(500)
 | 
						|
1 1 2 3 5 8 13 21 34 55 89 144 233 377
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
This does not introduce the module name from which the imports are taken
 | 
						|
in the local symbol table (so in the example, {\tt fibo} is not
 | 
						|
defined).
 | 
						|
 | 
						|
There is even a variant to import all names that a module defines:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> from fibo import *
 | 
						|
>>> fib(500)
 | 
						|
1 1 2 3 5 8 13 21 34 55 89 144 233 377
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
This imports all names except those beginning with an underscore
 | 
						|
({\tt _}).
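
The underscore rule can be checked with a small in-memory module.  This
is a sketch for a current Python 3 interpreter; the module name
{\tt star_demo} and its two names are invented, and the import is run in
a separate dictionary only so that the result is easy to inspect.

```python
import sys
import types

# Build a small module in memory; "star_demo" and its contents are made up.
star_demo = types.ModuleType('star_demo')
exec('def fib(n):\n    return n\n\n_cache = {}\n', star_demo.__dict__)
sys.modules['star_demo'] = star_demo

# Run the import in a separate name space so the result is easy to inspect.
namespace = {}
exec('from star_demo import *', namespace)

# exec() itself adds '__builtins__' to the dictionary; leave it out.
imported = sorted(name for name in namespace if name != '__builtins__')
print(imported)
```

Only {\tt fib} arrives in the importing name space; {\tt _cache} stays
reachable as {\tt star_demo._cache}.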
 | 
						|
 | 
						|
\section{Standard Modules}
 | 
						|
 | 
						|
Python comes with a library of standard modules, described in a separate
 | 
						|
document (Python Library Reference).  Some modules are built into the
 | 
						|
interpreter; these provide access to operations that are not part of the
 | 
						|
core of the language but are nevertheless built in, either for
 | 
						|
efficiency or to provide access to operating system primitives such as
 | 
						|
system calls.  The set of such modules is a configuration option; e.g.,
 | 
						|
the {\tt amoeba} module is only provided on systems that somehow support
 | 
						|
Amoeba primitives.  One particular module deserves some attention: {\tt
 | 
						|
sys}, which is built into every Python interpreter.  The variables {\tt
 | 
						|
sys.ps1} and {\tt sys.ps2} define the strings used as primary and
 | 
						|
secondary prompts:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> import sys
 | 
						|
>>> sys.ps1
 | 
						|
'>>> '
 | 
						|
>>> sys.ps2
 | 
						|
'... '
 | 
						|
>>> sys.ps1 = 'C> '
 | 
						|
C> print 'Yuck!'
 | 
						|
Yuck!
 | 
						|
C> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
These two variables are only defined if the interpreter is in
 | 
						|
interactive mode.
 | 
						|
 | 
						|
The variable
 | 
						|
{\tt sys.path}
 | 
						|
is a list of strings that determine the interpreter's search path for
 | 
						|
modules.
 | 
						|
It is initialized to a default path taken from the environment variable
 | 
						|
{\tt PYTHONPATH},
 | 
						|
or from a built-in default if
 | 
						|
{\tt PYTHONPATH}
 | 
						|
is not set.
 | 
						|
You can modify it using standard list operations, e.g.:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> import sys
 | 
						|
>>> sys.path.append('/ufs/guido/lib/python')
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
 | 
						|
\section{The {\tt dir()} function}
 | 
						|
 | 
						|
The built-in function {\tt dir} is used to find out which names a module
 | 
						|
defines.  It returns a sorted list of strings:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> import fibo, sys
 | 
						|
>>> dir(fibo)
 | 
						|
['__name__', 'fib', 'fib2']
 | 
						|
>>> dir(sys)
 | 
						|
['__name__', 'argv', 'builtin_module_names', 'copyright', 'exit',
 | 
						|
'maxint', 'modules', 'path', 'ps1', 'ps2', 'setprofile', 'settrace',
 | 
						|
'stderr', 'stdin', 'stdout', 'version']
 | 
						|
>>>
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
Without arguments, {\tt dir()} lists the names you have defined currently:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> a = [1, 2, 3, 4, 5]
 | 
						|
>>> import fibo, sys
 | 
						|
>>> fib = fibo.fib
 | 
						|
>>> dir()
 | 
						|
['__name__', 'a', 'fib', 'fibo', 'sys']
 | 
						|
>>>
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
Note that it lists all types of names: variables, modules, functions, etc.
 | 
						|
 | 
						|
{\tt dir()} does not list the names of built-in functions and variables.
 | 
						|
If you want a list of those, they are defined in the standard module
 | 
						|
{\tt __builtin__}:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> import __builtin__
 | 
						|
>>> dir(__builtin__)
 | 
						|
['AccessError', 'AttributeError', 'ConflictError', 'EOFError', 'IOError',
 | 
						|
'ImportError', 'IndexError', 'KeyError', 'KeyboardInterrupt',
 | 
						|
'MemoryError', 'NameError', 'None', 'OverflowError', 'RuntimeError',
 | 
						|
'SyntaxError', 'SystemError', 'SystemExit', 'TypeError', 'ValueError',
 | 
						|
'ZeroDivisionError', '__name__', 'abs', 'apply', 'chr', 'cmp', 'coerce',
 | 
						|
'compile', 'dir', 'divmod', 'eval', 'execfile', 'filter', 'float',
 | 
						|
'getattr', 'hasattr', 'hash', 'hex', 'id', 'input', 'int', 'len', 'long',
 | 
						|
'map', 'max', 'min', 'oct', 'open', 'ord', 'pow', 'range', 'raw_input',
 | 
						|
'reduce', 'reload', 'repr', 'round', 'setattr', 'str', 'type', 'xrange']
 | 
						|
>>>
 | 
						|
\end{verbatim}\ecode
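
The same inspection can be scripted.  The sketch below assumes a current
Python 3 interpreter, where the module of built-in names is called
{\tt builtins} rather than {\tt __builtin__}; the module {\tt demo} is
made up for the illustration.

```python
import builtins
import types

demo = types.ModuleType('demo')      # "demo" is a made-up module name
demo.fib2 = lambda n: [n]
demo.fib = lambda n: n

names = dir(demo)
# Hide the double-underscore names that every module carries.
public = [name for name in names if not name.startswith('__')]

print(public)
print(names == sorted(names))                    # dir() returns a sorted list
print('abs' in dir(builtins), 'abs' in public)   # built-ins live elsewhere
```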
 | 
						|
 | 
						|
 | 
						|
\chapter{Output Formatting}
 | 
						|
 | 
						|
So far we've encountered two ways of writing values: {\em expression
 | 
						|
statements} and the {\tt print} statement.  (A third way is using the
 | 
						|
{\tt write} method of file objects; the standard output file can be
 | 
						|
referenced as {\tt sys.stdout}.  See the Library Reference for more
 | 
						|
information on this.)
 | 
						|
 | 
						|
Often you'll want more control over the formatting of your output than
 | 
						|
simply printing space-separated values.  The key to nice formatting in
 | 
						|
Python is to do all the string handling yourself; using string slicing
 | 
						|
and concatenation operations you can create any lay-out you can imagine.
 | 
						|
The standard module {\tt string} contains some useful operations for
 | 
						|
padding strings to a given column width; these will be discussed shortly.
 | 
						|
Finally, the \code{\%} operator (modulo) with a string left argument
 | 
						|
interprets this string as a C sprintf format string to be applied to the
 | 
						|
right argument, and returns the string resulting from this formatting
 | 
						|
operation.
 | 
						|
 | 
						|
One question remains, of course: how do you convert values to strings?
 | 
						|
Luckily, Python has a way to convert any value to a string: just write
 | 
						|
the value between reverse quotes (\verb/``/).  Some examples:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> x = 10 * 3.14
 | 
						|
>>> y = 200*200
 | 
						|
>>> s = 'The value of x is ' + `x` + ', and y is ' + `y` + '...'
 | 
						|
>>> print s
 | 
						|
The value of x is 31.4, and y is 40000...
 | 
						|
>>> # Reverse quotes work on other types besides numbers:
 | 
						|
... p = [x, y]
 | 
						|
>>> ps = `p`
 | 
						|
>>> ps
 | 
						|
'[31.4, 40000]'
 | 
						|
>>> # Converting a string adds string quotes and backslashes:
 | 
						|
... hello = 'hello, world\n'
 | 
						|
>>> hellos = `hello`
 | 
						|
>>> print hellos
 | 
						|
'hello, world\012'
 | 
						|
>>> # The argument of reverse quotes may be a tuple:
 | 
						|
... `x, y, ('spam', 'eggs')`
 | 
						|
"(31.4, 40000, ('spam', 'eggs'))"
 | 
						|
>>>
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
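Current Python 3 interpreters no longer accept reverse quotes; the
built-in function {\tt repr()} performs the same conversion.  The sketch
below repeats the conversions under that assumption.  Note that it
assigns {\tt 31.4} directly, and that a current interpreter writes the
newline escape in a different form than the octal one shown above.

```python
x = 31.4
y = 200 * 200

s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'
print(s)

# Converting a string adds string quotes and backslash escapes.
hello = 'hello, world\n'
hellos = repr(hello)
print(hellos)

# The argument may be a tuple (or any other value).
print(repr((x, y, ('spam', 'eggs'))))
```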
 | 
						|
Here are two ways to write a table of squares and cubes:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> import string
 | 
						|
>>> for x in range(1, 11):
 | 
						|
...     print string.rjust(`x`, 2), string.rjust(`x*x`, 3),
 | 
						|
...     # Note trailing comma on previous line
 | 
						|
...     print string.rjust(`x*x*x`, 4)
 | 
						|
...
 | 
						|
 1   1    1
 | 
						|
 2   4    8
 | 
						|
 3   9   27
 | 
						|
 4  16   64
 | 
						|
 5  25  125
 | 
						|
 6  36  216
 | 
						|
 7  49  343
 | 
						|
 8  64  512
 | 
						|
 9  81  729
 | 
						|
10 100 1000
 | 
						|
>>> for x in range(1,11):
 | 
						|
...     print '%2d %3d %4d' % (x, x*x, x*x*x)
 | 
						|
... 
 | 
						|
 1   1    1
 | 
						|
 2   4    8
 | 
						|
 3   9   27
 | 
						|
 4  16   64
 | 
						|
 5  25  125
 | 
						|
 6  36  216
 | 
						|
 7  49  343
 | 
						|
 8  64  512
 | 
						|
 9  81  729
 | 
						|
10 100 1000
 | 
						|
>>>
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
(Note that one space between each column was added by the way {\tt print}
 | 
						|
works: it always adds spaces between its arguments.)
 | 
						|
 | 
						|
This example demonstrates the function {\tt string.rjust()}, which
 | 
						|
right-justifies a string in a field of a given width by padding it with
 | 
						|
spaces on the left.  There are similar functions {\tt string.ljust()}
 | 
						|
and {\tt string.center()}.  These functions do not write anything; they
just return a new string.  If the input string is too long, they don't
 | 
						|
truncate it, but return it unchanged; this will mess up your column
 | 
						|
lay-out but that's usually better than the alternative, which would be
 | 
						|
lying about a value.  (If you really want truncation you can always add
 | 
						|
a slice operation, as in {\tt string.ljust(x,~n)[0:n]}.)
 | 
						|
 | 
						|
There is another function, {\tt string.zfill}, which pads a numeric
 | 
						|
string on the left with zeros.  It understands plus and minus
 | 
						|
signs:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> string.zfill('12', 5)
 | 
						|
'00012'
 | 
						|
>>> string.zfill('-3.14', 7)
 | 
						|
'-003.14'
 | 
						|
>>> string.zfill('3.14159265359', 5)
 | 
						|
'3.14159265359'
 | 
						|
>>>
 | 
						|
\end{verbatim}\ecode
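
The padding operations can be compared side by side in a script.  The
sketch assumes a current Python 3 interpreter, where {\tt rjust},
{\tt ljust} and {\tt zfill} are methods of string objects rather than
functions in the {\tt string} module, and {\tt repr()} stands in for
reverse quotes.

```python
manual = []
formatted = []
for x in range(1, 11):
    # Pad each converted value by hand, then join with single spaces.
    manual.append(' '.join([repr(x).rjust(2),
                            repr(x * x).rjust(3),
                            repr(x * x * x).rjust(4)]))
    # The same row produced by a format string.
    formatted.append('%2d %3d %4d' % (x, x * x, x * x * x))

print('\n'.join(manual))
print(manual == formatted)

# Padding never truncates; add a slice if truncation is really wanted.
print('toolong'.ljust(3), 'toolong'.ljust(3)[0:3])

# zfill keeps a leading sign in front of the inserted zeros.
print('-3.14'.zfill(7))
```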
 | 
						|
 | 
						|
 | 
						|
\chapter{Errors and Exceptions}
 | 
						|
 | 
						|
Until now error messages haven't been more than mentioned, but if you
 | 
						|
have tried out the examples you have probably seen some.  There are
 | 
						|
(at least) two distinguishable kinds of errors: {\em syntax\ errors}
 | 
						|
and {\em exceptions}.
 | 
						|
 | 
						|
\section{Syntax Errors}
 | 
						|
 | 
						|
Syntax errors, also known as parsing errors, are perhaps the most common
 | 
						|
kind of complaint you get while you are still learning Python:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> while 1 print 'Hello world'
 | 
						|
  File "<stdin>", line 1
 | 
						|
    while 1 print 'Hello world'
 | 
						|
                ^
 | 
						|
SyntaxError: invalid syntax
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
The parser repeats the offending line and displays a little `arrow'
 | 
						|
pointing at the earliest point in the line where the error was detected.
 | 
						|
The error is caused by (or at least detected at) the token
 | 
						|
{\em preceding}
 | 
						|
the arrow: in the example, the error is detected at the keyword
 | 
						|
{\tt print}, since a colon ({\tt :}) is missing before it.
 | 
						|
File name and line number are printed so you know where to look in case
 | 
						|
the input came from a script.
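
The file name, line number and arrow position are also available to a
program.  The following sketch assumes a current Python 3 interpreter,
where the built-in function {\tt compile()} raises {\tt SyntaxError} with
the attributes {\tt filename}, {\tt lineno}, {\tt offset} and
{\tt text}; the file name {\tt <demo>} is made up.

```python
source = "while 1 print('Hello world')\n"

report = None
try:
    compile(source, '<demo>', 'exec')
except SyntaxError as err:
    # Keep the location information the parser attached to the error.
    report = (err.filename, err.lineno, err.offset)
    print('File "%s", line %d' % (err.filename, err.lineno))
    if err.text is not None:
        print(err.text.rstrip())
    if err.offset is not None:
        print(' ' * (err.offset - 1) + '^')
```

The exact column reported may differ between interpreter versions.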
 | 
						|
 | 
						|
\section{Exceptions}
 | 
						|
 | 
						|
Even if a statement or expression is syntactically correct, it may
 | 
						|
cause an error when an attempt is made to execute it.
 | 
						|
Errors detected during execution are called {\em exceptions} and are
 | 
						|
not unconditionally fatal: you will soon learn how to handle them in
 | 
						|
Python programs.  Most exceptions are not handled by programs,
 | 
						|
however, and result in error messages as shown here:
 | 
						|
 | 
						|
\bcode\small\begin{verbatim}
 | 
						|
>>> 10 * (1/0)
 | 
						|
Traceback (innermost last):
 | 
						|
  File "<stdin>", line 1
 | 
						|
ZeroDivisionError: integer division or modulo
 | 
						|
>>> 4 + spam*3
 | 
						|
Traceback (innermost last):
 | 
						|
  File "<stdin>", line 1
 | 
						|
NameError: spam
 | 
						|
>>> '2' + 2
 | 
						|
Traceback (innermost last):
 | 
						|
  File "<stdin>", line 1
 | 
						|
TypeError: illegal argument type for built-in operation
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
The last line of the error message indicates what happened.
 | 
						|
Exceptions come in different types, and the type is printed as part of
 | 
						|
the message: the types in the example are
 | 
						|
{\tt ZeroDivisionError},
 | 
						|
{\tt NameError}
 | 
						|
and
 | 
						|
{\tt TypeError}.
 | 
						|
The string printed as the exception type is the name of the built-in
exception that occurred.  This is true for all built-in
 | 
						|
exceptions, but need not be true for user-defined exceptions (although
 | 
						|
it is a useful convention).
 | 
						|
Standard exception names are built-in identifiers (not reserved
 | 
						|
keywords).
 | 
						|
 | 
						|
The rest of the line is a detail whose interpretation depends on the
exception type.
 | 
						|
 | 
						|
The preceding part of the error message shows the context where the
exception happened, in the form of a stack backtrace.
In general the backtrace lists the source lines involved; however,
it will not display lines read from standard input.
 | 
						|
 | 
						|
The Python library reference manual lists the built-in exceptions and
 | 
						|
their meanings.
 | 
						|
 | 
						|
\section{Handling Exceptions}
 | 
						|
 | 
						|
It is possible to write programs that handle selected exceptions.
 | 
						|
Look at the following example, which prints a table of inverses of
 | 
						|
some floating point numbers:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> numbers = [0.3333, 2.5, 0, 10]
 | 
						|
>>> for x in numbers:
 | 
						|
...     print x,
 | 
						|
...     try:
 | 
						|
...         print 1.0 / x
 | 
						|
...     except ZeroDivisionError:
 | 
						|
...         print '*** has no inverse ***'
 | 
						|
... 
 | 
						|
0.3333 3.00030003
 | 
						|
2.5 0.4
 | 
						|
0 *** has no inverse ***
 | 
						|
10 0.1
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
The {\tt try} statement works as follows.
 | 
						|
\begin{itemize}
 | 
						|
\item
 | 
						|
First, the
 | 
						|
{\em try\ clause}
 | 
						|
(the statement(s) between the {\tt try} and {\tt except} keywords) is
 | 
						|
executed.
 | 
						|
\item
 | 
						|
If no exception occurs, the
 | 
						|
{\em except\ clause}
 | 
						|
is skipped and execution of the {\tt try} statement is finished.
 | 
						|
\item
 | 
						|
If an exception occurs during execution of the try clause,
the rest of the clause is skipped.  Then if
its type matches the exception named after the {\tt except} keyword,
the except clause is executed,
and then execution continues after the {\tt try} statement.
 | 
						|
\item
 | 
						|
If an exception occurs which does not match the exception named in the
 | 
						|
except clause, it is passed on to outer try statements; if no handler is
 | 
						|
found, it is an
 | 
						|
{\em unhandled\ exception}
 | 
						|
and execution stops with a message as shown above.
 | 
						|
\end{itemize}
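
The steps listed above can be traced with a small function that records
what happens.  This is a sketch for a current Python 3 interpreter; the
function names are made up for the illustration.

```python
def trace(action):
    """Run action() inside nested try statements and log what happens."""
    log = []
    try:                                   # outer try statement
        try:                               # inner try statement
            log.append('try clause starts')
            action()
            log.append('try clause finished')
        except ZeroDivisionError:
            log.append('inner handler ran')
    except KeyError:
        log.append('outer handler ran')
    return log


def works():
    return 1


def divides_by_zero():
    return 1 / 0


def missing_key():
    return {}['spam']


print(trace(works))            # no exception: the except clause is skipped
print(trace(divides_by_zero))  # matching exception: the inner handler runs
print(trace(missing_key))      # no match: passed on to the outer try
```

In the last two cases the entry {\tt 'try clause finished'} is absent,
because the rest of the try clause is skipped once an exception occurs.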
 | 
						|
A {\tt try} statement may have more than one except clause, to specify
 | 
						|
handlers for different exceptions.
 | 
						|
At most one handler will be executed.
 | 
						|
Handlers only handle exceptions that occur in the corresponding try
 | 
						|
clause, not in other handlers of the same {\tt try} statement.
 | 
						|
An except clause may name multiple exceptions as a parenthesized list,
 | 
						|
e.g.:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
... except (RuntimeError, TypeError, NameError):
 | 
						|
...     pass
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
The last except clause may omit the exception name(s), to serve as a
 | 
						|
wildcard.
 | 
						|
Use this with extreme caution, since it is easy to mask a real
 | 
						|
programming error in this way!
 | 
						|
 | 
						|
When an exception occurs, it may have an associated value, also known as
 | 
						|
the exception's
 | 
						|
{\em argument}.
 | 
						|
The presence and type of the argument depend on the exception type.
 | 
						|
For exception types which have an argument, the except clause may
 | 
						|
specify a variable after the exception name (or list) to receive the
 | 
						|
argument's value, as follows:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> try:
 | 
						|
...     spam()
 | 
						|
... except NameError, x:
 | 
						|
...     print 'name', x, 'undefined'
 | 
						|
... 
 | 
						|
name spam undefined
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
If an exception has an argument, it is printed as the last part
 | 
						|
(`detail') of the message for unhandled exceptions.
 | 
						|
 | 
						|
Exception handlers don't just handle exceptions if they occur
 | 
						|
immediately in the try clause, but also if they occur inside functions
 | 
						|
that are called (even indirectly) in the try clause.
 | 
						|
For example:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> def this_fails():
 | 
						|
...     x = 1/0
 | 
						|
... 
 | 
						|
>>> try:
 | 
						|
...     this_fails()
 | 
						|
... except ZeroDivisionError, detail:
 | 
						|
...     print 'Handling run-time error:', detail
 | 
						|
... 
 | 
						|
Handling run-time error: integer division or modulo
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
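
For comparison, here is a sketch of a handler that names several
exceptions and receives the argument, written for a current Python 3
interpreter.  There the variable is introduced with the keyword
{\tt as} instead of a comma, and it is only available inside the
handler, so the example copies what it needs.

```python
def this_fails():
    return 1 // 0          # integer division by zero


caught = None
try:
    this_fails()
except (NameError, ZeroDivisionError) as detail:
    # "detail" is unbound again after the handler, so keep a copy.
    caught = (type(detail).__name__, str(detail))

print('Handling run-time error:', caught[1])
print(caught[0])
```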
 | 
						|
 | 
						|
\section{Raising Exceptions}
 | 
						|
 | 
						|
The {\tt raise} statement allows the programmer to force a specified
 | 
						|
exception to occur.
 | 
						|
For example:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> raise NameError, 'HiThere'
 | 
						|
Traceback (innermost last):
 | 
						|
  File "<stdin>", line 1
 | 
						|
NameError: HiThere
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
The first argument to {\tt raise} names the exception to be raised.
 | 
						|
The optional second argument specifies the exception's argument.
 | 
						|
 | 
						|
\section{User-defined Exceptions}
 | 
						|
 | 
						|
Programs may name their own exceptions by assigning a string to a
 | 
						|
variable.
 | 
						|
For example:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> my_exc = 'my_exc'
 | 
						|
>>> try:
 | 
						|
...     raise my_exc, 2*2
 | 
						|
... except my_exc, val:
 | 
						|
...     print 'My exception occurred, value:', val
 | 
						|
... 
 | 
						|
My exception occurred, value: 4
 | 
						|
>>> raise my_exc, 1
 | 
						|
Traceback (innermost last):
 | 
						|
  File "<stdin>", line 1
 | 
						|
my_exc: 1
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
Many standard modules use this to report errors that may occur in
 | 
						|
functions they define.
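
Current Python 3 interpreters do not accept strings as exceptions.  The
usual replacement there is a class derived from {\tt Exception}; the
sketch below shows the same program under that assumption.  The class
name {\tt MyError} is made up, and this is not the mechanism described in
this section.

```python
class MyError(Exception):
    """A made-up exception type used only for this illustration."""


value = None
try:
    raise MyError(2 * 2)
except MyError as err:
    value = err.args[0]        # the argument given when raising
    print('My exception occurred, value:', value)
```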
 | 
						|
 | 
						|
\section{Defining Clean-up Actions}
 | 
						|
 | 
						|
The {\tt try} statement has another optional clause which is intended to
 | 
						|
define clean-up actions that must be executed under all circumstances.
 | 
						|
For example:
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> try:
 | 
						|
...     raise KeyboardInterrupt
 | 
						|
... finally:
 | 
						|
...     print 'Goodbye, world!'
 | 
						|
... 
 | 
						|
Goodbye, world!
 | 
						|
Traceback (innermost last):
 | 
						|
  File "<stdin>", line 2
 | 
						|
KeyboardInterrupt
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
A {\tt finally} clause is executed whether or not an exception has
 | 
						|
occurred in the {\tt try} clause.  When an exception has occurred, it
 | 
						|
is re-raised after the {\tt finally} clause is executed.  The
 | 
						|
{\tt finally} clause is also executed ``on the way out'' when the
 | 
						|
{\tt try} statement is left via a {\tt break} or {\tt return}
 | 
						|
statement.
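
The three ways of leaving a {\tt try} statement can be observed with a
short script.  It is a sketch for a current Python 3 interpreter, with
made-up function names; each {\tt try} statement has either a
{\tt finally} clause or an {\tt except} clause, never both.

```python
events = []


def leave_by_return():
    try:
        return 'result'
    finally:
        events.append('clean-up on return')


def leave_by_break():
    while 1:
        try:
            break
        finally:
            events.append('clean-up on break')


def leave_by_exception():
    try:
        raise ValueError('demo')
    finally:
        events.append('clean-up on exception')


result = leave_by_return()
leave_by_break()
try:
    leave_by_exception()
except ValueError:
    # The exception was re-raised after the finally clause had run.
    events.append('exception was re-raised')

print(result)
print(events)
```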
 | 
						|
 | 
						|
A {\tt try} statement must either have one or more {\tt except}
 | 
						|
clauses or one {\tt finally} clause, but not both.
 | 
						|
 | 
						|
 | 
						|
\chapter{Classes}
 | 
						|
 | 
						|
Python's class mechanism adds classes to the language with a minimum
 | 
						|
of new syntax and semantics.  It is a mixture of the class mechanisms
 | 
						|
found in \Cpp{} and Modula-3.  As is true for modules, classes in Python
 | 
						|
do not put an absolute barrier between definition and user, but rather
 | 
						|
rely on the politeness of the user not to ``break into the
 | 
						|
definition.''  The most important features of classes are retained
 | 
						|
with full power, however: the class inheritance mechanism allows
 | 
						|
multiple base classes, a derived class can override any methods of its
 | 
						|
base class(es), and a method can call the method of a base class with the
 | 
						|
same name.  Objects can contain an arbitrary amount of private data.
 | 
						|
 | 
						|
In \Cpp{} terminology, all class members (including the data members) are
 | 
						|
{\em public}, and all member functions are {\em virtual}.  There are
 | 
						|
no special constructors or destructors.  As in Modula-3, there are no
 | 
						|
shorthands for referencing the object's members from its methods: the
 | 
						|
method function is declared with an explicit first argument
 | 
						|
representing the object, which is provided implicitly by the call.  As
 | 
						|
in Smalltalk, classes themselves are objects, albeit in the wider
 | 
						|
sense of the word: in Python, all data types are objects.  This
 | 
						|
provides semantics for importing and renaming.  But, just like in \Cpp{}
 | 
						|
or Modula-3, built-in types cannot be used as base classes for
 | 
						|
extension by the user.  Also, like in \Cpp{} but unlike in Modula-3, most
 | 
						|
built-in operators with special syntax (arithmetic operators,
 | 
						|
subscripting etc.) can be redefined for class instances.
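
As a preview of the features just listed, here is a small sketch written
for a current Python 3 interpreter.  The class names are made up, and the
special method name {\tt __add__} used to redefine an operator is the
spelling of current interpreters; the class syntax itself is explained
later in this chapter.

```python
class Account:
    def describe(self):            # "self" is the explicit first argument
        return 'account'

    def report(self):
        # Calls whichever describe() the actual object provides.
        return 'This is a ' + self.describe()


class Savings(Account):
    def describe(self):            # overrides Account.describe
        # Call the base class method of the same name explicitly.
        return 'savings ' + Account.describe(self)


class Money:
    def __add__(self, other):      # redefines "+" for Money instances
        total = Money()
        total.amount = self.amount + other.amount
        return total


print(Savings().report())

a = Money()
a.amount = 2                       # data members are public
b = Money()
b.amount = 3
print((a + b).amount)
```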
 | 
						|
 | 
						|
 | 
						|
\section{A word about terminology}
 | 
						|
 | 
						|
Lacking universally accepted terminology to talk about classes, I'll
 | 
						|
make occasional use of Smalltalk and \Cpp{} terms.  (I'd use Modula-3
 | 
						|
terms, since its object-oriented semantics are closer to those of
 | 
						|
Python than those of \Cpp{} are, but I expect that few readers have heard of it...)
 | 
						|
 | 
						|
I also have to warn you that there's a terminological pitfall for
 | 
						|
object-oriented readers: the word ``object'' in Python does not
 | 
						|
necessarily mean a class instance.  Like \Cpp{} and Modula-3, and unlike
 | 
						|
Smalltalk, not all types in Python are classes: the basic built-in
 | 
						|
types like integers and lists aren't, and even somewhat more exotic
 | 
						|
types like files aren't.  However, {\em all} Python types share a little
 | 
						|
bit of common semantics that is best described by using the word
 | 
						|
object.
 | 
						|
 | 
						|
Objects have individuality, and multiple names (in multiple scopes)
 | 
						|
can be bound to the same object.  This is known as aliasing in other
 | 
						|
languages.  This is usually not appreciated on a first glance at
 | 
						|
Python, and can be safely ignored when dealing with immutable basic
 | 
						|
types (numbers, strings, tuples).  However, aliasing has an
 | 
						|
(intended!) effect on the semantics of Python code involving mutable
 | 
						|
objects such as lists, dictionaries, and most types representing
 | 
						|
entities outside the program (files, windows, etc.).  This is usually
 | 
						|
used to the benefit of the program, since aliases behave like pointers
 | 
						|
in some respects.  For example, passing an object is cheap since only
 | 
						|
a pointer is passed by the implementation; and if a function modifies
 | 
						|
an object passed as an argument, the caller will see the change --- this
 | 
						|
obviates the need for two different argument passing mechanisms as in
 | 
						|
Pascal.
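
A short sketch of aliasing, for a current Python 3 interpreter; the
function name {\tt add_item} is made up.

```python
def add_item(items, item):
    items.append(item)       # modifies the object the caller passed in


a = [1, 2, 3]
b = a                        # a second name for the same list object
add_item(b, 4)
print(a, a is b)             # the change is visible through both names

t = (1, 2, 3)
u = t
u = u + (4,)                 # builds a new tuple; t is unaffected
print(t, u)
```

With the immutable tuple there is no operation that changes the object in
place, which is why aliasing can be ignored there.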
 | 
						|
 | 
						|
 | 
						|
\section{Python scopes and name spaces}
 | 
						|
 | 
						|
Before introducing classes, I first have to tell you something about
 | 
						|
Python's scope rules.  Class definitions play some neat tricks with
 | 
						|
name spaces, and you need to know how scopes and name spaces work to
 | 
						|
fully understand what's going on.  Incidentally, knowledge about this
 | 
						|
subject is useful for any advanced Python programmer.
 | 
						|
 | 
						|
Let's begin with some definitions.
 | 
						|
 | 
						|
A {\em name space} is a mapping from names to objects.  Most name
 | 
						|
spaces are currently implemented as Python dictionaries, but that's
 | 
						|
normally not noticeable in any way (except for performance), and it
 | 
						|
may change in the future.  Examples of name spaces are: the set of
 | 
						|
built-in names (functions such as \verb\abs()\, and built-in exception
 | 
						|
names); the global names in a module; and the local names in a
 | 
						|
function invocation.  In a sense the set of attributes of an object
 | 
						|
also forms a name space.  The important thing to know about name
 | 
						|
spaces is that there is absolutely no relation between names in
 | 
						|
different name spaces; for instance, two different modules may both
 | 
						|
define a function ``maximize'' without confusion --- users of the
 | 
						|
modules must prefix it with the module name.
 | 
						|
 | 
						|
By the way, I use the word {\em attribute} for any name following a
 | 
						|
dot --- for example, in the expression \verb\z.real\, \verb\real\ is
 | 
						|
an attribute of the object \verb\z\.  Strictly speaking, references to
 | 
						|
names in modules are attribute references: in the expression
 | 
						|
\verb\modname.funcname\, \verb\modname\ is a module object and
 | 
						|
\verb\funcname\ is an attribute of it.  In this case there happens to
 | 
						|
be a straightforward mapping between the module's attributes and the
 | 
						|
global names defined in the module: they share the same name space!%
 | 
						|
\footnote{
 | 
						|
        Except for one thing.  Module objects have a secret read-only
 | 
						|
        attribute called {\tt __dict__} which returns the dictionary
 | 
						|
        used to implement the module's name space; the name
 | 
						|
        {\tt __dict__} is an attribute but not a global name.
 | 
						|
        Obviously, using this violates the abstraction of name space
 | 
						|
        implementation, and should be restricted to things like
 | 
						|
        post-mortem debuggers...
 | 
						|
}
 | 
						|
 | 
						|
Attributes may be read-only or writable.  In the latter case,
 | 
						|
assignment to attributes is possible.  Module attributes are writable:
 | 
						|
you can write \verb\modname.the_answer = 42\.  Writable attributes may
 | 
						|
also be deleted with the \verb\del\ statement, e.g.
 | 
						|
\verb\del modname.the_answer\.
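
The following sketch shows the whole life cycle of a writable module
attribute.  It assumes that the standard module \verb\math\ may stand
in for \verb\modname\; the attribute name \verb\the_answer\ is made up
for the occasion and is removed again at the end:

```python
import math   # any module would do; math merely stands in for modname

math.the_answer = 42            # creates a new attribute in the module
answer = math.the_answer        # reads it back: 42
del math.the_answer             # removes the attribute again
still_there = hasattr(math, 'the_answer')   # false once it is deleted
```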
 | 
						|
 | 
						|
Name spaces are created at different moments and have different
 | 
						|
lifetimes.  The name space containing the built-in names is created
 | 
						|
when the Python interpreter starts up, and is never deleted.  The
 | 
						|
global name space for a module is created when the module definition
 | 
						|
is read in; normally, module name spaces also last until the
 | 
						|
interpreter quits.  The statements executed by the top-level
 | 
						|
invocation of the interpreter, either read from a script file or
 | 
						|
interactively, are considered part of a module called \verb\__main__\,
 | 
						|
so they have their own global name space.  (The built-in names
 | 
						|
actually also live in a module; this is called \verb\__builtin__\.)
 | 
						|
 | 
						|
The local name space for a function is created when the function is
 | 
						|
called, and deleted when the function returns or raises an exception
 | 
						|
that is not handled within the function.  (Actually, ``forgetting'' would
be a better way to describe what happens.)  Of course,
 | 
						|
recursive invocations each have their own local name space.
 | 
						|
 | 
						|
A {\em scope} is a textual region of a Python program where a name space
 | 
						|
is directly accessible.  ``Directly accessible'' here means that an
 | 
						|
unqualified reference to a name attempts to find the name in the name
 | 
						|
space.
 | 
						|
 | 
						|
Although scopes are determined statically, they are used dynamically.
 | 
						|
At any time during execution, exactly three nested scopes are in use
 | 
						|
(i.e., exactly three name spaces are directly accessible): the
 | 
						|
innermost scope, which is searched first, contains the local names,
 | 
						|
the middle scope, searched next, contains the current module's global
 | 
						|
names, and the outermost scope (searched last) is the name space
 | 
						|
containing built-in names.
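
The search order can be seen in a short sketch.  The names
\verb\limit\, \verb\clamp\ and \verb\clamp_small\ are invented for this
illustration; only \verb\min\ is a real built-in function:

```python
limit = 10               # a global name in this module

def clamp(value):
    # 'value' is found in the local scope, 'limit' in the global
    # scope, and 'min' in the built-in scope, searched in that order.
    return min(value, limit)

def clamp_small(value):
    limit = 3            # a local name that hides the global 'limit'
    return min(value, limit)

a = clamp(25)            # 10
b = clamp_small(25)      # 3
```

Note that the assignment inside \verb\clamp_small\ leaves the global
\verb\limit\ unchanged.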
 | 
						|
 | 
						|
Usually, the local scope references the local names of the (textually)
 | 
						|
current function.  Outside of functions, the local scope references
 | 
						|
the same name space as the global scope: the module's name space.
 | 
						|
Class definitions place yet another name space in the local scope.
 | 
						|
 | 
						|
It is important to realize that scopes are determined textually: the
 | 
						|
global scope of a function defined in a module is that module's name
 | 
						|
space, no matter from where or by what alias the function is called.
 | 
						|
On the other hand, the actual search for names is done dynamically, at
 | 
						|
run time --- however, the language definition is evolving towards
 | 
						|
static name resolution, at ``compile'' time, so don't rely on dynamic
 | 
						|
name resolution!  (In fact, local variables are already determined
 | 
						|
statically.)
 | 
						|
 | 
						|
A special quirk of Python is that assignments always go into the
 | 
						|
innermost scope.  Assignments do not copy data --- they just
 | 
						|
bind names to objects.  The same is true for deletions: the statement
 | 
						|
\verb\del x\ removes the binding of x from the name space referenced by the
 | 
						|
local scope.  In fact, all operations that introduce new names use the
 | 
						|
local scope: in particular, import statements and function definitions
 | 
						|
bind the module or function name in the local scope.  (The
 | 
						|
\verb\global\ statement can be used to indicate that particular
 | 
						|
variables live in the global scope.)
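
A minimal sketch of the difference, using a made-up counter variable
\verb\count\ and two hypothetical functions:

```python
count = 0

def bump_local():
    count = 1            # binds a new local name; the global is untouched

def bump_global():
    global count         # 'count' now refers to the module's global name
    count = count + 1

bump_local()
after_local = count      # still 0
bump_global()
after_global = count     # now 1
```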
 | 
						|
 | 
						|
 | 
						|
\section{A first look at classes}
 | 
						|
 | 
						|
Classes introduce a little bit of new syntax, three new object types,
 | 
						|
and some new semantics.
 | 
						|
 | 
						|
 | 
						|
\subsection{Class definition syntax}
 | 
						|
 | 
						|
The simplest form of class definition looks like this:
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
        class ClassName:
 | 
						|
                <statement-1>
 | 
						|
                .
 | 
						|
                .
 | 
						|
                .
 | 
						|
                <statement-N>
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
Class definitions, like function definitions (\verb\def\ statements)
 | 
						|
must be executed before they have any effect.  (You could conceivably
 | 
						|
place a class definition in a branch of an \verb\if\ statement, or
 | 
						|
inside a function.)
 | 
						|
 | 
						|
In practice, the statements inside a class definition will usually be
 | 
						|
function definitions, but other statements are allowed, and sometimes
 | 
						|
useful --- we'll come back to this later.  The function definitions
 | 
						|
inside a class normally have a peculiar form of argument list,
 | 
						|
dictated by the calling conventions for methods --- again, this is
 | 
						|
explained later.
 | 
						|
 | 
						|
When a class definition is entered, a new name space is created, and
 | 
						|
used as the local scope --- thus, all assignments to local variables
 | 
						|
go into this new name space.  In particular, function definitions bind
 | 
						|
the name of the new function here.
 | 
						|
 | 
						|
When a class definition is left normally (via the end), a {\em class
 | 
						|
object} is created.  This is basically a wrapper around the contents
 | 
						|
of the name space created by the class definition; we'll learn more
 | 
						|
about class objects in the next section.  The original local scope
 | 
						|
(the one in effect just before the class definition was entered) is
reinstated, and the class object is bound here to the class name given in
 | 
						|
the class definition header (ClassName in the example).
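
To make this concrete, here is a sketch in which a class definition
sits in a branch of an \verb\if\ statement.  The switch
\verb\debugging\ and the class \verb\Settings\ are hypothetical; the
point is only that the class statement which is actually executed binds
the class name:

```python
debugging = 1            # hypothetical switch, for illustration only

if debugging:
    class Settings:
        # These assignments go into the new class name space.
        level = 'verbose'
        retries = 1
else:
    class Settings:
        level = 'quiet'
        retries = 5

# Only the first class statement was executed, so the name Settings
# is bound to the class object it produced.
chosen = Settings.level  # 'verbose'
```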
 | 
						|
 | 
						|
 | 
						|
\subsection{Class objects}
 | 
						|
 | 
						|
Class objects support two kinds of operations: attribute references
 | 
						|
and instantiation.
 | 
						|
 | 
						|
{\em Attribute references} use the standard syntax used for all
 | 
						|
attribute references in Python: \verb\obj.name\.  Valid attribute
 | 
						|
names are all the names that were in the class's name space when the
 | 
						|
class object was created.  So, if the class definition looked like
 | 
						|
this:
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
        class MyClass:
 | 
						|
                i = 12345
 | 
						|
                def f(x):
 | 
						|
                        return 'hello world'
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
then \verb\MyClass.i\ and \verb\MyClass.f\ are valid attribute
 | 
						|
references, returning an integer and a function object, respectively.
 | 
						|
Class attributes can also be assigned to, so you can change the
 | 
						|
value of \verb\MyClass.i\ by assignment.
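
For instance, repeating the class from above so that the sketch is
self-contained (the variable names \verb\before\, \verb\after\ and
\verb\func\ are just for illustration):

```python
class MyClass:
    i = 12345
    def f(x):
        return 'hello world'

before = MyClass.i       # 12345
MyClass.i = 54321        # class attributes can be rebound by assignment
after = MyClass.i        # 54321
func = MyClass.f         # the attribute defined by the def statement
```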
 | 
						|
 | 
						|
Class {\em instantiation} uses function notation.  Just pretend that
 | 
						|
the class object is a parameterless function that returns a new
 | 
						|
instance of the class.  For example (assuming the above class):
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
        x = MyClass()
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
creates a new {\em instance} of the class and assigns this object to
 | 
						|
the local variable \verb\x\.
 | 
						|
 | 
						|
 | 
						|
\subsection{Instance objects}
 | 
						|
 | 
						|
Now what can we do with instance objects?  The only operations
 | 
						|
understood by instance objects are attribute references.  There are
 | 
						|
two kinds of valid attribute names.
 | 
						|
 | 
						|
The first I'll call {\em data attributes}.  These correspond to
 | 
						|
``instance variables'' in Smalltalk, and to ``data members'' in \Cpp{}.
 | 
						|
Data attributes need not be declared; like local variables, they
 | 
						|
spring into existence when they are first assigned to.  For example,
 | 
						|
if \verb\x\ is the instance of \verb\MyClass\ created above, the
 | 
						|
following piece of code will print the value 16, without leaving a
 | 
						|
trace:
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
        x.counter = 1
 | 
						|
        while x.counter < 10:
 | 
						|
                x.counter = x.counter * 2
 | 
						|
        print x.counter
 | 
						|
        del x.counter
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
The second kind of attribute reference understood by instance objects
is a {\em method}.  A method is a function that ``belongs to'' an
 | 
						|
object.  (In Python, the term method is not unique to class instances:
 | 
						|
other object types can have methods as well, e.g., list objects have
 | 
						|
methods called append, insert, remove, sort, and so on.  However,
 | 
						|
below, we'll use the term method exclusively to mean methods of class
 | 
						|
instance objects, unless explicitly stated otherwise.)
 | 
						|
 | 
						|
Valid method names of an instance object depend on its class.  By
 | 
						|
definition, all attributes of a class that are (user-defined) function
 | 
						|
objects define corresponding methods of its instances.  So in our
 | 
						|
example, \verb\x.f\ is a valid method reference, since
 | 
						|
\verb\MyClass.f\ is a function, but \verb\x.i\ is not, since
 | 
						|
\verb\MyClass.i\ is not.  But \verb\x.f\ is not the
 | 
						|
same thing as \verb\MyClass.f\ --- it is a {\em method object}, not a
 | 
						|
function object.
 | 
						|
 | 
						|
 | 
						|
\subsection{Method objects}
 | 
						|
 | 
						|
Usually, a method is called immediately, e.g.:
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
        x.f()
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
In our example, this will return the string \verb\'hello world'\.
 | 
						|
However, it is not necessary to call a method right away: \verb\x.f\
 | 
						|
is a method object, and can be stored away and called at a later
 | 
						|
moment, for example:
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
        xf = x.f
 | 
						|
        while 1:
 | 
						|
                print xf()
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
will continue to print \verb\hello world\ until the end of time.
 | 
						|
 | 
						|
What exactly happens when a method is called?  You may have noticed
 | 
						|
that \verb\x.f()\ was called without an argument above, even though
 | 
						|
the function definition for \verb\f\ specified an argument.  What
 | 
						|
happened to the argument?  Surely Python raises an exception when a
 | 
						|
function that requires an argument is called without any --- even if
 | 
						|
the argument isn't actually used...
 | 
						|
 | 
						|
Actually, you may have guessed the answer: the special thing about
 | 
						|
methods is that the object is passed as the first argument of the
 | 
						|
function.  In our example, the call \verb\x.f()\ is exactly equivalent
 | 
						|
to \verb\MyClass.f(x)\.  In general, calling a method with a list of
 | 
						|
{\em n} arguments is equivalent to calling the corresponding function
 | 
						|
with an argument list that is created by inserting the method's object
 | 
						|
before the first argument.
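
The equivalence can be checked directly.  This sketch repeats the
example class so that it stands on its own; the two result variables
are made up for the comparison:

```python
class MyClass:
    i = 12345
    def f(x):
        return 'hello world'

x = MyClass()
via_method = x.f()               # the instance is passed implicitly
via_function = MyClass.f(x)      # the instance is passed explicitly
```

Both calls return the same string.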
 | 
						|
 | 
						|
If you still don't understand how methods work, a look at the
 | 
						|
implementation can perhaps clarify matters.  When an instance
 | 
						|
attribute is referenced that isn't a data attribute, its class is
 | 
						|
searched.  If the name denotes a valid class attribute that is a
 | 
						|
function object, a method object is created by packing (pointers to)
 | 
						|
the instance object and the function object just found together in an
 | 
						|
abstract object: this is the method object.  When the method object is
 | 
						|
called with an argument list, it is unpacked again, a new argument
 | 
						|
list is constructed from the instance object and the original argument
 | 
						|
list, and the function object is called with this new argument list.
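
The packing and unpacking can be imitated in Python itself.  The class
\verb\MethodSketch\ below is purely hypothetical and is not how the
interpreter is written; it only mimics the steps just described, with
an ordinary method named \verb\call\ standing in for the call
operation:

```python
class MethodSketch:
    # Hypothetical stand-in for a method object: it only remembers
    # an instance and a function found in that instance's class.
    def call(self, *args):
        # Build the new argument list: the instance first, then the
        # original arguments, and call the function with it.
        return self.function(self.instance, *args)

class Greeter:
    def greet(self, name):
        return 'hello ' + name

g = Greeter()
m = MethodSketch()
m.instance = g                   # pack the instance object...
m.function = Greeter.greet       # ...and the function object together
result = m.call('world')         # same value as g.greet('world')
```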
 | 
						|
 | 
						|
 | 
						|
\section{Random remarks}
 | 
						|
 | 
						|
 | 
						|
[These should perhaps be placed more carefully...]
 | 
						|
 | 
						|
 | 
						|
Data attributes override method attributes with the same name; to
 | 
						|
avoid accidental name conflicts, which may cause hard-to-find bugs in
 | 
						|
large programs, it is wise to use some kind of convention that
 | 
						|
minimizes the chance of conflicts, e.g., capitalize method names,
 | 
						|
prefix data attribute names with a small unique string (perhaps just
 | 
						|
an underscore), or use verbs for methods and nouns for data attributes.
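
A sketch of how such a conflict arises, using an invented class
\verb\Account\ whose method name is later reused for a data attribute:

```python
class Account:
    def balance(self):
        return 0

acct = Account()
first = acct.balance()       # the method is found in the class: 0
acct.balance = 100           # a data attribute with the same name...
second = acct.balance        # ...now hides the method: 100
# Calling acct.balance() at this point would fail, since the
# integer 100 is not a function.
```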
 | 
						|
 | 
						|
 | 
						|
Data attributes may be referenced by methods as well as by ordinary
 | 
						|
users (``clients'') of an object.  In other words, classes are not
 | 
						|
usable to implement pure abstract data types.  In fact, nothing in
 | 
						|
Python makes it possible to enforce data hiding --- it is all based
 | 
						|
upon convention.  (On the other hand, the Python implementation,
 | 
						|
written in C, can completely hide implementation details and control
 | 
						|
access to an object if necessary; this can be used by extensions to
 | 
						|
Python written in C.)
 | 
						|
 | 
						|
 | 
						|
Clients should use data attributes with care --- clients may mess up
 | 
						|
invariants maintained by the methods by stamping on their data
 | 
						|
attributes.  Note that clients may add data attributes of their own to
 | 
						|
an instance object without affecting the validity of the methods, as
 | 
						|
long as name conflicts are avoided --- again, a naming convention can
 | 
						|
save a lot of headaches here.
 | 
						|
 | 
						|
 | 
						|
There is no shorthand for referencing data attributes (or other
 | 
						|
methods!) from within methods.  I find that this actually increases
 | 
						|
the readability of methods: there is no chance of confusing local
 | 
						|
variables and instance variables when glancing through a method.
 | 
						|
 | 
						|
 | 
						|
Conventionally, the first argument of methods is often called
 | 
						|
\verb\self\.  This is nothing more than a convention: the name
 | 
						|
\verb\self\ has absolutely no special meaning to Python.  (Note,
 | 
						|
however, that by not following the convention your code may be less
 | 
						|
readable by other Python programmers, and it is also conceivable that
 | 
						|
a {\em class browser} program be written which relies upon such a
 | 
						|
convention.)
 | 
						|
 | 
						|
 | 
						|
Any function object that is a class attribute defines a method for
 | 
						|
instances of that class.  It is not necessary that the function
 | 
						|
definition is textually enclosed in the class definition: assigning a
 | 
						|
function object to a local variable in the class is also ok.  For
 | 
						|
example:
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
        # Function defined outside the class
 | 
						|
        def f1(self, x, y):
 | 
						|
                return min(x, x+y)
 | 
						|
 | 
						|
        class C:
 | 
						|
                f = f1
 | 
						|
                def g(self):
 | 
						|
                        return 'hello world'
 | 
						|
                h = g
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
Now \verb\f\, \verb\g\ and \verb\h\ are all attributes of class
 | 
						|
\verb\C\ that refer to function objects, and consequently they are all
 | 
						|
methods of instances of \verb\C\ --- \verb\h\ being exactly equivalent
 | 
						|
to \verb\g\.  Note that this practice usually only serves to confuse
 | 
						|
the reader of a program.
 | 
						|
 | 
						|
 | 
						|
Methods may call other methods by using method attributes of the
 | 
						|
\verb\self\ argument, e.g.:
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
        class Bag:
 | 
						|
                def empty(self):
 | 
						|
                        self.data = []
 | 
						|
                def add(self, x):
 | 
						|
                        self.data.append(x)
 | 
						|
                def addtwice(self, x):
 | 
						|
                        self.add(x)
 | 
						|
                        self.add(x)
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
 | 
						|
The instantiation operation (``calling'' a class object) creates an
 | 
						|
empty object.  Many classes like to create objects in a known initial
 | 
						|
state.  Therefore a class may define a special method named
 | 
						|
\verb\__init__\, like this:
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
                def __init__(self):
 | 
						|
                        self.empty()
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
When a class defines an \verb\__init__\ method, class instantiation
 | 
						|
automatically invokes \verb\__init__\ for the newly-created class
 | 
						|
instance.  So in the \verb\Bag\ example, a new and initialized instance
 | 
						|
can be obtained by:
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
        x = Bag()
 | 
						|
\end{verbatim}
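
Putting the pieces together, the complete class might look as follows.
This merely assembles the fragments shown above; the variable
\verb\contents\ is added for illustration:

```python
class Bag:
    def __init__(self):
        self.empty()
    def empty(self):
        self.data = []
    def add(self, x):
        self.data.append(x)
    def addtwice(self, x):
        self.add(x)
        self.add(x)

x = Bag()                # __init__ runs and calls empty()
x.add('a')
x.addtwice('b')
contents = x.data        # ['a', 'b', 'b']
```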
 | 
						|
 | 
						|
Of course, the \verb\__init__\ method may have arguments for greater
 | 
						|
flexibility.  In that case, arguments given to the class instantiation
 | 
						|
operator are passed on to \verb\__init__\.  For example,
 | 
						|
 | 
						|
\bcode\begin{verbatim}
 | 
						|
>>> class Complex:
 | 
						|
...     def __init__(self, realpart, imagpart):
 | 
						|
...         self.r = realpart
 | 
						|
...         self.i = imagpart
 | 
						|
... 
 | 
						|
>>> x = Complex(3.0,-4.5)
 | 
						|
>>> x.r, x.i
 | 
						|
(3.0, -4.5)
 | 
						|
>>> 
 | 
						|
\end{verbatim}\ecode
 | 
						|
%
 | 
						|
Methods may reference global names in the same way as ordinary
 | 
						|
functions.  The global scope associated with a method is the module
 | 
						|
containing the class definition.  (The class itself is never used as a
 | 
						|
global scope!)  While one rarely encounters a good reason for using
 | 
						|
global data in a method, there are many legitimate uses of the global
 | 
						|
scope: for one thing, functions and modules imported into the global
 | 
						|
scope can be used by methods, as well as functions and classes defined
 | 
						|
in it.  Usually, the class containing the method is itself defined in
 | 
						|
this global scope, and in the next section we'll find some good
 | 
						|
reasons why a method would want to reference its own class!
 | 
						|
 | 
						|
 | 
						|
\section{Inheritance}
 | 
						|
 | 
						|
Of course, a language feature would not be worthy of the name ``class''
 | 
						|
without supporting inheritance.  The syntax for a derived class
 | 
						|
definition looks as follows:
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
        class DerivedClassName(BaseClassName):
 | 
						|
                <statement-1>
 | 
						|
                .
 | 
						|
                .
 | 
						|
                .
 | 
						|
                <statement-N>
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
The name \verb\BaseClassName\ must be defined in a scope containing
 | 
						|
the derived class definition.  Instead of a base class name, an
 | 
						|
expression is also allowed.  This is useful when the base class is
 | 
						|
defined in another module, e.g.,
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
        class DerivedClassName(modname.BaseClassName):
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
Execution of a derived class definition proceeds the same as for a
 | 
						|
base class.  When the class object is constructed, the base class is
 | 
						|
remembered.  This is used for resolving attribute references: if a
 | 
						|
requested attribute is not found in the class, it is searched in the
 | 
						|
base class.  This rule is applied recursively if the base class itself
 | 
						|
is derived from some other class.
 | 
						|
 | 
						|
There's nothing special about instantiation of derived classes:
 | 
						|
\verb\DerivedClassName()\ creates a new instance of the class.  Method
 | 
						|
references are resolved as follows: the corresponding class attribute
 | 
						|
is searched, following the chain of base classes if necessary,
 | 
						|
and the method reference is valid if this yields a function object.
 | 
						|
 | 
						|
Derived classes may override methods of their base classes.  Because
 | 
						|
methods have no special privileges when calling other methods of the
 | 
						|
same object, a method of a base class that calls another method
 | 
						|
defined in the same base class, may in fact end up calling a method of
 | 
						|
a derived class that overrides it.  (For \Cpp{} programmers: all methods
 | 
						|
in Python are ``virtual functions''.)
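
Here is a small sketch of this effect, with two invented classes
\verb\Shape\ and \verb\Circle\:

```python
class Shape:
    def name(self):
        return 'shape'
    def describe(self):
        # name() is looked up through self, so an override in a
        # derived class is the one that gets called.
        return 'I am a ' + self.name()

class Circle(Shape):
    def name(self):
        return 'circle'

base_text = Shape().describe()       # 'I am a shape'
derived_text = Circle().describe()   # 'I am a circle'
```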
 | 
						|
 | 
						|
An overriding method in a derived class may in fact want to extend
 | 
						|
rather than simply replace the base class method of the same name.
 | 
						|
There is a simple way to call the base class method directly: just
 | 
						|
call \verb\BaseClassName.methodname(self, arguments)\.  This is
 | 
						|
occasionally useful to clients as well.  (Note that this only works if
 | 
						|
the base class is defined or imported directly in the global scope.)
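
As a sketch, assuming two made-up classes \verb\Counter\ and
\verb\NoisyCounter\ defined in the same module, an extending method
could be written like this:

```python
class Counter:
    def reset(self):
        self.count = 0

class NoisyCounter(Counter):
    def reset(self):
        Counter.reset(self)      # call the base class method directly
        self.was_reset = 1       # then do the derived class's extra work

c = NoisyCounter()
c.reset()
state = (c.count, c.was_reset)   # (0, 1)
```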
 | 
						|
 | 
						|
 | 
						|
\subsection{Multiple inheritance}
 | 
						|
 | 
						|
Python supports a limited form of multiple inheritance as well.  A
 | 
						|
class definition with multiple base classes looks as follows:
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
        class DerivedClassName(Base1, Base2, Base3):
 | 
						|
                <statement-1>
 | 
						|
                .
 | 
						|
                .
 | 
						|
                .
 | 
						|
                <statement-N>
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
The only rule necessary to explain the semantics is the resolution
 | 
						|
rule used for class attribute references.  This is depth-first,
 | 
						|
left-to-right.  Thus, if an attribute is not found in
 | 
						|
\verb\DerivedClassName\, it is searched in \verb\Base1\, then
 | 
						|
(recursively) in the base classes of \verb\Base1\, and only if it is
 | 
						|
not found there, it is searched in \verb\Base2\, and so on.
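
The following sketch illustrates the rule with invented class names.
It deliberately avoids the case where two base classes share a common
ancestor, since the details of that case may differ between versions
of Python:

```python
class Root:
    color = 'red'        # defined only in the base class of Base1

class Base1(Root):
    pass

class Base2:
    color = 'blue'

class Derived(Base1, Base2):
    pass

found = Derived.color    # 'red': Root is searched before Base2
```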
 | 
						|
 | 
						|
(To some people breadth first---searching \verb\Base2\ and
 | 
						|
\verb\Base3\ before the base classes of \verb\Base1\---looks more
 | 
						|
natural.  However, this would require you to know whether a particular
 | 
						|
attribute of \verb\Base1\ is actually defined in \verb\Base1\ or in
 | 
						|
one of its base classes before you can figure out the consequences of
 | 
						|
a name conflict with an attribute of \verb\Base2\.  The depth-first
 | 
						|
rule makes no difference between direct and inherited attributes of
 | 
						|
\verb\Base1\.)
 | 
						|
 | 
						|
It is clear that indiscriminate use of multiple inheritance is a
 | 
						|
maintenance nightmare, given the reliance in Python on conventions to
 | 
						|
avoid accidental name conflicts.  A well-known problem with multiple
 | 
						|
inheritance is a class derived from two classes that happen to have a
 | 
						|
common base class.  While it is easy enough to figure out what happens
 | 
						|
in this case (the instance will have a single copy of ``instance
 | 
						|
variables'' or data attributes used by the common base class), it is
 | 
						|
not clear that these semantics are in any way useful.
 | 
						|
 | 
						|
 | 
						|
\section{Odds and ends}
 | 
						|
 | 
						|
Sometimes it is useful to have a data type similar to the Pascal
 | 
						|
``record'' or C ``struct'', bundling together a couple of named data
 | 
						|
items.  An empty class definition will do nicely, e.g.:
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
        class Employee:
 | 
						|
                pass
 | 
						|
 | 
						|
        john = Employee() # Create an empty employee record
 | 
						|
 | 
						|
        # Fill the fields of the record
 | 
						|
        john.name = 'John Doe'
 | 
						|
        john.dept = 'computer lab'
 | 
						|
        john.salary = 1000
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
 | 
						|
A piece of Python code that expects a particular abstract data type
 | 
						|
can often be passed a class that emulates the methods of that data
 | 
						|
type instead.  For instance, if you have a function that formats some
 | 
						|
data from a file object, you can define a class with methods
 | 
						|
\verb\read()\ and \verb\readline()\ that gets the data from a string
 | 
						|
buffer instead, and pass it as an argument.  (Unfortunately, this
 | 
						|
technique has its limitations: a class can't define operations that
 | 
						|
are accessed by special syntax such as sequence subscripting or
 | 
						|
arithmetic operators, and assigning such a ``pseudo-file'' to
 | 
						|
\verb\sys.stdin\ will not cause the interpreter to read further input
 | 
						|
from it.)
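
A minimal sketch of such a pseudo-file follows.  The class
\verb\StringReader\ and the function \verb\count_lines\ are
hypothetical names; the only assumption made about real file objects
is that \verb\readline()\ returns an empty string at the end of the
data:

```python
class StringReader:
    # Hypothetical pseudo-file that serves its data from a string.
    def __init__(self, text):
        self.text = text
        self.pos = 0
    def read(self):
        # Return everything that has not been read yet.
        rest = self.text[self.pos:]
        self.pos = len(self.text)
        return rest
    def readline(self):
        # Return the next line including its newline, or '' at the end.
        end = self.text.find('\n', self.pos)
        if end < 0:
            end = len(self.text)     # the last line has no newline
        else:
            end = end + 1            # include the newline itself
        line = self.text[self.pos:end]
        self.pos = end
        return line

def count_lines(f):
    # Works with any object that has a file-like readline() method.
    n = 0
    while f.readline():
        n = n + 1
    return n

total = count_lines(StringReader('one\ntwo\nthree\n'))   # 3
```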
 | 
						|
 | 
						|
 | 
						|
Instance method objects have attributes, too: \verb\m.im_self\ is the
 | 
						|
instance object to which the method is bound, and \verb\m.im_func\ is the
 | 
						|
function object corresponding to the method.
 | 
						|
 | 
						|
 | 
						|
\chapter{Recent Additions}
 | 
						|
 | 
						|
Python is an evolving language.  Since this tutorial was last
 | 
						|
thoroughly revised, several new features have been added to the
 | 
						|
language.  While ideally I should revise the tutorial to incorporate
 | 
						|
them in the mainline of the text, lack of time currently requires me
 | 
						|
to take a more modest approach.  In this chapter I will briefly list the
 | 
						|
most important improvements to the language and how you can use them
 | 
						|
to your benefit.
 | 
						|
 | 
						|
\section{The Last Printed Expression}
 | 
						|
 | 
						|
In interactive mode, the last printed expression is assigned to the
 | 
						|
variable \code\_.  This means that when you are using Python as a
 | 
						|
desk calculator, it is somewhat easier to continue calculations, for
 | 
						|
example:
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
        >>> tax = 17.5 / 100
 | 
						|
        >>> price = 3.50
 | 
						|
        >>> price * tax
 | 
						|
        0.6125
 | 
						|
        >>> price + _
 | 
						|
        4.1125
 | 
						|
        >>> round(_, 2)
 | 
						|
        4.11
 | 
						|
        >>> 
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
\section{String Literals}
 | 
						|
 | 
						|
\subsection{Double Quotes}
 | 
						|
 | 
						|
Python can now also use double quotes to surround string literals,
 | 
						|
e.g. \verb\"this doesn't hurt a bit"\.
 | 
						|
 | 
						|
\subsection{Continuation Of String Literals}
 | 
						|
 | 
						|
String literals can span multiple lines by escaping newlines with
 | 
						|
backslashes, e.g.
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
        hello = "This is a rather long string containing\n\
 | 
						|
        several lines of text just as you would do in C.\n\
 | 
						|
            Note that whitespace at the beginning of the line is\
 | 
						|
         significant.\n"
 | 
						|
        print hello
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
which would print the following:
 | 
						|
\begin{verbatim}
 | 
						|
        This is a rather long string containing
 | 
						|
        several lines of text just as you would do in C.
 | 
						|
            Note that whitespace at the beginning of the line is significant.
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
\subsection{Triple-quoted strings}
 | 
						|
 | 
						|
In some cases, when you need to include really long strings (e.g.
 | 
						|
containing several paragraphs of informational text), it is annoying
 | 
						|
that you have to terminate each line with \verb@\n\@, especially if
 | 
						|
you would like to reformat the text occasionally with a powerful text
 | 
						|
editor like Emacs.  For such situations, ``triple-quoted'' strings can
 | 
						|
be used, e.g.
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
        hello = """
 | 
						|
 | 
						|
            This string is bounded by triple double quotes (3 times ").
 | 
						|
        Newlines in the string are retained, though \
 | 
						|
        it is still possible\nto use all normal escape sequences.
 | 
						|
 | 
						|
            Whitespace at the beginning of a line is
 | 
						|
        significant.  If you need to include three opening quotes
 | 
						|
        you have to escape at least one of them, e.g. \""".
 | 
						|
 | 
						|
            This string ends in a newline.
 | 
						|
        """
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
Note that there is no semantic difference between strings quoted with
 | 
						|
single quotes (\verb/'/) or double quotes (\verb\"\).
 | 
						|
 | 
						|
\subsection{String Literal Juxtaposition}
 | 
						|
 | 
						|
One final twist: you can juxtapose multiple string literals.  Two or
 | 
						|
more adjacent string literals (but not arbitrary expressions!)
 | 
						|
separated only by whitespace will be concatenated (without intervening
 | 
						|
whitespace) into a single string object at compile time.  This makes
 | 
						|
it possible to continue a long string on the next line without
 | 
						|
sacrificing indentation or performance, unlike the use of the string
 | 
						|
concatenation operator \verb\+\ or the continuation of the literal
 | 
						|
itself on the next line (since leading whitespace is significant
 | 
						|
inside all types of string literals).  Note that this feature, like
 | 
						|
all string features except triple-quoted strings, is borrowed from
 | 
						|
Standard C.
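
For example (a sketch; the variable names are invented, and the
parentheses are only there to let the expression continue on the next
line):

```python
message = ('This sentence is written across two source lines '
           'but forms a single string object.')
pair = 'abc' 'def'       # 'abcdef'
```

The indentation of the second source line does not end up in
\verb\message\, because it lies outside the string literals.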
 | 
						|
 | 
						|
\section{The Formatting Operator}
 | 
						|
 | 
						|
\subsection{Basic Usage}
 | 
						|
 | 
						|
The chapter on output formatting is really out of date: there is now
 | 
						|
an almost complete interface to C-style printf formats.  This is done
 | 
						|
by overloading the modulo operator (\verb\%\) for a left operand
 | 
						|
which is a string, e.g.
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
        >>> import math
 | 
						|
        >>> print 'The value of PI is approximately %5.3f.' % math.pi
 | 
						|
        The value of PI is approximately 3.142.
 | 
						|
        >>> 
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
If there is more than one format in the string, you pass a tuple as
the right operand, e.g.
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
        >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
 | 
						|
        >>> for name, phone in table.items():
 | 
						|
        ...     print '%-10s ==> %10d' % (name, phone)
 | 
						|
        ... 
 | 
						|
        Jack       ==>       4098
 | 
						|
        Dcab       ==>    8637678
 | 
						|
        Sjoerd     ==>       4127
 | 
						|
        >>> 
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
Most formats work exactly as in C and require that you pass the proper
 | 
						|
type (however, if you don't you get an exception, not a core dump).
 | 
						|
The \verb\%s\ format is more relaxed: if the corresponding argument is
 | 
						|
not a string object, it is converted to string using the \verb\str()\
 | 
						|
built-in function.  Using \verb\*\ to pass the width or precision in
 | 
						|
as a separate (integer) argument is supported.  The C formats
 | 
						|
\verb\%n\ and \verb\%p\ are not supported.
 | 
						|
 | 
						|
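
The rules above can be tried out directly.  The following is only a
small sketch: the values and the variable names (\verb\relaxed\,
\verb\row\, \verb\rounded\ and \verb\outcome\) are made up for
illustration, and the results are stored in variables rather than
printed.

\begin{verbatim}
# %s accepts any object and converts it with str()
relaxed = '%s and %s' % (42, [1, 2])       # '42 and [1, 2]'

# * takes the width (or precision) from the argument tuple
row = '%-*s|%*d' % (8, 'spam', 6, 42)      # 'spam    |    42'
rounded = '%.*f' % (2, 3.14159)            # '3.14'

# a wrong argument type raises an exception instead of crashing
try:
    '%d' % 'spam'
    outcome = 'formatted'
except TypeError:
    outcome = 'TypeError'
\end{verbatim}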
\subsection{Referencing Variables By Name}

If you have a really long format string that you don't want to split
up, it would be nice if you could reference the variables to be
formatted by name instead of by position.  This can be done with
an extension of C formats of the form \verb\%(name)format\, e.g.

\begin{verbatim}
        >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
        >>> print 'Jack: %(Jack)d; Sjoerd: %(Sjoerd)d; Dcab: %(Dcab)d' % table
        Jack: 4098; Sjoerd: 4127; Dcab: 8637678
        >>> 
\end{verbatim}

This is particularly useful in combination with the new built-in
\verb\vars()\ function, which returns a dictionary containing all
local variables.

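
As a small sketch of this combination, the hypothetical function
\verb\describe()\ below formats its own arguments by name.  The
function name and the sample values are assumptions made for the
example.

\begin{verbatim}
def describe(name, phone):
    # inside a function, vars() holds the local variables,
    # here the two arguments 'name' and 'phone'
    return '%(name)-10s ==> %(phone)10d' % vars()

line = describe('Jack', 4098)   # name padded to 10, number to 10
\end{verbatim}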
\section{Optional Function Arguments}

It is now possible to define functions with a variable number of
arguments.  There are two forms, which can be combined.

\subsection{Default Argument Values}

The most useful form is to specify a default value for one or more
arguments.  This creates a function that can be called with fewer
arguments than it is defined to accept, e.g.

\begin{verbatim}
        def ask_ok(prompt, retries = 4, complaint = 'Yes or no, please!'):
                while 1:
                        ok = raw_input(prompt)
                        if ok in ('y', 'ye', 'yes'): return 1
                        if ok in ('n', 'no', 'nop', 'nope'): return 0
                        retries = retries - 1
                        if retries < 0: raise IOError, 'refusenik user'
                        print complaint
\end{verbatim}

This function can be called either like this:
\verb\ask_ok('Do you really want to quit?')\ or like this:
\verb\ask_ok('OK to overwrite the file?', 2)\.

The default values are evaluated at the point of function definition
in the {\em defining} scope, so that e.g.

\begin{verbatim}
        i = 5
        def f(arg = i): print arg
        i = 6
        f()
\end{verbatim}

will print \verb\5\.

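
One consequence of evaluating defaults only once, at definition time,
is that a mutable default such as a list is shared between calls.  The
function \verb\remember()\ below is a made-up illustration of this
effect, not a recommended idiom.

\begin{verbatim}
def remember(item, seen = []):
    # 'seen' is the one list created when the def statement ran
    seen.append(item)
    return seen

first = remember('spam')
second = remember('eggs')      # the same list again, now with two items
\end{verbatim}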
\subsection{Arbitrary Argument Lists}

It is also possible to specify that a function can be called with an
arbitrary number of arguments.  These arguments will be wrapped up in
a tuple.  Before the variable number of arguments, zero or more normal
arguments may occur, e.g.

\begin{verbatim}
        def fprintf(file, format, *args):
                file.write(format % args)
\end{verbatim}

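
A possible use of \verb\fprintf()\ is sketched below.  The
\verb\Collector\ class is a hypothetical stand-in for a file: any
object with a \verb\write()\ method will do, and the format strings
are invented for the example.

\begin{verbatim}
class Collector:
    # minimal file-like object that keeps what is written to it
    def __init__(self):
        self.chunks = []
    def write(self, text):
        self.chunks.append(text)

def fprintf(file, format, *args):
    file.write(format % args)

out = Collector()
fprintf(out, '%s has %d lines\n', 'spam.py', 12)   # two extra arguments
fprintf(out, 'done\n')                             # none: args is ()
\end{verbatim}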
This feature may be combined with the previous, e.g.

\begin{verbatim}
        def but_is_it_useful(required, optional = None, *remains):
                print "I don't know"
\end{verbatim}

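
How the arguments are distributed can be seen with a function that
simply returns what it received.  The name \verb\show()\ is
hypothetical; it combines a required argument, a default value and an
arbitrary argument list.

\begin{verbatim}
def show(required, optional = None, *remains):
    return (required, optional, remains)

one = show(1)              # (1, None, ())
two = show(1, 2)           # (1, 2, ())
many = show(1, 2, 3, 4)    # (1, 2, (3, 4))
\end{verbatim}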
\section{Lambda And Functional Programming Tools}

\subsection{Lambda Forms}

By popular demand, a few features commonly found in functional
programming languages and Lisp have been added to Python.  With the
\verb\lambda\ keyword, small anonymous functions can be created.
Here's a function that returns the sum of its two arguments:
\verb\lambda a, b: a+b\.  Lambda forms can be used wherever function
objects are required.  They are syntactically restricted to a single
expression.  Semantically, they are just syntactic sugar for a normal
function definition.  Like nested function definitions, lambda forms
cannot reference variables from the containing scope, but this can be
overcome through the judicious use of default argument values, e.g.

\begin{verbatim}
        def make_incrementor(n):
                return lambda x, incr=n: x+incr
\end{verbatim}

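
For instance, each call of \verb\make_incrementor()\ yields a separate
function that remembers its own value of \verb\n\.  The names
\verb\add_five\ and \verb\add_ten\ are chosen for this sketch only.

\begin{verbatim}
def make_incrementor(n):
    return lambda x, incr=n: x+incr

add_five = make_incrementor(5)
add_ten = make_incrementor(10)
results = (add_five(1), add_ten(1))    # (6, 11)
\end{verbatim}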
\subsection{Map, Reduce and Filter}

Three new built-in functions on sequences are good candidates to pass
lambda forms to.

\subsubsection{Map.}

\verb\map(function, sequence)\ calls \verb\function(item)\ for each of
the sequence's items and returns a list of the return values.  For
example, to compute some cubes:

\begin{verbatim}
        >>> map(lambda x: x*x*x, range(1, 11))
        [1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
        >>>
\end{verbatim}

More than one sequence may be passed; the function must then have as
many arguments as there are sequences and is called with the
corresponding item from each sequence (or \verb\None\ if some sequence
is shorter than another).  If \verb\None\ is passed for the function,
a function returning its argument(s) is substituted.

Combining these two special cases, we see that
\verb\map(None, list1, list2)\ is a convenient way of turning a pair
of lists into a list of pairs.  For example:

\begin{verbatim}
        >>> seq = range(8)
        >>> map(None, seq, map(lambda x: x*x, seq))
        [(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49)]
        >>> 
\end{verbatim}

\subsubsection{Filter.}

\verb\filter(function, sequence)\ returns a sequence (of the same
type, if possible) consisting of those items from the sequence for
which \verb\function(item)\ is true.  For example, to compute some
primes:

\begin{verbatim}
        >>> filter(lambda x: x%2 != 0 and x%3 != 0, range(2, 25))
        [5, 7, 11, 13, 17, 19, 23]
        >>>
\end{verbatim}

\subsubsection{Reduce.}

\verb\reduce(function, sequence)\ returns a single value constructed
by calling the (binary) function on the first two items of the
sequence, then on the result and the next item, and so on.  For
example, to compute the sum of the numbers 1 through 10:

\begin{verbatim}
        >>> reduce(lambda x, y: x+y, range(1, 11))
        55
        >>> 
\end{verbatim}

If there's only one item in the sequence, its value is returned; if
the sequence is empty, an exception is raised.

A third argument can be passed to indicate the starting value.  In this
case the starting value is returned for an empty sequence, and the
function is first applied to the starting value and the first sequence
item, then to the result and the next item, and so on.  For example,

\begin{verbatim}
        >>> def sum(seq):
        ...     return reduce(lambda x, y: x+y, seq, 0)
        ... 
        >>> sum(range(1, 11))
        55
        >>> sum([])
        0
        >>> 
\end{verbatim}

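
The behaviour of \verb\reduce()\ described here can be summarized in
ordinary Python.  The function \verb\my_reduce()\ below is a
hypothetical re-implementation written for illustration only, not the
code of the built-in; in particular, the kind of exception raised for
an empty sequence without a starting value (here, whatever indexing
raises) is an assumption of the sketch.

\begin{verbatim}
def my_reduce(function, sequence, *start):
    if start:
        # starting value given: combine it with every item
        result, rest = start[0], sequence
    else:
        # otherwise begin with the first item; fails if there is none
        result, rest = sequence[0], sequence[1:]
    for item in rest:
        result = function(result, item)
    return result

def add(x, y): return x+y

total = my_reduce(add, range(1, 11))    # 55
single = my_reduce(add, [7])            # 7: only one item
empty = my_reduce(add, [], 0)           # 0: the starting value
\end{verbatim}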
\section{Continuation Lines Without Backslashes}

While the general mechanism for continuation of a source line on the
next physical line remains to place a backslash on the end of the
line, expressions inside matched parentheses (or square brackets, or
curly braces) can now also be continued without using a backslash.
This is particularly useful for calls to functions with many
arguments, and for initializations of large tables.

For example:

\begin{verbatim}
        month_names = ['Januari', 'Februari', 'Maart', 
                       'April',   'Mei',      'Juni', 
                       'Juli',    'Augustus', 'September',
                       'Oktober', 'November', 'December']
\end{verbatim}

and

\begin{verbatim}
        CopyInternalHyperLinks(self.context.hyperlinks,
                               copy.context.hyperlinks,
                               uidremap)
\end{verbatim}

\section{Regular Expressions}

While C's printf-style output formats, transformed into Python, are
adequate for most output formatting jobs, C's scanf-style input
formats are not very powerful.  Instead of scanf-style input, Python
offers Emacs-style regular expressions as a powerful input and
scanning mechanism.  Read the corresponding section in the Library
Reference for a full description.

\section{Generalized Dictionaries}

The keys of dictionaries are no longer restricted to strings -- they
can be any immutable basic type, including strings, numbers, tuples,
or (certain) class instances.  (Lists and dictionaries are not
acceptable as dictionary keys, in order to avoid problems when the
object used as a key is modified.)

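
A brief sketch of these rules follows.  The dictionary \verb\board\
and its contents are invented for the example, and it is assumed here
that an unacceptable key is rejected with a \verb\TypeError\
exception.

\begin{verbatim}
# tuples of numbers are immutable and can serve as keys
board = {(0, 0): 'rook', (0, 1): 'knight'}
board[(7, 4)] = 'king'
piece = board[(0, 1)]               # 'knight'

# a list is mutable and is refused as a key
try:
    board[[0, 2]] = 'bishop'
    accepted = 1
except TypeError:
    accepted = 0
\end{verbatim}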
Dictionaries have two new methods: \verb\d.values()\ returns a list of
the dictionary's values, and \verb\d.items()\ returns a list of the
dictionary's (key, value) pairs.  Like \verb\d.keys()\, these
operations are slow for large dictionaries.  Examples:

\begin{verbatim}
        >>> d = {100: 'honderd', 1000: 'duizend', 10: 'tien'}
        >>> d.keys()
        [100, 10, 1000]
        >>> d.values()
        ['honderd', 'tien', 'duizend']
        >>> d.items()
        [(100, 'honderd'), (10, 'tien'), (1000, 'duizend')]
        >>> 
\end{verbatim}

\section{Miscellaneous New Built-in Functions}

The function \verb\vars()\ returns a dictionary containing the current
local variables.  With a module argument, it returns that module's
global variables.  The old function \verb\dir(x)\ returns
\verb\vars(x).keys()\.

The function \verb\round(x)\ returns a floating point number rounded
to the nearest integer (but still expressed as a floating point
number).  E.g. \verb\round(3.4) == 3.0\ and \verb\round(3.5) == 4.0\.
With a second argument it rounds to the specified number of digits,
e.g. \verb\round(math.pi, 4) == 3.1416\ or even
\verb\round(123.4, -2) == 100.0\.

The function \verb\hash(x)\ returns a hash value for an object.
All object types acceptable as dictionary keys have a hash value (and
it is this hash value that the dictionary implementation uses).

The function \verb\id(x)\ returns a unique identifier for an object.
For two objects x and y, \verb\id(x) == id(y)\ if and only if
\verb\x is y\.  (In fact the object's address is used.)

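
The difference between identity, as reported by \verb\id()\, and
equality is illustrated by the following sketch; the variable names
are arbitrary.

\begin{verbatim}
a = [1, 2, 3]
b = a                  # a second name for the same object
c = [1, 2, 3]          # an equal, but separate, object

same = (id(a) == id(b), a is b)        # both true
other = (id(a) == id(c), a == c)       # different ids, equal values
\end{verbatim}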
The function \verb\hasattr(x, name)\ returns whether an object has an
attribute with the given name (a string value).  The function
\verb\getattr(x, name)\ returns the object's attribute with the given
name.  The function \verb\setattr(x, name, value)\ assigns a value to
an object's attribute with the given name.  These three functions are
useful if the attribute names are not known beforehand.  Note that
\verb\getattr(x, 'spam')\ is equivalent to \verb\x.spam\, and
\verb\setattr(x, 'spam', y)\ is equivalent to \verb\x.spam = y\.  By
definition, \verb\hasattr(x, name)\ returns true if and only if
\verb\getattr(x, name)\ returns without raising an exception.

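
As a small sketch, attribute names taken from data can be handled as
follows.  The class \verb\Record\ and the field names are
hypothetical.

\begin{verbatim}
class Record:
    pass

fields = [('spam', 1), ('eggs', 2)]     # names only known at run time
r = Record()
for name, value in fields:
    setattr(r, name, value)             # same effect as r.spam = 1, ...

total = 0
for name in ('spam', 'eggs', 'ham'):
    if hasattr(r, name):                # 'ham' was never set
        total = total + getattr(r, name)
\end{verbatim}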
\section{Else Clause For Try Statement}

The \verb\try...except\ statement now has an optional \verb\else\
clause, which must follow all \verb\except\ clauses.  It is useful for
code that must be executed if the \verb\try\ clause does not
raise an exception.  For example:

\begin{verbatim}
        import sys

        for arg in sys.argv[1:]:
                try:
                        f = open(arg, 'r')
                except IOError:
                        print 'cannot open', arg
                else:
                        print arg, 'has', len(f.readlines()), 'lines'
                        f.close()
\end{verbatim}

\section{New Class Features in Release 1.1}

Some changes have been made to classes: the operator overloading
mechanism is more flexible, providing more support for non-numeric use
of operators (including calling an object as if it were a function),
and it is possible to trap attribute accesses.

\subsection{New Operator Overloading}

It is no longer necessary to coerce both sides of an operator to the
same class or type.  A class may still provide a \code{__coerce__}
method, but this method may return objects of different types or
classes if it feels like it.  If no \code{__coerce__} is defined, any
argument type or class is acceptable.

In order to make it possible to implement binary operators where the
right-hand side is a class instance but the left-hand side is not,
without using coercions, right-hand versions of all binary operators
may be defined.  These have an `r' prepended to their name,
e.g. \code{__radd__}.

For example, here's a very simple class for representing times.  Times
are initialized from a number of seconds (like time.time()).  Times
are printed like this: \code{Thu Oct 6 14:20:06 1994}.  Subtracting
two Times gives their difference in seconds.  Adding or subtracting a
Time and a number gives a new Time.  You can't add two Times, nor can
you subtract a Time from a number.

\begin{verbatim}
import time

class Time:
    def __init__(self, seconds):
        self.seconds = seconds
    def __repr__(self):
        return time.ctime(self.seconds)
    def __add__(self, x):
        return Time(self.seconds + x)
    __radd__ = __add__            # support for x+t
    def __sub__(self, x):
        if hasattr(x, 'seconds'): # test if x could be a Time
            return self.seconds - x.seconds
        else:
            return Time(self.seconds - x)

now = Time(time.time())
tomorrow = 24*3600 + now
yesterday = now - 24*3600
print tomorrow - yesterday        # two days, i.e. 172800 seconds
\end{verbatim}

\subsection{Trapping Attribute Access}

You can define three new ``magic'' methods in a class now:
\code{__getattr__(self, name)}, \code{__setattr__(self, name, value)}
and \code{__delattr__(self, name)}.

The \code{__getattr__} method is called when an attribute access fails,
i.e. when an attribute access would otherwise raise AttributeError --
this is {\em after} the instance's dictionary and its class hierarchy
have been searched for the named attribute.  Note that if this method
attempts to access any undefined instance attribute it will be called
recursively!

The \code{__setattr__} and \code{__delattr__} methods are called when
assignment to or deletion of an attribute is attempted, respectively.
They are called {\em instead} of the normal action (which is to insert
or delete the attribute in the instance dictionary).  If either of
these methods must set or delete any attribute, it can only do so by
using the instance dictionary directly -- \code{self.__dict__} -- else
it would be called recursively.

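
A minimal sketch of this technique is a class that checks every value
before storing it.  The class \verb\NonNegative\ is invented for this
example, and the exception is raised in the call form
\verb\ValueError(...)\, which assumes an interpreter in which
exceptions are classes.

\begin{verbatim}
class NonNegative:
    def __setattr__(self, name, value):
        if value < 0:
            raise ValueError(name + ' must not be negative')
        # store through the instance dictionary; writing
        # 'setattr(self, name, value)' here would recurse forever
        self.__dict__[name] = value
    def __delattr__(self, name):
        del self.__dict__[name]

p = NonNegative()
p.width = 3            # accepted, stored in p.__dict__
try:
    p.width = -1       # rejected, the old value is kept
except ValueError:
    pass
\end{verbatim}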
For example, here's a near-universal ``Wrapper'' class that passes all
its attribute accesses to another object.  Note how the
\code{__init__} method inserts the wrapped object in
\code{self.__dict__} in order to avoid endless recursion
(\code{__setattr__} would call \code{__getattr__} which would call
itself recursively).

\begin{verbatim}
class Wrapper:
    def __init__(self, wrapped):
        self.__dict__['wrapped'] = wrapped
    def __getattr__(self, name):
        return getattr(self.wrapped, name)
    def __setattr__(self, name, value):
        setattr(self.wrapped, name, value)
    def __delattr__(self, name):
        delattr(self.wrapped, name)

import sys
f = Wrapper(sys.stdout)
f.write('hello world\n')          # prints 'hello world'
\end{verbatim}

A simpler example of \code{__getattr__} is an attribute that is
computed each time (or the first time) it is accessed.  For instance:

\begin{verbatim}
from math import pi

class Circle:
    def __init__(self, radius):
        self.radius = radius
    def __getattr__(self, name):
        if name == 'circumference':
            return 2 * pi * self.radius
        if name == 'diameter':
            return 2 * self.radius
        if name == 'area':
            return pi * pow(self.radius, 2)
        raise AttributeError, name
\end{verbatim}

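
The ``first time'' variant can be sketched by storing the computed
value in the instance, after which \code{__getattr__} is no longer
consulted for that name.  The class \verb\CachedCircle\ and its
\verb\computed\ counter are hypothetical additions for this
illustration, and the exception is raised in the call form, which
assumes an interpreter in which exceptions are classes.  Note that a
cached value is not updated when \verb\radius\ changes later.

\begin{verbatim}
from math import pi

class CachedCircle:
    def __init__(self, radius):
        self.radius = radius
        self.computed = 0          # counts calls that compute the area
    def __getattr__(self, name):
        if name == 'area':
            self.computed = self.computed + 1
            value = pi * pow(self.radius, 2)
            self.__dict__[name] = value    # found directly next time
            return value
        raise AttributeError(name)

c = CachedCircle(2)
first = c.area         # computed by __getattr__
second = c.area        # taken from the instance dictionary
\end{verbatim}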
\subsection{Calling a Class Instance}

If a class defines a method \code{__call__} it is possible to call its
instances as if they were functions.  For example:

\begin{verbatim}
class PresetSomeArguments:
    def __init__(self, func, *args):
        self.func, self.args = func, args
    def __call__(self, *args):
        return apply(self.func, self.args + args)

f = PresetSomeArguments(pow, 2)    # f(i) computes powers of 2
for i in range(10): print f(i),    # prints 1 2 4 8 16 32 64 128 256 512
print                              # append newline
\end{verbatim}

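
Unlike a plain function, a callable instance can keep state between
calls.  The class \verb\Accumulator\ below is a hypothetical example
of this.

\begin{verbatim}
class Accumulator:
    def __init__(self, total = 0):
        self.total = total
    def __call__(self, amount):
        # add to the running total and report the new value
        self.total = self.total + amount
        return self.total

acc = Accumulator(10)
values = (acc(5), acc(7))      # (15, 22)
\end{verbatim}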
\end{document}