:tocdepth: 2

=========================
Library and Extension FAQ
=========================

.. contents::

General Library Questions
=========================

How do I find a module or application to perform task X?
--------------------------------------------------------

Check :ref:`the Library Reference <library-index>` to see if there's a relevant
standard library module.  (Eventually you'll learn what's in the standard
library and will be able to skip this step.)

For third-party packages, search the `Python Package Index
<http://pypi.python.org/pypi>`_ or try `Google <http://www.google.com>`_ or
another Web search engine.  Searching for "Python" plus a keyword or two for
your topic of interest will usually find something helpful.


Where is the math.py (socket.py, regex.py, etc.) source file?
-------------------------------------------------------------

If you can't find a source file for a module it may be a built-in or
dynamically loaded module implemented in C, C++ or another compiled language.
In this case you may not have the source file or it may be something like
``mathmodule.c``, somewhere in a C source directory (not on the Python path).

There are (at least) three kinds of modules in Python:

1) modules written in Python (.py);
2) modules written in C and dynamically loaded (.dll, .pyd, .so, .sl, etc.);
3) modules written in C and linked with the interpreter; to get a list of these,
   type::

      import sys
      print(sys.builtin_module_names)


How do I make a Python script executable on Unix?
-------------------------------------------------

You need to do two things: the script file's mode must be executable and the
first line must begin with ``#!`` followed by the path of the Python
interpreter.

The first is done by executing ``chmod +x scriptfile`` or perhaps ``chmod 755
scriptfile``.

The second can be done in a number of ways.  The most straightforward way is to
write ::

  #!/usr/local/bin/python

as the very first line of your file, using the pathname for where the Python
interpreter is installed on your platform.

If you would like the script to be independent of where the Python interpreter
lives, you can use the "env" program.  Almost all Unix variants support the
following, assuming the Python interpreter is in a directory on the user's
$PATH::

  #!/usr/bin/env python

*Don't* do this for CGI scripts.  The $PATH variable for CGI scripts is often
very minimal, so you need to use the actual absolute pathname of the
interpreter.

Occasionally, a user's environment is so full that the /usr/bin/env program
fails; or there's no env program at all.  In that case, you can try the
following hack (due to Alex Rezinsky)::

   #! /bin/sh
   """:"
   exec python $0 ${1+"$@"}
   """

The minor disadvantage is that this defines the script's __doc__ string.
However, you can fix that by adding ::

   __doc__ = """...Whatever..."""



Is there a curses/termcap package for Python?
---------------------------------------------

.. XXX curses *is* built by default, isn't it?

For Unix variants: The standard Python source distribution comes with a curses
module in the ``Modules/`` subdirectory, though it's not compiled by default
(note that this is not available in the Windows distribution -- there is no
curses module for Windows).

The curses module supports basic curses features as well as many additional
functions from ncurses and SYSV curses such as colour, alternative character set
support, pads, and mouse support. This means the module isn't compatible with
operating systems that only have BSD curses, but there don't seem to be any
currently maintained OSes that fall into this category.

For Windows: use `the consolelib module
<http://effbot.org/zone/console-index.htm>`_.


Is there an equivalent to C's onexit() in Python?
-------------------------------------------------

The :mod:`atexit` module provides a :func:`~atexit.register` function that is
similar to C's ``onexit()``.


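As a minimal sketch of how registration might look (the ``cleanup`` function and
its argument are made up for illustration), the following registers a hook that
runs when the interpreter exits normally:

```python
import atexit

log = []


def cleanup(resource_name):
    # Hypothetical cleanup hook; a real one might close files or sockets.
    log.append(f"released {resource_name}")
    print(log[-1])


# register() forwards any extra arguments to the hook when the interpreter
# exits normally, and returns the function it was given.
registered = atexit.register(cleanup, "database connection")
```

A hook that is no longer wanted can be removed again with
``atexit.unregister(cleanup)``.

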
Why don't my signal handlers work?
----------------------------------

The most common problem is that the signal handler is declared with the wrong
argument list.  It is called as ::

   handler(signum, frame)

so it should be declared with two arguments::

   def handler(signum, frame):
       ...


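For illustration, here is one way a correctly declared handler could be
installed and exercised.  This is only a sketch: it assumes it runs in the main
thread (the only place handlers can be installed) and uses
``signal.raise_signal()``, which needs Python 3.8 or later; the ``received``
list is just a name chosen for this example.

```python
import signal

received = []


def handler(signum, frame):
    # signum is the signal number; frame is the interrupted stack frame.
    received.append(signum)


# Install the handler, remembering the previous one so it can be restored.
previous = signal.signal(signal.SIGINT, handler)
try:
    signal.raise_signal(signal.SIGINT)  # deliver SIGINT to this process
finally:
    signal.signal(signal.SIGINT, previous)
```

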
Common tasks
============

How do I test a Python program or component?
--------------------------------------------

Python comes with two testing frameworks.  The :mod:`doctest` module finds
examples in the docstrings for a module and runs them, comparing the output with
the expected output given in the docstring.

The :mod:`unittest` module is a fancier testing framework modelled on Java and
Smalltalk testing frameworks.

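To give a feel for both frameworks, here is a small sketch that tests one
function with a docstring example and with a ``unittest`` test case.  The
``average`` function and the ``run_all_tests`` helper are invented for this
example; in a real project you would more likely run ``python -m doctest`` or
``python -m unittest`` from the command line.

```python
import doctest
import unittest


def average(values):
    """Return the arithmetic mean of a non-empty sequence.

    >>> average([1, 2, 3])
    2.0
    """
    return sum(values) / len(values)


class AverageTests(unittest.TestCase):
    def test_typical_values(self):
        self.assertEqual(average([2, 4]), 3.0)

    def test_empty_sequence(self):
        with self.assertRaises(ZeroDivisionError):
            average([])


def run_all_tests():
    """Run the docstring example and the unittest cases; True on success."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    for test in finder.find(average, "average", globs={"average": average}):
        runner.run(test)
    doc_ok = runner.summarize(verbose=False).failed == 0

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageTests)
    unit_ok = unittest.TextTestRunner(verbosity=0).run(suite).wasSuccessful()
    return doc_ok and unit_ok
```
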
It helps to use good modular design so that the program is easy to test.  Your
program should have almost all functionality
encapsulated in either functions or class methods -- and this sometimes has the
surprising and delightful effect of making the program run faster (because local
variable accesses are faster than global accesses).  Furthermore the program
should avoid depending on mutating global variables, since this makes testing
much more difficult to do.

The "global main logic" of your program may be as simple as ::

   if __name__ == "__main__":
       main_logic()

at the bottom of the main module of your program.

Once your program is organized as a tractable collection of functions and class
behaviours you should write test functions that exercise the behaviours.  A test
suite that automates a sequence of tests can be associated with each module.
This sounds like a lot of work, but since Python is so terse and flexible it's
surprisingly easy.  You can make coding much more pleasant and fun by writing
your test functions in parallel with the "production code", since this makes it
easy to find bugs and even design flaws earlier.

"Support modules" that are not intended to be the main module of a program may
include a self-test of the module. ::

   if __name__ == "__main__":
       self_test()

Even programs that interact with complex external interfaces may be tested when
the external interfaces are unavailable by using "fake" interfaces implemented
in Python.


How do I create documentation from doc strings?
-----------------------------------------------

The :mod:`pydoc` module can create HTML from the doc strings in your Python
source code.  An alternative for creating API documentation purely from
docstrings is `epydoc <http://epydoc.sf.net/>`_.  `Sphinx
<http://sphinx.pocoo.org>`_ can also include docstring content.


How do I get a single keypress at a time?
-----------------------------------------

For Unix variants: There are several solutions.  It's straightforward to do this
using curses, but curses is a fairly large module to learn.

.. XXX this doesn't work out of the box, some IO expert needs to check why

   Here's a solution without curses::

   import termios, fcntl, sys, os
   fd = sys.stdin.fileno()

   oldterm = termios.tcgetattr(fd)
   newattr = termios.tcgetattr(fd)
   newattr[3] = newattr[3] & ~termios.ICANON & ~termios.ECHO
   termios.tcsetattr(fd, termios.TCSANOW, newattr)

   oldflags = fcntl.fcntl(fd, fcntl.F_GETFL)
   fcntl.fcntl(fd, fcntl.F_SETFL, oldflags | os.O_NONBLOCK)

   try:
       while True:
           try:
               c = sys.stdin.read(1)
               print("Got character", repr(c))
           except IOError:
               pass
   finally:
       termios.tcsetattr(fd, termios.TCSAFLUSH, oldterm)
       fcntl.fcntl(fd, fcntl.F_SETFL, oldflags)

   You need the :mod:`termios` and the :mod:`fcntl` module for any of this to
   work, and I've only tried it on Linux, though it should work elsewhere.  In
   this code, characters are read and printed one at a time.

   :func:`termios.tcsetattr` turns off stdin's echoing and disables canonical
   mode.  :func:`fcntl.fcntl` is used to obtain stdin's file descriptor flags
   and modify them for non-blocking mode.  Since reading stdin when it is empty
   results in an :exc:`IOError`, this error is caught and ignored.


Threads
=======

How do I program using threads?
-------------------------------

Be sure to use the :mod:`threading` module and not the :mod:`_thread` module.
The :mod:`threading` module builds convenient abstractions on top of the
low-level primitives provided by the :mod:`_thread` module.

Aahz has a set of slides from his threading tutorial that are helpful; see
http://www.pythoncraft.com/OSCON2001/.


None of my threads seem to run: why?
------------------------------------

As soon as the main thread exits, all threads are killed.  Your main thread is
running too quickly, giving the threads no time to do any work.

A simple fix is to add a sleep to the end of the program that's long enough for
all the threads to finish::

   import threading, time

   def thread_task(name, n):
       for i in range(n):
           print(name, i)

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10)  # <---------------------------!

But now (on many platforms) the threads don't run in parallel, but appear to run
sequentially, one at a time!  The reason is that the OS thread scheduler doesn't
start a new thread until the previous thread is blocked.

A simple fix is to add a tiny sleep to the start of the run function::

   def thread_task(name, n):
       time.sleep(0.001)  # <--------------------!
       for i in range(n):
           print(name, i)

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10)

Instead of trying to guess what :func:`time.sleep` delay will be long enough,
it's better to use some kind of semaphore mechanism.  One idea is to use the
:mod:`queue` module to create a queue object, let each thread append a token to
the queue when it finishes, and let the main thread read as many tokens from the
queue as there are threads.


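The token idea can be sketched as follows.  The function names and the
``results`` list are invented for this example; each thread puts its name on a
``done`` queue when it finishes, and the main thread blocks until it has
collected one token per thread.

```python
import queue
import threading


def thread_task(name, n, done, results):
    # Do the work, then announce completion by putting a token on the queue.
    results.append((name, list(range(n))))
    done.put(name)


def run_threads(count):
    done = queue.Queue()
    results = []
    for i in range(count):
        t = threading.Thread(target=thread_task,
                             args=(str(i), i, done, results))
        t.start()
    # Block until one token per thread has arrived; no sleep is needed.
    finished = [done.get() for _ in range(count)]
    return sorted(finished), results
```

Calling ``run_threads(10)`` returns only after all ten threads have reported
in, however long each one takes.

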
How do I parcel out work among a bunch of worker threads?
---------------------------------------------------------

The easiest way is to use the :mod:`concurrent.futures` module,
especially the :class:`~concurrent.futures.ThreadPoolExecutor` class.

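A minimal sketch of the executor approach, assuming a made-up ``square`` job
and a pool of five workers, might look like this:

```python
from concurrent.futures import ThreadPoolExecutor


def square(n):
    # Stand-in for a real job such as a network request or a file read.
    return n * n


def run_jobs(jobs, max_workers=5):
    # The executor owns the worker threads; leaving the "with" block
    # waits for all of them to finish.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() hands each job to a worker and yields results in input order.
        return list(pool.map(square, jobs))
```
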
Or, if you want fine control over the dispatching algorithm, you can write
your own logic manually.  Use the :mod:`queue` module to create a queue
containing a list of jobs.  The :class:`~queue.Queue` class maintains a
list of objects and has a ``.put(obj)`` method that adds items to the queue and
a ``.get()`` method to return them.  The class will take care of the locking
necessary to ensure that each job is handed out exactly once.

Here's a trivial example::

   import threading, queue, time

   # The worker thread gets jobs off the queue.  When the queue is empty, it
   # assumes there will be no more work and exits.
   # (Realistically workers will run until terminated.)
   def worker():
       print('Running worker')
       time.sleep(0.1)
       while True:
           try:
               arg = q.get(block=False)
           except queue.Empty:
               print('Worker', threading.current_thread(), end=' ')
               print('queue empty')
               break
           else:
               print('Worker', threading.current_thread(), end=' ')
               print('running with argument', arg)
               time.sleep(0.5)

   # Create queue
   q = queue.Queue()

   # Start a pool of 5 workers
   for i in range(5):
       t = threading.Thread(target=worker, name='worker %i' % (i+1))
       t.start()

   # Begin adding work to the queue
   for i in range(50):
       q.put(i)

   # Give threads time to run
   print('Main thread sleeping')
   time.sleep(5)

When run, this will produce output similar to the following::

   Running worker
   Running worker
   Running worker
   Running worker
   Running worker
   Main thread sleeping
   Worker <Thread(worker 1, started 130283832797456)> running with argument 0
   Worker <Thread(worker 2, started 130283824404752)> running with argument 1
   Worker <Thread(worker 3, started 130283816012048)> running with argument 2
   Worker <Thread(worker 4, started 130283807619344)> running with argument 3
   Worker <Thread(worker 5, started 130283799226640)> running with argument 4
   Worker <Thread(worker 1, started 130283832797456)> running with argument 5
   ...

Consult the module's documentation for more details; the ``Queue`` class
provides a featureful interface.


What kinds of global value mutation are thread-safe?
----------------------------------------------------

A :term:`global interpreter lock` (GIL) is used internally to ensure that only one
thread runs in the Python VM at a time.  In general, Python offers to switch
among threads only between bytecode instructions; how frequently it switches can
be set via :func:`sys.setswitchinterval`.  Each bytecode instruction, and
therefore all the C implementation code reached from each instruction, is
atomic from the point of view of a Python program.

In theory, this means an exact accounting requires an exact understanding of the
PVM bytecode implementation.  In practice, it means that operations on shared
variables of built-in data types (ints, lists, dicts, etc.) that "look atomic"
really are.

For example, the following operations are all atomic (L, L1, L2 are lists, D,
D1, D2 are dicts, x, y are objects, i, j are ints)::

   L.append(x)
   L1.extend(L2)
   x = L[i]
   x = L.pop()
   L1[i:j] = L2
   L.sort()
   x = y
   x.field = y
   D[x] = y
   D1.update(D2)
   D.keys()

These aren't::

   i = i+1
   L.append(L[-1])
   L[i] = L[j]
   D[x] = D[x] + 1

Operations that replace other objects may invoke those other objects'
:meth:`__del__` method when their reference count reaches zero, and that can
affect things.  This is especially true for the mass updates to dictionaries and
lists.  When in doubt, use a mutex!


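As an illustration of the "use a mutex" advice, here is a sketch that protects
a read-modify-write update with :class:`threading.Lock`.  The ``Counter`` class
and ``count_in_threads`` helper are names made up for this example.

```python
import threading


class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # "self.value += 1" is a read, an add and a store, so another
        # thread could run in between; the lock makes it one unit.
        with self._lock:
            self.value += 1


def count_in_threads(num_threads=4, increments=10_000):
    counter = Counter()

    def work():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place the final count is always
``num_threads * increments``.

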
Can't we get rid of the Global Interpreter Lock?
------------------------------------------------

.. XXX link to dbeazley's talk about GIL?

The :term:`global interpreter lock` (GIL) is often seen as a hindrance to Python's
deployment on high-end multiprocessor server machines, because a multi-threaded
Python program effectively only uses one CPU, due to the insistence that
(almost) all Python code can only run while the GIL is held.

Back in the days of Python 1.5, Greg Stein actually implemented a comprehensive
patch set (the "free threading" patches) that removed the GIL and replaced it
with fine-grained locking.  Adam Olsen recently did a similar experiment
in his `python-safethread <http://code.google.com/p/python-safethread/>`_
project.  Unfortunately, both experiments exhibited a sharp drop in single-thread
performance (at least 30% slower), due to the amount of fine-grained locking
necessary to compensate for the removal of the GIL.

This doesn't mean that you can't make good use of Python on multi-CPU machines!
You just have to be creative with dividing the work up between multiple
*processes* rather than multiple *threads*.  The
:class:`~concurrent.futures.ProcessPoolExecutor` class in the
:mod:`concurrent.futures` module provides an easy way of doing so; the
:mod:`multiprocessing` module provides a lower-level API in case you want
more control over dispatching of tasks.

Judicious use of C extensions will also help; if you use a C extension to
perform a time-consuming task, the extension can release the GIL while the
thread of execution is in the C code and allow other threads to get some work
done.  Some standard library modules such as :mod:`zlib` and :mod:`hashlib`
already do this.

It has been suggested that the GIL should be a per-interpreter-state lock rather
than truly global; interpreters then wouldn't be able to share objects.
Unfortunately, this isn't likely to happen either.  It would be a tremendous
amount of work, because many object implementations currently have global state.
For example, small integers and short strings are cached; these caches would
have to be moved to the interpreter state.  Other object types have their own
free list; these free lists would have to be moved to the interpreter state.
And so on.

And I doubt that it can even be done in finite time, because the same problem
exists for 3rd party extensions.  It is likely that 3rd party extensions are
being written at a faster rate than you can convert them to store all their
global state in the interpreter state.

And finally, once you have multiple interpreters not sharing any state, what
have you gained over running each interpreter in a separate process?


Input and Output
================

How do I delete a file? (And other file questions...)
-----------------------------------------------------

Use ``os.remove(filename)`` or ``os.unlink(filename)``; for documentation, see
the :mod:`os` module.  The two functions are identical; :func:`~os.unlink` is simply
the name of the Unix system call for this function.

To remove a directory, use :func:`os.rmdir`; use :func:`os.mkdir` to create one.
``os.makedirs(path)`` will create any intermediate directories in ``path`` that
don't exist. ``os.removedirs(path)`` will remove intermediate directories as
long as they're empty; if you want to delete an entire directory tree and its
contents, use :func:`shutil.rmtree`.

To rename a file, use ``os.rename(old_path, new_path)``.

To truncate a file, open it using ``f = open(filename, "rb+")``, and use
``f.truncate(offset)``; offset defaults to the current seek position.  There's
also ``os.ftruncate(fd, offset)`` for files opened with :func:`os.open`, where
``fd`` is the file descriptor (a small integer).

The :mod:`shutil` module also contains a number of functions to work on files
including :func:`~shutil.copyfile`, :func:`~shutil.copytree`, and
:func:`~shutil.rmtree`.


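The calls above can be tried out safely inside a throwaway directory.  The
following sketch does so; the function name and the file names are invented for
illustration, and everything happens under a directory created by
:func:`tempfile.mkdtemp`.

```python
import os
import shutil
import tempfile


def demo_file_operations():
    """Exercise the calls described above inside a scratch directory."""
    root = tempfile.mkdtemp()
    try:
        nested = os.path.join(root, "a", "b")
        os.makedirs(nested)                  # creates "a" and "a/b"

        old_path = os.path.join(nested, "old.bin")
        new_path = os.path.join(nested, "new.bin")
        with open(old_path, "wb") as f:
            f.write(b"0123456789")

        os.rename(old_path, new_path)
        with open(new_path, "rb+") as f:
            f.truncate(4)                    # keep only the first four bytes
        size = os.path.getsize(new_path)

        os.remove(new_path)
        os.rmdir(nested)                     # works because "b" is now empty
        remaining = os.listdir(os.path.join(root, "a"))
        return size, remaining
    finally:
        shutil.rmtree(root)                  # delete the whole scratch tree
```

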
How do I copy a file?
---------------------

The :mod:`shutil` module contains a :func:`~shutil.copyfile` function.  Note
that on MacOS 9 it doesn't copy the resource fork and Finder info.


How do I read (or write) binary data?
-------------------------------------

To read or write complex binary data formats, it's best to use the :mod:`struct`
module.  It allows you to take a :class:`bytes` object containing binary data
(usually numbers) and convert it to Python objects; and vice versa.

For example, the following code reads two 2-byte integers and one 4-byte integer
in big-endian format from a file::

   import struct

   with open(filename, "rb") as f:
       s = f.read(8)
       x, y, z = struct.unpack(">hhl", s)

The '>' in the format string forces big-endian data; the letter 'h' reads one
"short integer" (2 bytes), and 'l' reads one "long integer" (4 bytes) from the
bytes object.

For data that is more regular (e.g. a homogeneous list of ints or floats),
you can also use the :mod:`array` module.

   .. note::
      To read and write binary data, it is mandatory to open the file in
      binary mode (here, passing ``"rb"`` to :func:`open`).  If you use
      ``"r"`` instead (the default), the file will be opened in text mode
      and ``f.read()`` will return :class:`str` objects rather than
      :class:`bytes` objects.


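To show the "vice versa" direction as well, here is a sketch that writes a
record with :func:`struct.pack` and reads it back.  The ``RECORD`` constant and
the helper functions are names made up for this example.

```python
import os
import struct
import tempfile

RECORD = ">hhl"  # big-endian: two 2-byte shorts and one 4-byte long


def write_record(path, x, y, z):
    with open(path, "wb") as f:
        f.write(struct.pack(RECORD, x, y, z))


def read_record(path):
    with open(path, "rb") as f:
        data = f.read(struct.calcsize(RECORD))
    return struct.unpack(RECORD, data)


def round_trip(x, y, z):
    """Write one record to a temporary file and read it back."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_record(path, x, y, z)
        return read_record(path)
    finally:
        os.remove(path)
```

Using :func:`struct.calcsize` instead of a hard-coded ``8`` keeps the read size
in step with the format string if the record layout changes.

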
I can't seem to use os.read() on a pipe created with os.popen(); why?
---------------------------------------------------------------------

:func:`os.read` is a low-level function which takes a file descriptor, a small
integer representing the opened file.  :func:`os.popen` creates a high-level
file object, the same type returned by the built-in :func:`open` function.
Thus, to read n bytes from a pipe p created with :func:`os.popen`, you need to
use ``p.read(n)``.


.. XXX update to use subprocess. See the :ref:`subprocess-replacements` section.

   How do I run a subprocess with pipes connected to both input and output?
   ------------------------------------------------------------------------

   Use the :mod:`popen2` module.  For example::

      import popen2
      fromchild, tochild = popen2.popen2("command")
      tochild.write("input\n")
      tochild.flush()
      output = fromchild.readline()

   Warning: in general it is unwise to do this because you can easily cause a
   deadlock where your process is blocked waiting for output from the child
   while the child is blocked waiting for input from you.  This can be caused
   because the parent expects the child to output more text than it does, or it
   can be caused by data being stuck in stdio buffers due to lack of flushing.
   The Python parent can of course explicitly flush the data it sends to the
   child before it reads any output, but if the child is a naive C program it
   may have been written to never explicitly flush its output, even if it is
   interactive, since flushing is normally automatic.

   Note that a deadlock is also possible if you use :func:`popen3` to read
   stdout and stderr. If one of the two is too large for the internal buffer
   (increasing the buffer size does not help) and you ``read()`` the other one
   first, there is a deadlock, too.

   Note on a bug in popen2: unless your program calls ``wait()`` or
   ``waitpid()``, finished child processes are never removed, and eventually
   calls to popen2 will fail because of a limit on the number of child
   processes.  Calling :func:`os.waitpid` with the :data:`os.WNOHANG` option can
   prevent this; a good place to insert such a call would be before calling
   ``popen2`` again.

   In many cases, all you really need is to run some data through a command and
   get the result back.  Unless the amount of data is very large, the easiest
   way to do this is to write it to a temporary file and run the command with
   that temporary file as input.  The standard module :mod:`tempfile` exports a
   ``mktemp()`` function to generate unique temporary file names. ::

      import tempfile
      import os

      class Popen3:
          """
          This is a deadlock-safe version of popen that returns
          an object with errorlevel, out (a string) and err (a string).
          (capturestderr may not work under windows.)
          Example: print(Popen3('grep spam','\n\nhere spam\n\n').out)
          """
          def __init__(self,command,input=None,capturestderr=None):
              outfile=tempfile.mktemp()
              command="( %s ) > %s" % (command,outfile)
              if input:
                  infile=tempfile.mktemp()
                  open(infile,"w").write(input)
                  command=command+" <"+infile
              if capturestderr:
                  errfile=tempfile.mktemp()
                  command=command+" 2>"+errfile
              self.errorlevel=os.system(command) >> 8
              self.out=open(outfile,"r").read()
              os.remove(outfile)
              if input:
                  os.remove(infile)
              if capturestderr:
                  self.err=open(errfile,"r").read()
                  os.remove(errfile)

   Note that many interactive programs (e.g. vi) don't work well with pipes
   substituted for standard input and output.  You will have to use pseudo ttys
   ("ptys") instead of pipes. Or you can use a Python interface to Don Libes'
   "expect" library.  A Python extension that interfaces to expect is called
   "expy" and available from http://expectpy.sourceforge.net.  A pure Python
   solution that works like expect is `pexpect
   <http://pypi.python.org/pypi/pexpect/>`_.


How do I access the serial (RS232) port?
----------------------------------------

For Win32, POSIX (Linux, BSD, etc.), Jython:

   http://pyserial.sourceforge.net

For Unix, see a Usenet post by Mitch Chapman:

   http://groups.google.com/groups?selm=34A04430.CF9@ohioee.com


Why doesn't closing sys.stdout (stdin, stderr) really close it?
---------------------------------------------------------------

Python :term:`file objects <file object>` are a high-level layer of
abstraction on low-level C file descriptors.

For most file objects you create in Python via the built-in :func:`open`
function, ``f.close()`` marks the Python file object as being closed from
Python's point of view, and also arranges to close the underlying C file
descriptor.  This also happens automatically in ``f``'s destructor, when
``f`` becomes garbage.

But stdin, stdout and stderr are treated specially by Python, because of the
special status also given to them by C.  Running ``sys.stdout.close()`` marks
the Python-level file object as being closed, but does *not* close the
associated C file descriptor.

To close the underlying C file descriptor for one of these three, you should
first be sure that's what you really want to do (e.g., you may confuse
extension modules trying to do I/O).  If it is, use :func:`os.close`::

   os.close(sys.stdin.fileno())
   os.close(sys.stdout.fileno())
   os.close(sys.stderr.fileno())

Or you can use the numeric constants 0, 1 and 2, respectively.


Network/Internet Programming
============================

What WWW tools are there for Python?
------------------------------------

See the chapters titled :ref:`internet` and :ref:`netdata` in the Library
Reference Manual.  Python has many modules that will help you build server-side
and client-side web systems.

.. XXX check if wiki page is still up to date

A summary of available frameworks is maintained by Paul Boddie at
http://wiki.python.org/moin/WebProgramming .

Cameron Laird maintains a useful set of pages about Python web technologies at
http://phaseit.net/claird/comp.lang.python/web_python.


How can I mimic CGI form submission (METHOD=POST)?
--------------------------------------------------

I would like to retrieve web pages that are the result of POSTing a form. Is
there existing code that would let me do this easily?

Yes. Here's a simple example that uses urllib.request::

   #!/usr/local/bin/python

   import urllib.request

   ### build the query string; POST data must be bytes, not str
   qs = b"First=Josephine&MI=Q&Last=Public"

   ### connect and send the server a path
   req = urllib.request.urlopen('http://www.some-server.out-there'
                                '/cgi-bin/some-cgi-script', data=qs)
   msg, hdrs = req.read(), req.info()

Note that in general for percent-encoded POST operations, query strings must be
quoted using :func:`urllib.parse.urlencode`.  For example, to send
``name=Guy Steele, Jr.``::

   >>> import urllib.parse
   >>> urllib.parse.urlencode({'name': 'Guy Steele, Jr.'})
   'name=Guy+Steele%2C+Jr.'

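The two pieces can be combined: encode the form fields with
:func:`urllib.parse.urlencode`, convert the result to bytes, and attach it to a
request.  The following is only a minimal sketch.  The helper name
``build_post_request`` is hypothetical, the URL is a placeholder, and nothing
is sent over the network until the request object is handed to
``urllib.request.urlopen()``.

```python
import urllib.parse
import urllib.request


def build_post_request(url, fields):
    """Return a POST Request for url carrying fields as form data.

    Hypothetical helper, for illustration only.
    """
    # urlencode() percent-encodes the values; the request body must be bytes.
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_post_request(
    "http://www.example.com/cgi-bin/some-cgi-script",   # placeholder URL
    {"First": "Josephine", "MI": "Q", "Last": "Public"},
)
print(request.get_method())   # a Request that carries data uses POST
print(request.data)
```

Passing ``request`` to ``urllib.request.urlopen()`` would perform the actual
POST and return the server's response.
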
.. seealso:: :ref:`urllib-howto` for extensive examples.


What module should I use to help with generating HTML?
------------------------------------------------------

.. XXX add modern template languages

There are many different modules available:

* HTMLgen is a class library of objects corresponding to all the HTML 3.2 markup
  tags. It's used when you are writing in Python and wish to synthesize HTML
  pages for generating a web site or for CGI forms, etc.

* DocumentTemplate and Zope Page Templates are two different systems that are
  part of Zope.

* Quixote's PTL uses Python syntax to assemble strings of text.

Consult the `Web Programming wiki pages
<http://wiki.python.org/moin/WebProgramming>`_ for more links.


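For very small jobs the standard library may be enough on its own.  The sketch
below does not use any of the packages listed above; it only assumes
:class:`string.Template` and :func:`html.escape`, and the function name
``render_page`` is invented for the example.

```python
import html
from string import Template

PAGE = Template("<html><head><title>$title</title></head>"
                "<body><h1>$title</h1><p>$body</p></body></html>")


def render_page(title, body):
    """Fill the template, escaping the values so they are safe as HTML text."""
    return PAGE.substitute(title=html.escape(title), body=html.escape(body))


page = render_page("Q & A", "Use <b> for bold text.")
print(page)
```

Escaping every substituted value is the important part; a full template
package usually handles that step automatically.
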
How do I send mail from a Python script?
----------------------------------------

Use the standard library module :mod:`smtplib`.

Here's a very simple interactive mail sender that uses it.  This method will
work on any host that supports an SMTP listener. ::

   import sys, smtplib

   fromaddr = input("From: ")
   toaddrs  = input("To: ").split(',')
   print("Enter message, end with ^D:")
   msg = ''
   while True:
       line = sys.stdin.readline()
       if not line:
           break
       msg += line

   # The actual mail send
   server = smtplib.SMTP('localhost')
   server.sendmail(fromaddr, toaddrs, msg)
   server.quit()

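The script above sends the typed text exactly as entered, so any headers have
to be typed by hand.  As a hedged sketch, the
:class:`email.message.EmailMessage` class can build the headers instead.  The
helper name ``build_message`` and the ``example.com`` addresses are
placeholders, and the final send is left commented out because it needs a
reachable SMTP server (assumed here to be on ``localhost``).

```python
import smtplib
from email.message import EmailMessage


def build_message(fromaddr, toaddrs, subject, body):
    """Return an EmailMessage with From, To and Subject headers set."""
    msg = EmailMessage()
    msg["From"] = fromaddr
    msg["To"] = ", ".join(toaddrs)
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


message = build_message("sender@example.com",
                        ["receiver@example.com", "other@example.com"],
                        "test", "Some text\n")
print(message)

# Sending requires an SMTP listener, assumed here to be on localhost:
# with smtplib.SMTP("localhost") as server:
#     server.send_message(message)
```
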
A Unix-only alternative uses sendmail.  The location of the sendmail program
varies between systems; sometimes it is ``/usr/lib/sendmail``, sometimes
``/usr/sbin/sendmail``.  The sendmail manual page will help you out.  Here's
some sample code::

   import os

   SENDMAIL = "/usr/sbin/sendmail"  # sendmail location
   p = os.popen("%s -t -i" % SENDMAIL, "w")
   p.write("To: receiver@example.com\n")
   p.write("Subject: test\n")
   p.write("\n")  # blank line separating headers from body
   p.write("Some text\n")
   p.write("some more text\n")
   sts = p.close()  # None on success, otherwise the exit status
   if sts is not None:
       print("Sendmail exit status", sts)


How do I avoid blocking in the connect() method of a socket?
------------------------------------------------------------

The :mod:`select` module is commonly used to help with asynchronous I/O on
sockets.

To prevent the TCP connect from blocking, you can set the socket to non-blocking
mode.  Then when you do the ``connect()``, you will either connect immediately
(unlikely) or get an exception that contains the error number as ``.errno``.
``errno.EINPROGRESS`` indicates that the connection is in progress, but hasn't
finished yet.  Different OSes will return different values, so you're going to
have to check what's returned on your system.

You can use the ``connect_ex()`` method to avoid creating an exception.  It will
just return the errno value.  To poll, you can call ``connect_ex()`` again later
-- ``0`` or ``errno.EISCONN`` indicate that you're connected -- or you can pass this
socket to :func:`select.select` to check if it's writable.

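The steps described above can be sketched as follows.  This is an illustration
under stated assumptions, not a definitive implementation: the function name
``nonblocking_connect`` and the five-second default timeout are invented, the
final ``SO_ERROR`` check is a common POSIX idiom that the text above does not
mention, and behaviour on other platforms may differ.

```python
import errno
import os
import select
import socket


def nonblocking_connect(address, timeout=5.0):
    """Connect to address without blocking in connect(); return the socket.

    Hypothetical helper; raises OSError on failure or timeout.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    # connect_ex() returns an errno value instead of raising an exception.
    err = sock.connect_ex(address)
    in_progress = (errno.EINPROGRESS, errno.EWOULDBLOCK, errno.EALREADY)
    if err not in (0, errno.EISCONN) + in_progress:
        sock.close()
        raise OSError(err, os.strerror(err))
    # The socket becomes writable once the connection attempt has finished.
    _, writable, _ = select.select([], [sock], [], timeout)
    if not writable:
        sock.close()
        raise TimeoutError("connection attempt timed out")
    # SO_ERROR tells whether the attempt succeeded (0) or failed (an errno).
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err:
        sock.close()
        raise OSError(err, os.strerror(err))
    return sock


# Demonstration against a throwaway listener on the local machine.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
listener.listen(1)
client = nonblocking_connect(listener.getsockname())
print("connected to", client.getpeername())
client.close()
listener.close()
```

In a real program the ``select()`` call would normally be part of a larger
event loop rather than a single wait.
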
.. note::
   The :mod:`asyncore` module presents a framework-like approach to the problem
   of writing non-blocking networking code.
   The third-party `Twisted <http://twistedmatrix.com/>`_ library is
   a popular and feature-rich alternative.


Databases
=========

Are there any interfaces to database packages in Python?
--------------------------------------------------------

Yes.

Interfaces to disk-based hashes such as :mod:`DBM <dbm.ndbm>` and :mod:`GDBM
<dbm.gnu>` are included with standard Python.  There is also the
:mod:`sqlite3` module, which provides a lightweight disk-based relational
database.

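A minimal :mod:`sqlite3` session might look like the sketch below.  The table
and column names are invented for the example, and ``":memory:"`` is used so
that nothing is written to disk; passing a filename instead gives the
disk-based database described above.

```python
import sqlite3

# ":memory:" keeps the database in RAM; pass a filename for a disk-based one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE faq (id INTEGER PRIMARY KEY, question TEXT)")
conn.executemany(
    "INSERT INTO faq (question) VALUES (?)",
    [("How do I send mail?",), ("How do I generate random numbers?",)],
)
conn.commit()

rows = conn.execute("SELECT id, question FROM faq ORDER BY id").fetchall()
for row in rows:
    print(row)
conn.close()
```
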
Support for most relational databases is available.  See the
`DatabaseProgramming wiki page
<http://wiki.python.org/moin/DatabaseProgramming>`_ for details.


How do you implement persistent objects in Python?
--------------------------------------------------

The :mod:`pickle` library module solves this in a very general way (though you
still can't store things like open files, sockets or windows), and the
:mod:`shelve` library module uses pickle and (g)dbm to create persistent
mappings containing arbitrary Python objects.

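As a rough sketch of the :mod:`shelve` approach, the example below stores a
dictionary under a string key, reopens the shelf, and reads the object back.
The shelf name ``settings`` and the stored values are made up, and a temporary
directory is used only so the example leaves no files behind.

```python
import os
import shelve
import tempfile

# A scratch directory keeps the example from leaving files behind.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "settings")   # hypothetical shelf name

    # Store an arbitrary picklable object under a string key.
    with shelve.open(path) as shelf:
        shelf["window"] = {"size": (800, 600), "title": "Main"}

    # Reopen the shelf later (here: immediately) and read the object back.
    with shelve.open(path) as shelf:
        restored = shelf["window"]

print(restored)
```
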
A more awkward way of doing things is to use pickle's little sister, marshal.
The :mod:`marshal` module provides very fast ways to store noncircular basic
Python types to files and strings, and back again.  Although marshal does not do
fancy things like store instances or handle shared references properly, it does
run extremely fast.  For example, loading a half megabyte of data may take less
than a third of a second.  This often beats doing something more complex and
general such as using gdbm with pickle/shelve.


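The trade-off described above can be seen in a short sketch: basic types
round-trip through :mod:`marshal`, while an instance of a user-defined class
is rejected.  The sample data and the ``Point`` class are invented for the
example.

```python
import marshal

data = {"name": "spam", "sizes": [1, 2, 3], "ratio": 0.5, "flag": None}

encoded = marshal.dumps(data)      # a bytes object
decoded = marshal.loads(encoded)   # an equal copy built from basic types
print(decoded == data)


class Point:
    pass


try:
    marshal.dumps(Point())         # instances are not supported
    instance_ok = True
except ValueError as exc:
    instance_ok = False
    print("cannot marshal:", exc)
```
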
If my program crashes with a bsddb (or anydbm) database open, it gets corrupted. How come?
------------------------------------------------------------------------------------------

.. XXX move this FAQ entry elsewhere?

.. note::

   The bsddb module is now available as a standalone package `pybsddb
   <http://www.jcea.es/programacion/pybsddb.htm>`_.

Databases opened for write access with the bsddb module (and often by the anydbm
module, since it will preferentially use bsddb) must explicitly be closed using
the ``.close()`` method of the database.  The underlying library caches database
contents which need to be converted to on-disk form and written.

If you have initialized a new bsddb database but not written anything to it
before the program crashes, you will often wind up with a zero-length file and
encounter an exception the next time the file is opened.


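One way to make sure ``.close()`` always runs is a ``try``/``finally`` block
around every use of the database.  The sketch below uses :mod:`dbm.dumb` from
the standard library purely as a stand-in, on the assumption that bsddb is not
installed; the open call for pybsddb would differ, and the database name is
hypothetical.

```python
import dbm.dumb
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example")   # hypothetical database name

    db = dbm.dumb.open(path, "c")            # "c": create the file if needed
    try:
        db[b"key"] = b"value"
    finally:
        db.close()                           # write cached data even on errors

    db = dbm.dumb.open(path, "r")
    try:
        value = db[b"key"]
    finally:
        db.close()

print(value)
```
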
I tried to open Berkeley DB file, but bsddb produces bsddb.error: (22, 'Invalid argument'). Help! How can I restore my data?
----------------------------------------------------------------------------------------------------------------------------

.. XXX move this FAQ entry elsewhere?

.. note::

   The bsddb module is now available as a standalone package `pybsddb
   <http://www.jcea.es/programacion/pybsddb.htm>`_.

Don't panic! Your data is probably intact. The most frequent cause for the error
is that you tried to open an earlier Berkeley DB file with a later version of
the Berkeley DB library.

Many Linux systems now have all three versions of Berkeley DB available.  If you
are migrating from version 1 to a newer version, use db_dump185 to dump a plain
text version of the database.  If you are migrating from version 2 to version 3,
use db2_dump to create a plain text version of the database.  In either case,
use db_load to create a new native database for the latest version installed on
your computer.  If you have version 3 of Berkeley DB installed, you should be
able to use db2_load to create a native version 2 database.

You should move away from Berkeley DB version 1 files because the hash file code
contains known bugs that can corrupt your data.


Mathematics and Numerics
========================

How do I generate random numbers in Python?
-------------------------------------------

The standard module :mod:`random` implements a random number generator.  Usage
is simple::

   import random
   random.random()

This returns a random floating point number in the range [0, 1).

There are also many other specialized generators in this module, such as:

* ``randrange(a, b)`` chooses an integer in the range [a, b).
* ``uniform(a, b)`` chooses a floating point number in the range [a, b).
* ``normalvariate(mean, sdev)`` samples the normal (Gaussian) distribution.

Some higher-level functions operate on sequences directly, such as:

* ``choice(S)`` chooses a random element from a given sequence.
* ``shuffle(L)`` shuffles a list in-place, i.e. permutes it randomly.

There's also a ``Random`` class you can instantiate to create multiple
independent random number generators.

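The functions listed above are also available as methods of ``Random``
instances.  The sketch below uses arbitrary seeds and made-up variable names
to show two generators that produce identical streams, followed by one
generator driving each of the functions mentioned.

```python
import random

# Two generators created with the same (arbitrary) seed produce the same values.
gen_a = random.Random(12345)
gen_b = random.Random(12345)
same_stream = ([gen_a.random() for _ in range(3)]
               == [gen_b.random() for _ in range(3)])

rng = random.Random(2024)            # independent of the module-level generator
die = rng.randrange(1, 7)            # integer with 1 <= die < 7
temperature = rng.uniform(15.0, 25.0)
height = rng.normalvariate(170.0, 10.0)

deck = list(range(10))
rng.shuffle(deck)                    # permutes the list in place
card = rng.choice(deck)              # one element of the sequence

print(same_stream, die, temperature, height, deck, card)
```
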