mirror of
				https://github.com/python/cpython.git
				synced 2025-11-04 03:44:55 +00:00 
			
		
		
		
	
		
			
				
	
	
		
			569 lines
		
	
	
	
		
			25 KiB
		
	
	
	
		
			TeX
		
	
	
	
	
	
			
		
		
	
	
			569 lines
		
	
	
	
		
			25 KiB
		
	
	
	
		
			TeX
		
	
	
	
	
	
\chapter{Introduction \label{intro}}
 | 
						|
 | 
						|
 | 
						|
The Application Programmer's Interface to Python gives C and
 | 
						|
\Cpp{} programmers access to the Python interpreter at a variety of
 | 
						|
levels.  The API is equally usable from \Cpp, but for brevity it is
 | 
						|
generally referred to as the Python/C API.  There are two
 | 
						|
fundamentally different reasons for using the Python/C API.  The first
 | 
						|
reason is to write \emph{extension modules} for specific purposes;
 | 
						|
these are C modules that extend the Python interpreter.  This is
 | 
						|
probably the most common use.  The second reason is to use Python as a
 | 
						|
component in a larger application; this technique is generally
 | 
						|
referred to as \dfn{embedding} Python in an application.
 | 
						|
 | 
						|
Writing an extension module is a relatively well-understood process, 
 | 
						|
where a ``cookbook'' approach works well.  There are several tools 
 | 
						|
that automate the process to some extent.  While people have embedded 
 | 
						|
Python in other applications since its early existence, the process of 
 | 
						|
embedding Python is less straightforward than writing an extension.  
 | 
						|
 | 
						|
Many API functions are useful independent of whether you're embedding 
 | 
						|
or extending Python; moreover, most applications that embed Python 
 | 
						|
will need to provide a custom extension as well, so it's probably a 
 | 
						|
good idea to become familiar with writing an extension before 
 | 
						|
attempting to embed Python in a real application.
 | 
						|
 | 
						|
 | 
						|
\section{Include Files \label{includes}}
 | 
						|
 | 
						|
All function, type and macro definitions needed to use the Python/C
 | 
						|
API are included in your code by the following line:
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
#include "Python.h"
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
This implies inclusion of the following standard headers:
 | 
						|
\code{<stdio.h>}, \code{<string.h>}, \code{<errno.h>},
 | 
						|
\code{<limits.h>}, and \code{<stdlib.h>} (if available).
 | 
						|
 | 
						|
\begin{notice}[warning]
 | 
						|
  Since Python may define some pre-processor definitions which affect
 | 
						|
  the standard headers on some systems, you \emph{must} include
 | 
						|
  \file{Python.h} before any standard headers are included.
 | 
						|
\end{notice}
 | 
						|
 | 
						|
All user visible names defined by Python.h (except those defined by
 | 
						|
the included standard headers) have one of the prefixes \samp{Py} or
 | 
						|
\samp{_Py}.  Names beginning with \samp{_Py} are for internal use by
 | 
						|
the Python implementation and should not be used by extension writers.
 | 
						|
Structure member names do not have a reserved prefix.
 | 
						|
 | 
						|
\strong{Important:} user code should never define names that begin
 | 
						|
with \samp{Py} or \samp{_Py}.  This confuses the reader, and
 | 
						|
jeopardizes the portability of the user code to future Python
 | 
						|
versions, which may define additional names beginning with one of
 | 
						|
these prefixes.
 | 
						|
 | 
						|
The header files are typically installed with Python.  On \UNIX, these 
 | 
						|
are located in the directories
 | 
						|
\file{\envvar{prefix}/include/python\var{version}/} and
 | 
						|
\file{\envvar{exec_prefix}/include/python\var{version}/}, where
 | 
						|
\envvar{prefix} and \envvar{exec_prefix} are defined by the
 | 
						|
corresponding parameters to Python's \program{configure} script and
 | 
						|
\var{version} is \code{sys.version[:3]}.  On Windows, the headers are
 | 
						|
installed in \file{\envvar{prefix}/include}, where \envvar{prefix} is
 | 
						|
the installation directory specified to the installer.
 | 
						|
 | 
						|
To include the headers, place both directories (if different) on your
 | 
						|
compiler's search path for includes.  Do \emph{not} place the parent
 | 
						|
directories on the search path and then use
 | 
						|
\samp{\#include <python\shortversion/Python.h>}; this will break on
 | 
						|
multi-platform builds since the platform independent headers under
 | 
						|
\envvar{prefix} include the platform specific headers from
 | 
						|
\envvar{exec_prefix}.
 | 
						|
 | 
						|
\Cpp{} users should note that though the API is defined entirely using
 | 
						|
C, the header files do properly declare the entry points to be
 | 
						|
\code{extern "C"}, so there is no need to do anything special to use
 | 
						|
the API from \Cpp.
 | 
						|
 | 
						|
 | 
						|
\section{Objects, Types and Reference Counts \label{objects}}
 | 
						|
 | 
						|
Most Python/C API functions have one or more arguments as well as a
 | 
						|
return value of type \ctype{PyObject*}.  This type is a pointer
 | 
						|
to an opaque data type representing an arbitrary Python
 | 
						|
object.  Since all Python object types are treated the same way by the
 | 
						|
Python language in most situations (e.g., assignments, scope rules,
 | 
						|
and argument passing), it is only fitting that they should be
 | 
						|
represented by a single C type.  Almost all Python objects live on the
 | 
						|
heap: you never declare an automatic or static variable of type
 | 
						|
\ctype{PyObject}, only pointer variables of type \ctype{PyObject*} can 
 | 
						|
be declared.  The sole exception are the type objects\obindex{type};
 | 
						|
since these must never be deallocated, they are typically static
 | 
						|
\ctype{PyTypeObject} objects.
 | 
						|
 | 
						|
All Python objects (even Python integers) have a \dfn{type} and a
 | 
						|
\dfn{reference count}.  An object's type determines what kind of object 
 | 
						|
it is (e.g., an integer, a list, or a user-defined function; there are 
 | 
						|
many more as explained in the \citetitle[../ref/ref.html]{Python
 | 
						|
Reference Manual}).  For each of the well-known types there is a macro
 | 
						|
to check whether an object is of that type; for instance,
 | 
						|
\samp{PyList_Check(\var{a})} is true if (and only if) the object
 | 
						|
pointed to by \var{a} is a Python list.
 | 
						|
 | 
						|
 | 
						|
\subsection{Reference Counts \label{refcounts}}
 | 
						|
 | 
						|
The reference count is important because today's computers have a 
 | 
						|
finite (and often severely limited) memory size; it counts how many 
 | 
						|
different places there are that have a reference to an object.  Such a 
 | 
						|
place could be another object, or a global (or static) C variable, or 
 | 
						|
a local variable in some C function.  When an object's reference count 
 | 
						|
becomes zero, the object is deallocated.  If it contains references to 
 | 
						|
other objects, their reference count is decremented.  Those other 
 | 
						|
objects may be deallocated in turn, if this decrement makes their 
 | 
						|
reference count become zero, and so on.  (There's an obvious problem 
 | 
						|
with objects that reference each other here; for now, the solution is 
 | 
						|
``don't do that.'')
 | 
						|
 | 
						|
Reference counts are always manipulated explicitly.  The normal way is 
 | 
						|
to use the macro \cfunction{Py_INCREF()}\ttindex{Py_INCREF()} to
 | 
						|
increment an object's reference count by one, and
 | 
						|
\cfunction{Py_DECREF()}\ttindex{Py_DECREF()} to decrement it by  
 | 
						|
one.  The \cfunction{Py_DECREF()} macro is considerably more complex
 | 
						|
than the incref one, since it must check whether the reference count
 | 
						|
becomes zero and then cause the object's deallocator to be called.
 | 
						|
The deallocator is a function pointer contained in the object's type
 | 
						|
structure.  The type-specific deallocator takes care of decrementing
 | 
						|
the reference counts for other objects contained in the object if this
 | 
						|
is a compound object type, such as a list, as well as performing any
 | 
						|
additional finalization that's needed.  There's no chance that the
 | 
						|
reference count can overflow; at least as many bits are used to hold
 | 
						|
the reference count as there are distinct memory locations in virtual
 | 
						|
memory (assuming \code{sizeof(long) >= sizeof(char*)}).  Thus, the
 | 
						|
reference count increment is a simple operation.
 | 
						|
 | 
						|
It is not necessary to increment an object's reference count for every 
 | 
						|
local variable that contains a pointer to an object.  In theory, the 
 | 
						|
object's reference count goes up by one when the variable is made to 
 | 
						|
point to it and it goes down by one when the variable goes out of 
 | 
						|
scope.  However, these two cancel each other out, so at the end the 
 | 
						|
reference count hasn't changed.  The only real reason to use the 
 | 
						|
reference count is to prevent the object from being deallocated as 
 | 
						|
long as our variable is pointing to it.  If we know that there is at 
 | 
						|
least one other reference to the object that lives at least as long as 
 | 
						|
our variable, there is no need to increment the reference count 
 | 
						|
temporarily.  An important situation where this arises is in objects 
 | 
						|
that are passed as arguments to C functions in an extension module 
 | 
						|
that are called from Python; the call mechanism guarantees to hold a 
 | 
						|
reference to every argument for the duration of the call.
 | 
						|
 | 
						|
However, a common pitfall is to extract an object from a list and
 | 
						|
hold on to it for a while without incrementing its reference count.
 | 
						|
Some other operation might conceivably remove the object from the
 | 
						|
list, decrementing its reference count and possible deallocating it.
 | 
						|
The real danger is that innocent-looking operations may invoke
 | 
						|
arbitrary Python code which could do this; there is a code path which
 | 
						|
allows control to flow back to the user from a \cfunction{Py_DECREF()},
 | 
						|
so almost any operation is potentially dangerous.
 | 
						|
 | 
						|
A safe approach is to always use the generic operations (functions 
 | 
						|
whose name begins with \samp{PyObject_}, \samp{PyNumber_},
 | 
						|
\samp{PySequence_} or \samp{PyMapping_}).  These operations always
 | 
						|
increment the reference count of the object they return.  This leaves
 | 
						|
the caller with the responsibility to call
 | 
						|
\cfunction{Py_DECREF()} when they are done with the result; this soon
 | 
						|
becomes second nature.
 | 
						|
 | 
						|
 | 
						|
\subsubsection{Reference Count Details \label{refcountDetails}}
 | 
						|
 | 
						|
The reference count behavior of functions in the Python/C API is best 
 | 
						|
explained in terms of \emph{ownership of references}.  Ownership
 | 
						|
pertains to references, never to objects (objects are not owned: they
 | 
						|
are always shared).  "Owning a reference" means being responsible for
 | 
						|
calling Py_DECREF on it when the reference is no longer needed. 
 | 
						|
Ownership can also be transferred, meaning that the code that receives
 | 
						|
ownership of the reference then becomes responsible for eventually
 | 
						|
decref'ing it by calling \cfunction{Py_DECREF()} or
 | 
						|
\cfunction{Py_XDECREF()} when it's no longer needed --or passing on
 | 
						|
this responsibility (usually to its caller).
 | 
						|
When a function passes ownership of a reference on to its caller, the
 | 
						|
caller is said to receive a \emph{new} reference.  When no ownership
 | 
						|
is transferred, the caller is said to \emph{borrow} the reference.
 | 
						|
Nothing needs to be done for a borrowed reference.
 | 
						|
 | 
						|
Conversely, when a calling function passes it a reference to an 
 | 
						|
object, there are two possibilities: the function \emph{steals} a 
 | 
						|
reference to the object, or it does not.  Few functions steal 
 | 
						|
references; the two notable exceptions are
 | 
						|
\cfunction{PyList_SetItem()}\ttindex{PyList_SetItem()} and
 | 
						|
\cfunction{PyTuple_SetItem()}\ttindex{PyTuple_SetItem()}, which 
 | 
						|
steal a reference to the item (but not to the tuple or list into which
 | 
						|
the item is put!).  These functions were designed to steal a reference
 | 
						|
because of a common idiom for populating a tuple or list with newly
 | 
						|
created objects; for example, the code to create the tuple \code{(1,
 | 
						|
2, "three")} could look like this (forgetting about error handling for
 | 
						|
the moment; a better way to code this is shown below):
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
PyObject *t;
 | 
						|
 | 
						|
t = PyTuple_New(3);
 | 
						|
PyTuple_SetItem(t, 0, PyInt_FromLong(1L));
 | 
						|
PyTuple_SetItem(t, 1, PyInt_FromLong(2L));
 | 
						|
PyTuple_SetItem(t, 2, PyString_FromString("three"));
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
Incidentally, \cfunction{PyTuple_SetItem()} is the \emph{only} way to
 | 
						|
set tuple items; \cfunction{PySequence_SetItem()} and
 | 
						|
\cfunction{PyObject_SetItem()} refuse to do this since tuples are an
 | 
						|
immutable data type.  You should only use
 | 
						|
\cfunction{PyTuple_SetItem()} for tuples that you are creating
 | 
						|
yourself.
 | 
						|
 | 
						|
Equivalent code for populating a list can be written using 
 | 
						|
\cfunction{PyList_New()} and \cfunction{PyList_SetItem()}.  Such code
 | 
						|
can also use \cfunction{PySequence_SetItem()}; this illustrates the
 | 
						|
difference between the two (the extra \cfunction{Py_DECREF()} calls):
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
PyObject *l, *x;
 | 
						|
 | 
						|
l = PyList_New(3);
 | 
						|
x = PyInt_FromLong(1L);
 | 
						|
PySequence_SetItem(l, 0, x); Py_DECREF(x);
 | 
						|
x = PyInt_FromLong(2L);
 | 
						|
PySequence_SetItem(l, 1, x); Py_DECREF(x);
 | 
						|
x = PyString_FromString("three");
 | 
						|
PySequence_SetItem(l, 2, x); Py_DECREF(x);
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
You might find it strange that the ``recommended'' approach takes more
 | 
						|
code.  However, in practice, you will rarely use these ways of
 | 
						|
creating and populating a tuple or list.  There's a generic function,
 | 
						|
\cfunction{Py_BuildValue()}, that can create most common objects from
 | 
						|
C values, directed by a \dfn{format string}.  For example, the
 | 
						|
above two blocks of code could be replaced by the following (which
 | 
						|
also takes care of the error checking):
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
PyObject *t, *l;
 | 
						|
 | 
						|
t = Py_BuildValue("(iis)", 1, 2, "three");
 | 
						|
l = Py_BuildValue("[iis]", 1, 2, "three");
 | 
						|
\end{verbatim}
 | 
						|
 | 
						|
It is much more common to use \cfunction{PyObject_SetItem()} and
 | 
						|
friends with items whose references you are only borrowing, like
 | 
						|
arguments that were passed in to the function you are writing.  In
 | 
						|
that case, their behaviour regarding reference counts is much saner,
 | 
						|
since you don't have to increment a reference count so you can give a
 | 
						|
reference away (``have it be stolen'').  For example, this function
 | 
						|
sets all items of a list (actually, any mutable sequence) to a given
 | 
						|
item:
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
int
 | 
						|
set_all(PyObject *target, PyObject *item)
 | 
						|
{
 | 
						|
    int i, n;
 | 
						|
 | 
						|
    n = PyObject_Length(target);
 | 
						|
    if (n < 0)
 | 
						|
        return -1;
 | 
						|
    for (i = 0; i < n; i++) {
 | 
						|
        if (PyObject_SetItem(target, i, item) < 0)
 | 
						|
            return -1;
 | 
						|
    }
 | 
						|
    return 0;
 | 
						|
}
 | 
						|
\end{verbatim}
 | 
						|
\ttindex{set_all()}
 | 
						|
 | 
						|
The situation is slightly different for function return values.  
 | 
						|
While passing a reference to most functions does not change your 
 | 
						|
ownership responsibilities for that reference, many functions that 
 | 
						|
return a referece to an object give you ownership of the reference.
 | 
						|
The reason is simple: in many cases, the returned object is created 
 | 
						|
on the fly, and the reference you get is the only reference to the 
 | 
						|
object.  Therefore, the generic functions that return object 
 | 
						|
references, like \cfunction{PyObject_GetItem()} and 
 | 
						|
\cfunction{PySequence_GetItem()}, always return a new reference (the
 | 
						|
caller becomes the owner of the reference).
 | 
						|
 | 
						|
It is important to realize that whether you own a reference returned 
 | 
						|
by a function depends on which function you call only --- \emph{the
 | 
						|
plumage} (the type of the object passed as an
 | 
						|
argument to the function) \emph{doesn't enter into it!}  Thus, if you 
 | 
						|
extract an item from a list using \cfunction{PyList_GetItem()}, you
 | 
						|
don't own the reference --- but if you obtain the same item from the
 | 
						|
same list using \cfunction{PySequence_GetItem()} (which happens to
 | 
						|
take exactly the same arguments), you do own a reference to the
 | 
						|
returned object.
 | 
						|
 | 
						|
Here is an example of how you could write a function that computes the
 | 
						|
sum of the items in a list of integers; once using 
 | 
						|
\cfunction{PyList_GetItem()}\ttindex{PyList_GetItem()}, and once using
 | 
						|
\cfunction{PySequence_GetItem()}\ttindex{PySequence_GetItem()}.
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
long
 | 
						|
sum_list(PyObject *list)
 | 
						|
{
 | 
						|
    int i, n;
 | 
						|
    long total = 0;
 | 
						|
    PyObject *item;
 | 
						|
 | 
						|
    n = PyList_Size(list);
 | 
						|
    if (n < 0)
 | 
						|
        return -1; /* Not a list */
 | 
						|
    for (i = 0; i < n; i++) {
 | 
						|
        item = PyList_GetItem(list, i); /* Can't fail */
 | 
						|
        if (!PyInt_Check(item)) continue; /* Skip non-integers */
 | 
						|
        total += PyInt_AsLong(item);
 | 
						|
    }
 | 
						|
    return total;
 | 
						|
}
 | 
						|
\end{verbatim}
 | 
						|
\ttindex{sum_list()}
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
long
 | 
						|
sum_sequence(PyObject *sequence)
 | 
						|
{
 | 
						|
    int i, n;
 | 
						|
    long total = 0;
 | 
						|
    PyObject *item;
 | 
						|
    n = PySequence_Length(sequence);
 | 
						|
    if (n < 0)
 | 
						|
        return -1; /* Has no length */
 | 
						|
    for (i = 0; i < n; i++) {
 | 
						|
        item = PySequence_GetItem(sequence, i);
 | 
						|
        if (item == NULL)
 | 
						|
            return -1; /* Not a sequence, or other failure */
 | 
						|
        if (PyInt_Check(item))
 | 
						|
            total += PyInt_AsLong(item);
 | 
						|
        Py_DECREF(item); /* Discard reference ownership */
 | 
						|
    }
 | 
						|
    return total;
 | 
						|
}
 | 
						|
\end{verbatim}
 | 
						|
\ttindex{sum_sequence()}
 | 
						|
 | 
						|
 | 
						|
\subsection{Types \label{types}}
 | 
						|
 | 
						|
There are few other data types that play a significant role in 
 | 
						|
the Python/C API; most are simple C types such as \ctype{int}, 
 | 
						|
\ctype{long}, \ctype{double} and \ctype{char*}.  A few structure types 
 | 
						|
are used to describe static tables used to list the functions exported 
 | 
						|
by a module or the data attributes of a new object type, and another
 | 
						|
is used to describe the value of a complex number.  These will 
 | 
						|
be discussed together with the functions that use them.
 | 
						|
 | 
						|
 | 
						|
\section{Exceptions \label{exceptions}}
 | 
						|
 | 
						|
The Python programmer only needs to deal with exceptions if specific 
 | 
						|
error handling is required; unhandled exceptions are automatically 
 | 
						|
propagated to the caller, then to the caller's caller, and so on, until
 | 
						|
they reach the top-level interpreter, where they are reported to the 
 | 
						|
user accompanied by a stack traceback.
 | 
						|
 | 
						|
For C programmers, however, error checking always has to be explicit.  
 | 
						|
All functions in the Python/C API can raise exceptions, unless an 
 | 
						|
explicit claim is made otherwise in a function's documentation.  In 
 | 
						|
general, when a function encounters an error, it sets an exception, 
 | 
						|
discards any object references that it owns, and returns an 
 | 
						|
error indicator --- usually \NULL{} or \code{-1}.  A few functions 
 | 
						|
return a Boolean true/false result, with false indicating an error.
 | 
						|
Very few functions return no explicit error indicator or have an 
 | 
						|
ambiguous return value, and require explicit testing for errors with 
 | 
						|
\cfunction{PyErr_Occurred()}\ttindex{PyErr_Occurred()}.
 | 
						|
 | 
						|
Exception state is maintained in per-thread storage (this is 
 | 
						|
equivalent to using global storage in an unthreaded application).  A 
 | 
						|
thread can be in one of two states: an exception has occurred, or not.
 | 
						|
The function \cfunction{PyErr_Occurred()} can be used to check for
 | 
						|
this: it returns a borrowed reference to the exception type object
 | 
						|
when an exception has occurred, and \NULL{} otherwise.  There are a
 | 
						|
number of functions to set the exception state:
 | 
						|
\cfunction{PyErr_SetString()}\ttindex{PyErr_SetString()} is the most
 | 
						|
common (though not the most general) function to set the exception
 | 
						|
state, and \cfunction{PyErr_Clear()}\ttindex{PyErr_Clear()} clears the
 | 
						|
exception state.
 | 
						|
 | 
						|
The full exception state consists of three objects (all of which can 
 | 
						|
be \NULL): the exception type, the corresponding exception 
 | 
						|
value, and the traceback.  These have the same meanings as the Python
 | 
						|
\withsubitem{(in module sys)}{
 | 
						|
  \ttindex{exc_type}\ttindex{exc_value}\ttindex{exc_traceback}}
 | 
						|
objects \code{sys.exc_type}, \code{sys.exc_value}, and
 | 
						|
\code{sys.exc_traceback}; however, they are not the same: the Python
 | 
						|
objects represent the last exception being handled by a Python 
 | 
						|
\keyword{try} \ldots\ \keyword{except} statement, while the C level
 | 
						|
exception state only exists while an exception is being passed on
 | 
						|
between C functions until it reaches the Python bytecode interpreter's 
 | 
						|
main loop, which takes care of transferring it to \code{sys.exc_type}
 | 
						|
and friends.
 | 
						|
 | 
						|
Note that starting with Python 1.5, the preferred, thread-safe way to 
 | 
						|
access the exception state from Python code is to call the function
 | 
						|
\withsubitem{(in module sys)}{\ttindex{exc_info()}}
 | 
						|
\function{sys.exc_info()}, which returns the per-thread exception state 
 | 
						|
for Python code.  Also, the semantics of both ways to access the 
 | 
						|
exception state have changed so that a function which catches an 
 | 
						|
exception will save and restore its thread's exception state so as to 
 | 
						|
preserve the exception state of its caller.  This prevents common bugs 
 | 
						|
in exception handling code caused by an innocent-looking function 
 | 
						|
overwriting the exception being handled; it also reduces the often 
 | 
						|
unwanted lifetime extension for objects that are referenced by the 
 | 
						|
stack frames in the traceback.
 | 
						|
 | 
						|
As a general principle, a function that calls another function to 
 | 
						|
perform some task should check whether the called function raised an 
 | 
						|
exception, and if so, pass the exception state on to its caller.  It 
 | 
						|
should discard any object references that it owns, and return an 
 | 
						|
error indicator, but it should \emph{not} set another exception ---
 | 
						|
that would overwrite the exception that was just raised, and lose
 | 
						|
important information about the exact cause of the error.
 | 
						|
 | 
						|
A simple example of detecting exceptions and passing them on is shown
 | 
						|
in the \cfunction{sum_sequence()}\ttindex{sum_sequence()} example
 | 
						|
above.  It so happens that that example doesn't need to clean up any
 | 
						|
owned references when it detects an error.  The following example
 | 
						|
function shows some error cleanup.  First, to remind you why you like
 | 
						|
Python, we show the equivalent Python code:
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
def incr_item(dict, key):
 | 
						|
    try:
 | 
						|
        item = dict[key]
 | 
						|
    except KeyError:
 | 
						|
        item = 0
 | 
						|
    dict[key] = item + 1
 | 
						|
\end{verbatim}
 | 
						|
\ttindex{incr_item()}
 | 
						|
 | 
						|
Here is the corresponding C code, in all its glory:
 | 
						|
 | 
						|
\begin{verbatim}
 | 
						|
int
 | 
						|
incr_item(PyObject *dict, PyObject *key)
 | 
						|
{
 | 
						|
    /* Objects all initialized to NULL for Py_XDECREF */
 | 
						|
    PyObject *item = NULL, *const_one = NULL, *incremented_item = NULL;
 | 
						|
    int rv = -1; /* Return value initialized to -1 (failure) */
 | 
						|
 | 
						|
    item = PyObject_GetItem(dict, key);
 | 
						|
    if (item == NULL) {
 | 
						|
        /* Handle KeyError only: */
 | 
						|
        if (!PyErr_ExceptionMatches(PyExc_KeyError))
 | 
						|
            goto error;
 | 
						|
 | 
						|
        /* Clear the error and use zero: */
 | 
						|
        PyErr_Clear();
 | 
						|
        item = PyInt_FromLong(0L);
 | 
						|
        if (item == NULL)
 | 
						|
            goto error;
 | 
						|
    }
 | 
						|
    const_one = PyInt_FromLong(1L);
 | 
						|
    if (const_one == NULL)
 | 
						|
        goto error;
 | 
						|
 | 
						|
    incremented_item = PyNumber_Add(item, const_one);
 | 
						|
    if (incremented_item == NULL)
 | 
						|
        goto error;
 | 
						|
 | 
						|
    if (PyObject_SetItem(dict, key, incremented_item) < 0)
 | 
						|
        goto error;
 | 
						|
    rv = 0; /* Success */
 | 
						|
    /* Continue with cleanup code */
 | 
						|
 | 
						|
 error:
 | 
						|
    /* Cleanup code, shared by success and failure path */
 | 
						|
 | 
						|
    /* Use Py_XDECREF() to ignore NULL references */
 | 
						|
    Py_XDECREF(item);
 | 
						|
    Py_XDECREF(const_one);
 | 
						|
    Py_XDECREF(incremented_item);
 | 
						|
 | 
						|
    return rv; /* -1 for error, 0 for success */
 | 
						|
}
 | 
						|
\end{verbatim}
 | 
						|
\ttindex{incr_item()}
 | 
						|
 | 
						|
This example represents an endorsed use of the \keyword{goto} statement 
 | 
						|
in C!  It illustrates the use of
 | 
						|
\cfunction{PyErr_ExceptionMatches()}\ttindex{PyErr_ExceptionMatches()} and
 | 
						|
\cfunction{PyErr_Clear()}\ttindex{PyErr_Clear()} to
 | 
						|
handle specific exceptions, and the use of
 | 
						|
\cfunction{Py_XDECREF()}\ttindex{Py_XDECREF()} to
 | 
						|
dispose of owned references that may be \NULL{} (note the
 | 
						|
\character{X} in the name; \cfunction{Py_DECREF()} would crash when
 | 
						|
confronted with a \NULL{} reference).  It is important that the
 | 
						|
variables used to hold owned references are initialized to \NULL{} for
 | 
						|
this to work; likewise, the proposed return value is initialized to
 | 
						|
\code{-1} (failure) and only set to success after the final call made
 | 
						|
is successful.
 | 
						|
 | 
						|
 | 
						|
\section{Embedding Python \label{embedding}}
 | 
						|
 | 
						|
The one important task that only embedders (as opposed to extension
 | 
						|
writers) of the Python interpreter have to worry about is the
 | 
						|
initialization, and possibly the finalization, of the Python
 | 
						|
interpreter.  Most functionality of the interpreter can only be used
 | 
						|
after the interpreter has been initialized.
 | 
						|
 | 
						|
The basic initialization function is
 | 
						|
\cfunction{Py_Initialize()}\ttindex{Py_Initialize()}.
 | 
						|
This initializes the table of loaded modules, and creates the
 | 
						|
fundamental modules \module{__builtin__}\refbimodindex{__builtin__},
 | 
						|
\module{__main__}\refbimodindex{__main__}, \module{sys}\refbimodindex{sys},
 | 
						|
and \module{exceptions}.\refbimodindex{exceptions}  It also initializes
 | 
						|
the module search path (\code{sys.path}).%
 | 
						|
\indexiii{module}{search}{path}
 | 
						|
\withsubitem{(in module sys)}{\ttindex{path}}
 | 
						|
 | 
						|
\cfunction{Py_Initialize()} does not set the ``script argument list'' 
 | 
						|
(\code{sys.argv}).  If this variable is needed by Python code that 
 | 
						|
will be executed later, it must be set explicitly with a call to 
 | 
						|
\code{PySys_SetArgv(\var{argc},
 | 
						|
\var{argv})}\ttindex{PySys_SetArgv()} subsequent to the call to
 | 
						|
\cfunction{Py_Initialize()}.
 | 
						|
 | 
						|
On most systems (in particular, on \UNIX{} and Windows, although the
 | 
						|
details are slightly different),
 | 
						|
\cfunction{Py_Initialize()} calculates the module search path based
 | 
						|
upon its best guess for the location of the standard Python
 | 
						|
interpreter executable, assuming that the Python library is found in a
 | 
						|
fixed location relative to the Python interpreter executable.  In
 | 
						|
particular, it looks for a directory named
 | 
						|
\file{lib/python\shortversion} relative to the parent directory where
 | 
						|
the executable named \file{python} is found on the shell command
 | 
						|
search path (the environment variable \envvar{PATH}).
 | 
						|
 | 
						|
For instance, if the Python executable is found in
 | 
						|
\file{/usr/local/bin/python}, it will assume that the libraries are in
 | 
						|
\file{/usr/local/lib/python\shortversion}.  (In fact, this particular path
 | 
						|
is also the ``fallback'' location, used when no executable file named
 | 
						|
\file{python} is found along \envvar{PATH}.)  The user can override
 | 
						|
this behavior by setting the environment variable \envvar{PYTHONHOME},
 | 
						|
or insert additional directories in front of the standard path by
 | 
						|
setting \envvar{PYTHONPATH}.
 | 
						|
 | 
						|
The embedding application can steer the search by calling 
 | 
						|
\code{Py_SetProgramName(\var{file})}\ttindex{Py_SetProgramName()} \emph{before} calling 
 | 
						|
\cfunction{Py_Initialize()}.  Note that \envvar{PYTHONHOME} still
 | 
						|
overrides this and \envvar{PYTHONPATH} is still inserted in front of
 | 
						|
the standard path.  An application that requires total control has to
 | 
						|
provide its own implementation of
 | 
						|
\cfunction{Py_GetPath()}\ttindex{Py_GetPath()},
 | 
						|
\cfunction{Py_GetPrefix()}\ttindex{Py_GetPrefix()},
 | 
						|
\cfunction{Py_GetExecPrefix()}\ttindex{Py_GetExecPrefix()}, and
 | 
						|
\cfunction{Py_GetProgramFullPath()}\ttindex{Py_GetProgramFullPath()} (all
 | 
						|
defined in \file{Modules/getpath.c}).
 | 
						|
 | 
						|
Sometimes, it is desirable to ``uninitialize'' Python.  For instance, 
 | 
						|
the application may want to start over (make another call to 
 | 
						|
\cfunction{Py_Initialize()}) or the application is simply done with its 
 | 
						|
use of Python and wants to free all memory allocated by Python.  This
 | 
						|
can be accomplished by calling \cfunction{Py_Finalize()}.  The function
 | 
						|
\cfunction{Py_IsInitialized()}\ttindex{Py_IsInitialized()} returns
 | 
						|
true if Python is currently in the initialized state.  More
 | 
						|
information about these functions is given in a later chapter.
 |