mirror of
https://github.com/python/cpython.git
synced 2025-10-24 07:26:11 +00:00

points out there are really two types of continued headers defined in this RFC (i.e. "encoded" parameters with the form "name*0*=" and unencoded parameters with the form "name*0="), but we were were handling them both the same way and that isn't correct. This patch should be much more RFC compliant in that only encoded params are %-decoded and the charset/language information is only extract if there are any encoded params in the segments. If there are no encoded params then the RFC says that there will be no charset/language parts. Note however that this will change the return value for Message.get_param() in some cases. For example, whereas before if you had all unencoded param continuations you would have still gotten a 3-tuple back from this method (with charset and language == None), you will now get just a string. I don't believe this is a backward incompatible change though because the documentation for this method already indicates that either return value is possible and that you must do an isinstance(val, tuple) check to discriminate between the two. (Yeah that API kind of sucks but we can't change /that/ without breaking code.) Test cases, some documentation updates, and a NEWS item accompany this patch.
402 lines
16 KiB
TeX
402 lines
16 KiB
TeX
% Copyright (C) 2001-2006 Python Software Foundation
|
|
% Author: barry@python.org (Barry Warsaw)
|
|
|
|
\section{\module{email} ---
|
|
An email and MIME handling package}
|
|
|
|
\declaremodule{standard}{email}
|
|
\modulesynopsis{Package supporting the parsing, manipulating, and
|
|
generating email messages, including MIME documents.}
|
|
\moduleauthor{Barry A. Warsaw}{barry@python.org}
|
|
\sectionauthor{Barry A. Warsaw}{barry@python.org}
|
|
|
|
\versionadded{2.2}
|
|
|
|
The \module{email} package is a library for managing email messages,
|
|
including MIME and other \rfc{2822}-based message documents. It
|
|
subsumes most of the functionality in several older standard modules
|
|
such as \refmodule{rfc822}, \refmodule{mimetools},
|
|
\refmodule{multifile}, and other non-standard packages such as
|
|
\module{mimecntl}. It is specifically \emph{not} designed to do any
|
|
sending of email messages to SMTP (\rfc{2821}), NNTP, or other servers; those
|
|
are functions of modules such as \refmodule{smtplib} and \refmodule{nntplib}.
|
|
The \module{email} package attempts to be as RFC-compliant as possible,
|
|
supporting in addition to \rfc{2822}, such MIME-related RFCs as
|
|
\rfc{2045}, \rfc{2046}, \rfc{2047}, and \rfc{2231}.
|
|
|
|
The primary distinguishing feature of the \module{email} package is
|
|
that it splits the parsing and generating of email messages from the
|
|
internal \emph{object model} representation of email. Applications
|
|
using the \module{email} package deal primarily with objects; you can
|
|
add sub-objects to messages, remove sub-objects from messages,
|
|
completely re-arrange the contents, etc. There is a separate parser
|
|
and a separate generator which handles the transformation from flat
|
|
text to the object model, and then back to flat text again. There
|
|
are also handy subclasses for some common MIME object types, and a few
|
|
miscellaneous utilities that help with such common tasks as extracting
|
|
and parsing message field values, creating RFC-compliant dates, etc.
|
|
|
|
The following sections describe the functionality of the
|
|
\module{email} package. The ordering follows a progression that
|
|
should be common in applications: an email message is read as flat
|
|
text from a file or other source, the text is parsed to produce the
|
|
object structure of the email message, this structure is manipulated,
|
|
and finally, the object tree is rendered back into flat text.
|
|
|
|
It is perfectly feasible to create the object structure out of whole
|
|
cloth --- i.e. completely from scratch. From there, a similar
|
|
progression can be taken as above.
|
|
|
|
Also included are detailed specifications of all the classes and
|
|
modules that the \module{email} package provides, the exception
|
|
classes you might encounter while using the \module{email} package,
|
|
some auxiliary utilities, and a few examples. For users of the older
|
|
\module{mimelib} package, or previous versions of the \module{email}
|
|
package, a section on differences and porting is provided.
|
|
|
|
\begin{seealso}
|
|
\seemodule{smtplib}{SMTP protocol client}
|
|
\seemodule{nntplib}{NNTP protocol client}
|
|
\end{seealso}
|
|
|
|
\subsection{Representing an email message}
|
|
\input{emailmessage}
|
|
|
|
\subsection{Parsing email messages}
|
|
\input{emailparser}
|
|
|
|
\subsection{Generating MIME documents}
|
|
\input{emailgenerator}
|
|
|
|
\subsection{Creating email and MIME objects from scratch}
|
|
\input{emailmimebase}
|
|
|
|
\subsection{Internationalized headers}
|
|
\input{emailheaders}
|
|
|
|
\subsection{Representing character sets}
|
|
\input{emailcharsets}
|
|
|
|
\subsection{Encoders}
|
|
\input{emailencoders}
|
|
|
|
\subsection{Exception and Defect classes}
|
|
\input{emailexc}
|
|
|
|
\subsection{Miscellaneous utilities}
|
|
\input{emailutil}
|
|
|
|
\subsection{Iterators}
|
|
\input{emailiter}
|
|
|
|
\subsection{Package History\label{email-pkg-history}}
|
|
|
|
This table describes the release history of the email package, corresponding
|
|
to the version of Python that the package was released with. For purposes of
|
|
this document, when you see a note about change or added versions, these refer
|
|
to the Python version the change was made it, \emph{not} the email package
|
|
version. This table also describes the Python compatibility of each version
|
|
of the package.
|
|
|
|
\begin{tableiii}{l|l|l}{constant}{email version}{distributed with}{compatible with}
|
|
\lineiii{1.x}{Python 2.2.0 to Python 2.2.1}{\emph{no longer supported}}
|
|
\lineiii{2.5}{Python 2.2.2+ and Python 2.3}{Python 2.1 to 2.5}
|
|
\lineiii{3.0}{Python 2.4}{Python 2.3 to 2.5}
|
|
\lineiii{4.0}{Python 2.5}{Python 2.3 to 2.5}
|
|
\end{tableiii}
|
|
|
|
Here are the major differences between \module{email} version 4 and version 3:
|
|
|
|
\begin{itemize}
|
|
\item All modules have been renamed according to \pep{8} standards. For
|
|
example, the version 3 module \module{email.Message} was renamed to
|
|
\module{email.message} in version 4.
|
|
|
|
\item A new subpackage \module{email.mime} was added and all the version 3
|
|
\module{email.MIME*} modules were renamed and situated into the
|
|
\module{email.mime} subpackage. For example, the version 3 module
|
|
\module{email.MIMEText} was renamed to \module{email.mime.text}.
|
|
|
|
\emph{Note that the version 3 names will continue to work until Python
|
|
2.6}.
|
|
|
|
\item The \module{email.mime.application} module was added, which contains the
|
|
\class{MIMEApplication} class.
|
|
|
|
\item Methods that were deprecated in version 3 have been removed. These
|
|
include \method{Generator.__call__()}, \method{Message.get_type()},
|
|
\method{Message.get_main_type()}, \method{Message.get_subtype()}.
|
|
|
|
\item Fixes have been added for \rfc{2231} support which can change some of
|
|
the return types for \function{Message.get_param()} and friends. Under
|
|
some circumstances, values which used to return a 3-tuple now return
|
|
simple strings (specifically, if all extended parameter segments were
|
|
unencoded, there is no language and charset designation expected, so the
|
|
return type is now a simple string). Also, \%-decoding used to be done
|
|
for both encoded and unencoded segments; this decoding is now done only
|
|
for encoded segments.
|
|
\end{itemize}
|
|
|
|
Here are the major differences between \module{email} version 3 and version 2:
|
|
|
|
\begin{itemize}
|
|
\item The \class{FeedParser} class was introduced, and the \class{Parser}
|
|
class was implemented in terms of the \class{FeedParser}. All parsing
|
|
therefore is non-strict, and parsing will make a best effort never to
|
|
raise an exception. Problems found while parsing messages are stored in
|
|
the message's \var{defect} attribute.
|
|
|
|
\item All aspects of the API which raised \exception{DeprecationWarning}s in
|
|
version 2 have been removed. These include the \var{_encoder} argument
|
|
to the \class{MIMEText} constructor, the \method{Message.add_payload()}
|
|
method, the \function{Utils.dump_address_pair()} function, and the
|
|
functions \function{Utils.decode()} and \function{Utils.encode()}.
|
|
|
|
\item New \exception{DeprecationWarning}s have been added to:
|
|
\method{Generator.__call__()}, \method{Message.get_type()},
|
|
\method{Message.get_main_type()}, \method{Message.get_subtype()}, and
|
|
the \var{strict} argument to the \class{Parser} class. These are
|
|
expected to be removed in future versions.
|
|
|
|
\item Support for Pythons earlier than 2.3 has been removed.
|
|
\end{itemize}
|
|
|
|
Here are the differences between \module{email} version 2 and version 1:
|
|
|
|
\begin{itemize}
|
|
\item The \module{email.Header} and \module{email.Charset} modules
|
|
have been added.
|
|
|
|
\item The pickle format for \class{Message} instances has changed.
|
|
Since this was never (and still isn't) formally defined, this
|
|
isn't considered a backward incompatibility. However if your
|
|
application pickles and unpickles \class{Message} instances, be
|
|
aware that in \module{email} version 2, \class{Message}
|
|
instances now have private variables \var{_charset} and
|
|
\var{_default_type}.
|
|
|
|
\item Several methods in the \class{Message} class have been
|
|
deprecated, or their signatures changed. Also, many new methods
|
|
have been added. See the documentation for the \class{Message}
|
|
class for details. The changes should be completely backward
|
|
compatible.
|
|
|
|
\item The object structure has changed in the face of
|
|
\mimetype{message/rfc822} content types. In \module{email}
|
|
version 1, such a type would be represented by a scalar payload,
|
|
i.e. the container message's \method{is_multipart()} returned
|
|
false, \method{get_payload()} was not a list object, but a single
|
|
\class{Message} instance.
|
|
|
|
This structure was inconsistent with the rest of the package, so
|
|
the object representation for \mimetype{message/rfc822} content
|
|
types was changed. In \module{email} version 2, the container
|
|
\emph{does} return \code{True} from \method{is_multipart()}, and
|
|
\method{get_payload()} returns a list containing a single
|
|
\class{Message} item.
|
|
|
|
Note that this is one place that backward compatibility could
|
|
not be completely maintained. However, if you're already
|
|
testing the return type of \method{get_payload()}, you should be
|
|
fine. You just need to make sure your code doesn't do a
|
|
\method{set_payload()} with a \class{Message} instance on a
|
|
container with a content type of \mimetype{message/rfc822}.
|
|
|
|
\item The \class{Parser} constructor's \var{strict} argument was
|
|
added, and its \method{parse()} and \method{parsestr()} methods
|
|
grew a \var{headersonly} argument. The \var{strict} flag was
|
|
also added to functions \function{email.message_from_file()}
|
|
and \function{email.message_from_string()}.
|
|
|
|
\item \method{Generator.__call__()} is deprecated; use
|
|
\method{Generator.flatten()} instead. The \class{Generator}
|
|
class has also grown the \method{clone()} method.
|
|
|
|
\item The \class{DecodedGenerator} class in the
|
|
\module{email.Generator} module was added.
|
|
|
|
\item The intermediate base classes \class{MIMENonMultipart} and
|
|
\class{MIMEMultipart} have been added, and interposed in the
|
|
class hierarchy for most of the other MIME-related derived
|
|
classes.
|
|
|
|
\item The \var{_encoder} argument to the \class{MIMEText} constructor
|
|
has been deprecated. Encoding now happens implicitly based
|
|
on the \var{_charset} argument.
|
|
|
|
\item The following functions in the \module{email.Utils} module have
|
|
been deprecated: \function{dump_address_pairs()},
|
|
\function{decode()}, and \function{encode()}. The following
|
|
functions have been added to the module:
|
|
\function{make_msgid()}, \function{decode_rfc2231()},
|
|
\function{encode_rfc2231()}, and \function{decode_params()}.
|
|
|
|
\item The non-public function \function{email.Iterators._structure()}
|
|
was added.
|
|
\end{itemize}
|
|
|
|
\subsection{Differences from \module{mimelib}}
|
|
|
|
The \module{email} package was originally prototyped as a separate
|
|
library called
|
|
\ulink{\module{mimelib}}{http://mimelib.sf.net/}.
|
|
Changes have been made so that
|
|
method names are more consistent, and some methods or modules have
|
|
either been added or removed. The semantics of some of the methods
|
|
have also changed. For the most part, any functionality available in
|
|
\module{mimelib} is still available in the \refmodule{email} package,
|
|
albeit often in a different way. Backward compatibility between
|
|
the \module{mimelib} package and the \module{email} package was not a
|
|
priority.
|
|
|
|
Here is a brief description of the differences between the
|
|
\module{mimelib} and the \refmodule{email} packages, along with hints on
|
|
how to port your applications.
|
|
|
|
Of course, the most visible difference between the two packages is
|
|
that the package name has been changed to \refmodule{email}. In
|
|
addition, the top-level package has the following differences:
|
|
|
|
\begin{itemize}
|
|
\item \function{messageFromString()} has been renamed to
|
|
\function{message_from_string()}.
|
|
|
|
\item \function{messageFromFile()} has been renamed to
|
|
\function{message_from_file()}.
|
|
|
|
\end{itemize}
|
|
|
|
The \class{Message} class has the following differences:
|
|
|
|
\begin{itemize}
|
|
\item The method \method{asString()} was renamed to \method{as_string()}.
|
|
|
|
\item The method \method{ismultipart()} was renamed to
|
|
\method{is_multipart()}.
|
|
|
|
\item The \method{get_payload()} method has grown a \var{decode}
|
|
optional argument.
|
|
|
|
\item The method \method{getall()} was renamed to \method{get_all()}.
|
|
|
|
\item The method \method{addheader()} was renamed to \method{add_header()}.
|
|
|
|
\item The method \method{gettype()} was renamed to \method{get_type()}.
|
|
|
|
\item The method \method{getmaintype()} was renamed to
|
|
\method{get_main_type()}.
|
|
|
|
\item The method \method{getsubtype()} was renamed to
|
|
\method{get_subtype()}.
|
|
|
|
\item The method \method{getparams()} was renamed to
|
|
\method{get_params()}.
|
|
Also, whereas \method{getparams()} returned a list of strings,
|
|
\method{get_params()} returns a list of 2-tuples, effectively
|
|
the key/value pairs of the parameters, split on the \character{=}
|
|
sign.
|
|
|
|
\item The method \method{getparam()} was renamed to \method{get_param()}.
|
|
|
|
\item The method \method{getcharsets()} was renamed to
|
|
\method{get_charsets()}.
|
|
|
|
\item The method \method{getfilename()} was renamed to
|
|
\method{get_filename()}.
|
|
|
|
\item The method \method{getboundary()} was renamed to
|
|
\method{get_boundary()}.
|
|
|
|
\item The method \method{setboundary()} was renamed to
|
|
\method{set_boundary()}.
|
|
|
|
\item The method \method{getdecodedpayload()} was removed. To get
|
|
similar functionality, pass the value 1 to the \var{decode} flag
|
|
of the {get_payload()} method.
|
|
|
|
\item The method \method{getpayloadastext()} was removed. Similar
|
|
functionality
|
|
is supported by the \class{DecodedGenerator} class in the
|
|
\refmodule{email.generator} module.
|
|
|
|
\item The method \method{getbodyastext()} was removed. You can get
|
|
similar functionality by creating an iterator with
|
|
\function{typed_subpart_iterator()} in the
|
|
\refmodule{email.iterators} module.
|
|
\end{itemize}
|
|
|
|
The \class{Parser} class has no differences in its public interface.
|
|
It does have some additional smarts to recognize
|
|
\mimetype{message/delivery-status} type messages, which it represents as
|
|
a \class{Message} instance containing separate \class{Message}
|
|
subparts for each header block in the delivery status
|
|
notification\footnote{Delivery Status Notifications (DSN) are defined
|
|
in \rfc{1894}.}.
|
|
|
|
The \class{Generator} class has no differences in its public
|
|
interface. There is a new class in the \refmodule{email.generator}
|
|
module though, called \class{DecodedGenerator} which provides most of
|
|
the functionality previously available in the
|
|
\method{Message.getpayloadastext()} method.
|
|
|
|
The following modules and classes have been changed:
|
|
|
|
\begin{itemize}
|
|
\item The \class{MIMEBase} class constructor arguments \var{_major}
|
|
and \var{_minor} have changed to \var{_maintype} and
|
|
\var{_subtype} respectively.
|
|
|
|
\item The \code{Image} class/module has been renamed to
|
|
\code{MIMEImage}. The \var{_minor} argument has been renamed to
|
|
\var{_subtype}.
|
|
|
|
\item The \code{Text} class/module has been renamed to
|
|
\code{MIMEText}. The \var{_minor} argument has been renamed to
|
|
\var{_subtype}.
|
|
|
|
\item The \code{MessageRFC822} class/module has been renamed to
|
|
\code{MIMEMessage}. Note that an earlier version of
|
|
\module{mimelib} called this class/module \code{RFC822}, but
|
|
that clashed with the Python standard library module
|
|
\refmodule{rfc822} on some case-insensitive file systems.
|
|
|
|
Also, the \class{MIMEMessage} class now represents any kind of
|
|
MIME message with main type \mimetype{message}. It takes an
|
|
optional argument \var{_subtype} which is used to set the MIME
|
|
subtype. \var{_subtype} defaults to \mimetype{rfc822}.
|
|
\end{itemize}
|
|
|
|
\module{mimelib} provided some utility functions in its
|
|
\module{address} and \module{date} modules. All of these functions
|
|
have been moved to the \refmodule{email.utils} module.
|
|
|
|
The \code{MsgReader} class/module has been removed. Its functionality
|
|
is most closely supported in the \function{body_line_iterator()}
|
|
function in the \refmodule{email.iterators} module.
|
|
|
|
\subsection{Examples}
|
|
|
|
Here are a few examples of how to use the \module{email} package to
|
|
read, write, and send simple email messages, as well as more complex
|
|
MIME messages.
|
|
|
|
First, let's see how to create and send a simple text message:
|
|
|
|
\verbatiminput{email-simple.py}
|
|
|
|
Here's an example of how to send a MIME message containing a bunch of
|
|
family pictures that may be residing in a directory:
|
|
|
|
\verbatiminput{email-mime.py}
|
|
|
|
Here's an example of how to send the entire contents of a directory as
|
|
an email message:
|
|
\footnote{Thanks to Matthew Dixon Cowles for the original inspiration
|
|
and examples.}
|
|
|
|
\verbatiminput{email-dir.py}
|
|
|
|
And finally, here's an example of how to unpack a MIME message like
|
|
the one above, into a directory of files:
|
|
|
|
\verbatiminput{email-unpack.py}
|