Proofread and spell checked, all except the Examples section (which
I'll do next).
parent
cc3a6df506
commit
5db478fa29
9 changed files with 350 additions and 357 deletions
|
@ -39,14 +39,13 @@ and parsing message field values, creating RFC-compliant dates, etc.
|
|||
The following sections describe the functionality of the
|
||||
\module{email} package. The ordering follows a progression that
|
||||
should be common in applications: an email message is read as flat
|
||||
text from a file or other source, the text is parsed to produce an
|
||||
object model representation of the email message, this model is
|
||||
manipulated, and finally the model is rendered back into
|
||||
flat text.
|
||||
text from a file or other source, the text is parsed to produce the
|
||||
object structure of the email message, this structure is manipulated,
|
||||
and finally rendered back into flat text.
|
||||
|
||||
It is perfectly feasible to create the object model out of whole cloth
|
||||
--- i.e. completely from scratch. From there, a similar progression
|
||||
can be taken as above.
|
||||
It is perfectly feasible to create the object structure out of whole
|
||||
cloth --- i.e. completely from scratch. From there, a similar
|
||||
progression can be taken as above.
|
||||
|
||||
Also included are detailed specifications of all the classes and
|
||||
modules that the \module{email} package provides, the exception
|
||||
|
@ -71,9 +70,12 @@ package, a section on differences and porting is provided.
|
|||
\subsection{Creating email and MIME objects from scratch}
|
||||
\input{emailmimebase}
|
||||
|
||||
\subsection{Headers, Character sets, and Internationalization}
|
||||
\subsection{Internationalized headers}
|
||||
\input{emailheaders}
|
||||
|
||||
\subsection{Representing character sets}
|
||||
\input{emailcharsets}
|
||||
|
||||
\subsection{Encoders}
|
||||
\input{emailencoders}
|
||||
|
||||
|
@ -92,7 +94,7 @@ Version 1 of the \module{email} package was bundled with Python
|
|||
releases up to Python 2.2.1. Version 2 was developed for the Python
|
||||
2.3 release, and backported to Python 2.2.2. It was also available as
|
||||
a separate distutils based package. \module{email} version 2 is
|
||||
almost entirely backwards compatible with version 1, with the
|
||||
almost entirely backward compatible with version 1, with the
|
||||
following differences:
|
||||
|
||||
\begin{itemize}
|
||||
|
@ -100,31 +102,31 @@ following differences:
|
|||
have been added.
|
||||
\item The pickle format for \class{Message} instances has changed.
|
||||
Since this was never (and still isn't) formally defined, this
|
||||
isn't considered a backwards incompatibility. However if your
|
||||
isn't considered a backward incompatibility. However if your
|
||||
application pickles and unpickles \class{Message} instances, be
|
||||
aware that in \module{email} version 2, \class{Message}
|
||||
instances now have private variables \var{_charset} and
|
||||
\var{_default_type}.
|
||||
\item Several methods in the \class{Message} class have been
|
||||
deprecated, or their signatures changes. Also, many new methods
|
||||
deprecated, or their signatures changed. Also, many new methods
|
||||
have been added. See the documentation for the \class{Message}
|
||||
class for deatils. The changes should be completely backwards
|
||||
class for details. The changes should be completely backward
|
||||
compatible.
|
||||
\item The object structure has changed in the face of
|
||||
\mimetype{message/rfc822} content types. In \module{email}
|
||||
version 1, such a type would be represented by a scalar payload,
|
||||
i.e. the container message's \method{is_multipart()} returned
|
||||
false, \method{get_payload()} was not a list object, and was
|
||||
actually a \class{Message} instance.
|
||||
false, \method{get_payload()} was not a list object, but a single
|
||||
\class{Message} instance.
|
||||
|
||||
This structure was inconsistent with the rest of the package, so
|
||||
the object representation for \mimetype{message/rfc822} content
|
||||
types was changed. In module{email} version 2, the container
|
||||
types was changed. In \module{email} version 2, the container
|
||||
\emph{does} return \code{True} from \method{is_multipart()}, and
|
||||
\method{get_payload()} returns a list containing a single
|
||||
\class{Message} item.
|
||||
|
||||
Note that this is one place that backwards compatibility could
|
||||
Note that this is one place that backward compatibility could
|
||||
not be completely maintained. However, if you're already
|
||||
testing the return type of \method{get_payload()}, you should be
|
||||
fine. You just need to make sure your code doesn't do a
|
||||
|
@ -142,7 +144,7 @@ following differences:
|
|||
\module{email.Generator} module was added.
|
||||
\item The intermediate base classes \class{MIMENonMultipart} and
|
||||
\class{MIMEMultipart} have been added, and interposed in the
|
||||
class heirarchy for most of the other MIME-related derived
|
||||
class hierarchy for most of the other MIME-related derived
|
||||
classes.
|
||||
\item The \var{_encoder} argument to the \class{MIMEText} constructor
|
||||
has been deprecated. Encoding now happens implicitly based
|
||||
|
@ -167,7 +169,9 @@ method names are more consistent, and some methods or modules have
|
|||
either been added or removed. The semantics of some of the methods
|
||||
have also changed. For the most part, any functionality available in
|
||||
\module{mimelib} is still available in the \refmodule{email} package,
|
||||
albeit often in a different way.
|
||||
albeit often in a different way. Backward compatibility between
|
||||
the \module{mimelib} package and the \module{email} package was not a
|
||||
priority.
|
||||
|
||||
Here is a brief description of the differences between the
|
||||
\module{mimelib} and the \refmodule{email} packages, along with hints on
|
||||
|
|
240
Doc/lib/emailcharsets.tex
Normal file
|
@ -0,0 +1,240 @@
|
|||
\declaremodule{standard}{email.Charset}
|
||||
\modulesynopsis{Character Sets}
|
||||
|
||||
This module provides a class \class{Charset} for representing
|
||||
character sets and character set conversions in email messages, as
|
||||
well as a character set registry and several convenience methods for
|
||||
manipulating this registry. Instances of \class{Charset} are used in
|
||||
several other modules within the \module{email} package.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
|
||||
\begin{classdesc}{Charset}{\optional{input_charset}}
|
||||
Map character sets to their email properties.
|
||||
|
||||
This class provides information about the requirements imposed on
|
||||
email for a specific character set. It also provides convenience
|
||||
routines for converting between character sets, given the availability
|
||||
of the applicable codecs. Given a character set, it will do its best
|
||||
to provide information on how to use that character set in an email
|
||||
message in an RFC-compliant way.
|
||||
|
||||
Certain character sets must be encoded with quoted-printable or base64
|
||||
when used in email headers or bodies. Certain character sets must be
|
||||
converted outright, and are not allowed in email.
|
||||
|
||||
Optional \var{input_charset} is as described below. After being alias
|
||||
normalized it is also used as a lookup into the registry of character
|
||||
sets to find out the header encoding, body encoding, and output
|
||||
conversion codec to be used for the character set. For example, if
|
||||
\var{input_charset} is \code{iso-8859-1}, then headers and bodies will
|
||||
be encoded using quoted-printable and no output conversion codec is
|
||||
necessary. If \var{input_charset} is \code{euc-jp}, then headers will
|
||||
be encoded with base64, bodies will not be encoded, but output text
|
||||
will be converted from the \code{euc-jp} character set to the
|
||||
\code{iso-2022-jp} character set.
|
||||
\end{classdesc}
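The registry lookups described above can be checked interactively. The following is a minimal sketch that assumes a current Python 3 interpreter, where the module is spelled \module{email.charset} (lower case) rather than \module{email.Charset} as documented here; the attribute names are the ones listed in this section, and the variable names are illustrative only.

```python
# Assumption: Python 3, where the module is named email.charset.
from email.charset import BASE64, QP, Charset

latin1 = Charset("iso-8859-1")   # headers and bodies use quoted-printable
eucjp = Charset("euc-jp")        # base64 headers, unencoded body, converted output
alias = Charset("latin_1")       # aliases are normalized on construction
default = Charset()              # no argument: falls back to us-ascii

print(latin1.header_encoding == QP, latin1.body_encoding == QP)
print(eucjp.header_encoding == BASE64, eucjp.body_encoding)
print(eucjp.output_charset)
print(alias.input_charset, default.input_charset)
```

Comparing against the module constants \code{QP} and \code{BASE64}, rather than their numeric values, keeps the check independent of how the constants are defined.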
|
||||
|
||||
\class{Charset} instances have the following data attributes:
|
||||
|
||||
\begin{datadesc}{input_charset}
|
||||
The initial character set specified. Common aliases are converted to
|
||||
their \emph{official} email names (e.g. \code{latin_1} is converted to
|
||||
\code{iso-8859-1}). Defaults to 7-bit \code{us-ascii}.
|
||||
\end{datadesc}
|
||||
|
||||
\begin{datadesc}{header_encoding}
|
||||
If the character set must be encoded before it can be used in an
|
||||
email header, this attribute will be set to \code{Charset.QP} (for
|
||||
quoted-printable), \code{Charset.BASE64} (for base64 encoding), or
|
||||
\code{Charset.SHORTEST} for the shortest of QP or BASE64 encoding.
|
||||
Otherwise, it will be \code{None}.
|
||||
\end{datadesc}
|
||||
|
||||
\begin{datadesc}{body_encoding}
|
||||
Same as \var{header_encoding}, but describes the encoding for the
|
||||
mail message's body, which may differ from the header
|
||||
encoding. \code{Charset.SHORTEST} is not allowed for
|
||||
\var{body_encoding}.
|
||||
\end{datadesc}
|
||||
|
||||
\begin{datadesc}{output_charset}
|
||||
Some character sets must be converted before they can be used in
|
||||
email headers or bodies. If the \var{input_charset} is one of
|
||||
them, this attribute will contain the name of the character set
|
||||
output will be converted to. Otherwise, it will be \code{None}.
|
||||
\end{datadesc}
|
||||
|
||||
\begin{datadesc}{input_codec}
|
||||
The name of the Python codec used to convert the \var{input_charset} to
|
||||
Unicode. If no conversion codec is necessary, this attribute will be
|
||||
\code{None}.
|
||||
\end{datadesc}
|
||||
|
||||
\begin{datadesc}{output_codec}
|
||||
The name of the Python codec used to convert Unicode to the
|
||||
\var{output_charset}. If no conversion codec is necessary, this
|
||||
attribute will have the same value as the \var{input_codec}.
|
||||
\end{datadesc}
|
||||
|
||||
\class{Charset} instances also have the following methods:
|
||||
|
||||
\begin{methoddesc}[Charset]{get_body_encoding}{}
|
||||
Return the content transfer encoding used for body encoding.
|
||||
|
||||
This is either the string \samp{quoted-printable} or \samp{base64}
|
||||
depending on the encoding used, or it is a function, in which case you
|
||||
should call the function with a single argument, the Message object
|
||||
being encoded. The function should then set the
|
||||
\mailheader{Content-Transfer-Encoding} header itself to whatever is
|
||||
appropriate.
|
||||
|
||||
Returns the string \samp{quoted-printable} if
|
||||
\var{body_encoding} is \code{QP}, returns the string
|
||||
\samp{base64} if \var{body_encoding} is \code{BASE64}, and returns the
|
||||
string \samp{7bit} otherwise.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}{convert}{s}
|
||||
Convert the string \var{s} from the \var{input_codec} to the
|
||||
\var{output_codec}.
|
||||
\end{methoddesc}
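Conceptually, the conversion decodes with the input codec and encodes again with the output codec. The sketch below illustrates that idea with plain codec calls only; the \code{convert()} helper shown is hypothetical and is not the method's actual implementation. It assumes Python 3, where encoded text is a \code{bytes} object.

```python
def convert(raw: bytes, input_codec: str, output_codec: str) -> bytes:
    """Hypothetical helper: re-encode raw bytes by going through Unicode."""
    if input_codec == output_codec:
        return raw  # nothing to do when both codecs are the same
    return raw.decode(input_codec).encode(output_codec)


euc = "日本語".encode("euc_jp")
jis = convert(euc, "euc_jp", "iso2022_jp")
print(jis.decode("iso2022_jp"))
```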
|
||||
|
||||
\begin{methoddesc}{to_splittable}{s}
|
||||
Convert a possibly multibyte string to a safely splittable format.
|
||||
\var{s} is the string to split.
|
||||
|
||||
Uses the \var{input_codec} to try and convert the string to Unicode,
|
||||
so it can be safely split on character boundaries (even for multibyte
|
||||
characters).
|
||||
|
||||
Returns the string as-is if it isn't known how to convert \var{s} to
|
||||
Unicode with the \var{input_charset}.
|
||||
|
||||
Characters that could not be converted to Unicode will be replaced
|
||||
with the Unicode replacement character \character{U+FFFD}.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}{from_splittable}{ustr\optional{, to_output}}
|
||||
Convert a splittable string back into an encoded string. \var{ustr}
|
||||
is a Unicode string to ``unsplit''.
|
||||
|
||||
This method uses the proper codec to try and convert the string from
|
||||
Unicode back into an encoded format. Return the string as-is if it is
|
||||
not Unicode, or if it could not be converted from Unicode.
|
||||
|
||||
Characters that could not be converted from Unicode will be replaced
|
||||
with an appropriate character (usually \character{?}).
|
||||
|
||||
If \var{to_output} is \code{True} (the default), uses
|
||||
\var{output_codec} to convert to an
|
||||
encoded format. If \var{to_output} is \code{False}, it uses
|
||||
\var{input_codec}.
|
||||
\end{methoddesc}
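The reason for the round trip through Unicode is that a Unicode string can be cut at any index without splitting a multibyte character. The following sketch shows the idea with the standard codec error handlers; both helper functions are hypothetical stand-ins written for this illustration (they do not implement the ``return the string as-is'' fallback), and Python 3 is assumed.

```python
def to_splittable(raw: bytes, input_codec: str) -> str:
    # Undecodable bytes become U+FFFD, the Unicode replacement character.
    return raw.decode(input_codec, errors="replace")


def from_splittable(text: str, output_codec: str) -> bytes:
    # Unencodable characters become "?".
    return text.encode(output_codec, errors="replace")


text = to_splittable("日本語のテキスト".encode("euc_jp"), "euc_jp")
head, tail = text[:3], text[3:]          # split on character boundaries
chunks = [from_splittable(part, "euc_jp") for part in (head, tail)]
print(head, tail, chunks)
```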
|
||||
|
||||
\begin{methoddesc}{get_output_charset}{}
|
||||
Return the output character set.
|
||||
|
||||
This is the \var{output_charset} attribute if that is not \code{None},
|
||||
otherwise it is \var{input_charset}.
|
||||
\end{methoddesc}
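As a quick check of the two accessor methods, the sketch below queries a few registered character sets. It assumes Python 3 and the lower-case module name \module{email.charset}; only the quoted-printable and base64 cases of \method{get_body_encoding()} are shown, since the remaining case may differ between versions.

```python
# Assumption: Python 3, where the module is named email.charset.
from email.charset import Charset

for name in ("iso-8859-1", "utf-8", "euc-jp"):
    cs = Charset(name)
    print(name, "->", cs.get_output_charset())

qp_body = Charset("iso-8859-1").get_body_encoding()
b64_body = Charset("utf-8").get_body_encoding()
print(qp_body, b64_body)
```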
|
||||
|
||||
\begin{methoddesc}{encoded_header_len}{s}
|
||||
Return the length of the encoded header string, properly calculating
|
||||
for quoted-printable or base64 encoding.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}{header_encode}{s\optional{, convert}}
|
||||
Header-encode the string \var{s}.
|
||||
|
||||
If \var{convert} is \code{True}, the string will be converted from the
|
||||
input charset to the output charset automatically. This is not useful
|
||||
for multibyte character sets, which have line length issues (multibyte
|
||||
characters must be split on a character, not a byte boundary); use the
|
||||
higher-level \class{Header} class to deal with these issues (see
|
||||
\refmodule{email.Header}). \var{convert} defaults to \code{False}.
|
||||
|
||||
The type of encoding (base64 or quoted-printable) will be based on
|
||||
the \var{header_encoding} attribute.
|
||||
\end{methoddesc}
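A short illustration of header encoding follows. It is a sketch under the assumption of Python 3, where \method{header_encode()} takes only the string (the \var{convert} argument described above is not passed here). The result is decoded again with \function{decode_header()} from \module{email.header} to show that the encoding is reversible.

```python
# Assumption: Python 3; header_encode() is called with the string only.
from email.charset import Charset
from email.header import decode_header

latin1 = Charset("iso-8859-1")
encoded = latin1.header_encode("Café")   # quoted-printable encoded word
decoded = decode_header(encoded)         # list of (bytes, charset) pairs
print(encoded)
print(decoded)
```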
|
||||
|
||||
\begin{methoddesc}{body_encode}{s\optional{, convert}}
|
||||
Body-encode the string \var{s}.
|
||||
|
||||
If \var{convert} is \code{True} (the default), the string will be
|
||||
converted from the input charset to output charset automatically.
|
||||
Unlike \method{header_encode()}, there are no issues with byte
|
||||
boundaries and multibyte charsets in email bodies, so this is usually
|
||||
pretty safe.
|
||||
|
||||
The type of encoding (base64 or quoted-printable) will be based on
|
||||
the \var{body_encoding} attribute.
|
||||
\end{methoddesc}
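Body encoding can be tried in the same way. This sketch assumes Python 3, where \method{body_encode()} takes only the string and \code{utf-8} bodies are registered as base64; the standard \module{base64} module is used to confirm that the original text can be recovered.

```python
# Assumption: Python 3; body_encode() is called with the string only.
import base64
from email.charset import Charset

utf8 = Charset("utf-8")
encoded = utf8.body_encode("héllo wörld")
print(utf8.get_body_encoding())
print(base64.b64decode(encoded).decode("utf-8"))
```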
|
||||
|
||||
The \class{Charset} class also provides a number of methods to support
|
||||
standard operations and built-in functions.
|
||||
|
||||
\begin{methoddesc}[Charset]{__str__}{}
|
||||
Returns \var{input_charset} as a string coerced to lower case.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Charset]{__eq__}{other}
|
||||
This method allows you to compare two \class{Charset} instances for equality.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Charset]{__ne__}{other}
|
||||
This method allows you to compare two \class{Charset} instances for inequality.
|
||||
\end{methoddesc}
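Because aliases are normalized and names are lower-cased, two instances created from different spellings of the same character set compare equal. A minimal sketch, assuming Python 3 and the module name \module{email.charset}:

```python
# Assumption: Python 3, where the module is named email.charset.
from email.charset import Charset

a = Charset("latin_1")      # alias of iso-8859-1
b = Charset("ISO-8859-1")   # same character set, different case
c = Charset("utf-8")

print(str(a))
print(a == b, a != c)
```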
|
||||
|
||||
The \module{email.Charset} module also provides the following
|
||||
functions for adding new entries to the global character set, alias,
|
||||
and codec registries:
|
||||
|
||||
\begin{funcdesc}{add_charset}{charset\optional{, header_enc\optional{,
|
||||
body_enc\optional{, output_charset}}}}
|
||||
Add character properties to the global registry.
|
||||
|
||||
\var{charset} is the input character set, and must be the canonical
|
||||
name of a character set.
|
||||
|
||||
Optional \var{header_enc} and \var{body_enc} are each either
|
||||
\code{Charset.QP} for quoted-printable, \code{Charset.BASE64} for
|
||||
base64 encoding, \code{Charset.SHORTEST} for the shortest of
|
||||
quoted-printable or base64 encoding, or \code{None} for no encoding.
|
||||
\code{SHORTEST} is only valid for \var{header_enc}. The default is
|
||||
\code{None} for no encoding.
|
||||
|
||||
Optional \var{output_charset} is the character set that the output
|
||||
should be in. Conversions will proceed from input charset, to
|
||||
Unicode, to the output charset when the method
|
||||
\method{Charset.convert()} is called. The default is to output in the
|
||||
same character set as the input.
|
||||
|
||||
Both \var{charset} and \var{output_charset} must have Unicode
|
||||
codec entries in the module's character set-to-codec mapping; use
|
||||
\function{add_codec()} to add codecs the module does
|
||||
not know about. See the \refmodule{codecs} module's documentation for
|
||||
more information.
|
||||
|
||||
The global character set registry is kept in the module global
|
||||
dictionary \code{CHARSETS}.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{add_alias}{alias, canonical}
|
||||
Add a character set alias. \var{alias} is the alias name,
|
||||
e.g. \code{latin-1}. \var{canonical} is the character set's canonical
|
||||
name, e.g. \code{iso-8859-1}.
|
||||
|
||||
The global charset alias registry is kept in the module global
|
||||
dictionary \code{ALIASES}.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{add_codec}{charset, codecname}
|
||||
Add a codec that maps characters in the given character set to and from
|
||||
Unicode.
|
||||
|
||||
\var{charset} is the canonical name of a character set.
|
||||
\var{codecname} is the name of a Python codec, as appropriate for the
|
||||
second argument to the \function{unicode()} built-in, or to the
|
||||
\method{encode()} method of a Unicode string.
|
||||
\end{funcdesc}
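The three registry functions are typically used together. The sketch below registers a made-up character set; the names \code{x-demo-latin}, \code{x-demo-bad} and \code{demo} are hypothetical and exist only for this illustration. It assumes Python 3, where the functions live in \module{email.charset}, and it modifies the module-global registries for the lifetime of the process.

```python
# Assumption: Python 3; "x-demo-latin" and "demo" are invented names.
from email.charset import (QP, SHORTEST, Charset,
                           add_alias, add_charset, add_codec)

add_codec("x-demo-latin", "latin-1")               # charset -> Python codec
add_charset("x-demo-latin", header_enc=QP, body_enc=QP)
add_alias("demo", "x-demo-latin")                  # alias -> canonical name

cs = Charset("demo")
print(cs.input_charset, cs.input_codec, cs.header_encoding == QP)

# SHORTEST is only valid for header_enc, so this registration is rejected.
try:
    add_charset("x-demo-bad", body_enc=SHORTEST)
except ValueError as exc:
    rejected = str(exc)
else:
    rejected = None
print(rejected)
```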
|
|
@ -17,7 +17,7 @@ set the \mailheader{Content-Transfer-Encoding} header as appropriate.
|
|||
Here are the encoding functions provided:
|
||||
|
||||
\begin{funcdesc}{encode_quopri}{msg}
|
||||
Encodes the payload into quoted-Printable form and sets the
|
||||
Encodes the payload into quoted-printable form and sets the
|
||||
\mailheader{Content-Transfer-Encoding} header to
|
||||
\code{quoted-printable}\footnote{Note that encoding with
|
||||
\method{encode_quopri()} also encodes all tabs and space characters in
|
||||
|
|
|
@ -24,12 +24,12 @@ Here are the public methods of the \class{Generator} class:
|
|||
The constructor for the \class{Generator} class takes a file-like
|
||||
object called \var{outfp} for an argument. \var{outfp} must support
|
||||
the \method{write()} method and be usable as the output file in a
|
||||
Python 2.0 extended print statement.
|
||||
Python extended print statement.
|
||||
|
||||
Optional \var{mangle_from_} is a flag that, when \code{True}, puts a
|
||||
\samp{>} character in front of any line in the body that starts exactly as
|
||||
\samp{From } (i.e. \code{From} followed by a space at the front of the
|
||||
line). This is the only guaranteed portable way to avoid having such
|
||||
\samp{From }, i.e. \code{From} followed by a space at the beginning of the
|
||||
line. This is the only guaranteed portable way to avoid having such
|
||||
lines be mistaken for a Unix mailbox format envelope header separator (see
|
||||
\ulink{WHY THE CONTENT-LENGTH FORMAT IS BAD}
|
||||
{http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html}
|
||||
|
@ -48,10 +48,10 @@ recommended (but not required) by \rfc{2822}.
|
|||
|
||||
The other public \class{Generator} methods are:
|
||||
|
||||
\begin{methoddesc}[Generator]{flatten()}{msg\optional{, unixfrom}}
|
||||
\begin{methoddesc}[Generator]{flatten}{msg\optional{, unixfrom}}
|
||||
Print the textual representation of the message object structure rooted at
|
||||
\var{msg} to the output file specified when the \class{Generator}
|
||||
instance was created. Sub-objects are visited depth-first and the
|
||||
instance was created. Subparts are visited depth-first and the
|
||||
resulting text will be properly MIME encoded.
|
||||
|
||||
Optional \var{unixfrom} is a flag that forces the printing of the
|
||||
|
@ -60,7 +60,7 @@ root message object. If the root object has no envelope header, a
|
|||
standard one is crafted. By default, this is set to \code{False} to
|
||||
inhibit the printing of the envelope delimiter.
|
||||
|
||||
Note that for sub-objects, no envelope header is ever printed.
|
||||
Note that for subparts, no envelope header is ever printed.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
\end{methoddesc}
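The effect of \var{mangle_from_} and \var{unixfrom} can be seen by flattening a small message into an in-memory file. This is a sketch assuming Python 3, where the class is imported from \module{email.generator} and \class{Message} from \module{email.message}; the message content is invented for the example.

```python
# Assumption: Python 3 module names (email.generator, email.message).
from email.generator import Generator
from email.message import Message
from io import StringIO

msg = Message()
msg["Subject"] = "demo"
msg.set_payload("From the start of a body line\nsecond line\n")

plain = StringIO()
Generator(plain, mangle_from_=True).flatten(msg)

with_envelope = StringIO()
Generator(with_envelope, mangle_from_=True).flatten(msg, unixfrom=True)

print(plain.getvalue())
print(with_envelope.getvalue().splitlines()[0])
```

The first output shows the body line escaped as \samp{>From }; the second begins with a crafted envelope header because the message has none of its own.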
|
||||
|
@ -99,16 +99,20 @@ Optional \var{_mangle_from_} and \var{maxheaderlen} are as with the
|
|||
\class{Generator} base class.
|
||||
|
||||
If the subpart is not of main type \mimetype{text}, optional \var{fmt}
|
||||
is a format string that is used instead of the message
|
||||
payload. \var{fmt} is expanded with the following keywords (in
|
||||
\samp{\%(keyword)s} format):
|
||||
is a format string that is used instead of the message payload.
|
||||
\var{fmt} is expanded with the following keywords, \samp{\%(keyword)s}
|
||||
format:
|
||||
|
||||
type : Full MIME type of the non-\mimetype{text} part
|
||||
maintype : Main MIME type of the non-\mimetype{text} part
|
||||
subtype : Sub-MIME type of the non-\mimetype{text} part
|
||||
filename : Filename of the non-\mimetype{text} part
|
||||
description: Description associated with the non-\mimetype{text} part
|
||||
encoding : Content transfer encoding of the non-\mimetype{text} part
|
||||
\begin{itemize}
|
||||
\item \code{type} -- Full MIME type of the non-\mimetype{text} part
|
||||
\item \code{maintype} -- Main MIME type of the non-\mimetype{text} part
|
||||
\item \code{subtype} -- Sub-MIME type of the non-\mimetype{text} part
|
||||
\item \code{filename} -- Filename of the non-\mimetype{text} part
|
||||
\item \code{description} -- Description associated with the
|
||||
non-\mimetype{text} part
|
||||
\item \code{encoding} -- Content transfer encoding of the
|
||||
non-\mimetype{text} part
|
||||
\end{itemize}
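The keyword expansion can be illustrated with a two-part message. The sketch assumes Python 3 (where the MIME classes live under \module{email.mime}); the format string, file name and payload are made up for the example.

```python
# Assumption: Python 3 module names; the attachment is invented test data.
from email.generator import DecodedGenerator
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from io import StringIO

msg = MIMEMultipart()
msg.attach(MIMEText("hello"))
blob = MIMEApplication(b"\x00\x01\x02")
blob.add_header("Content-Disposition", "attachment", filename="data.bin")
msg.attach(blob)

out = StringIO()
fmt = "[%(type)s part, filename %(filename)s]"
DecodedGenerator(out, fmt=fmt).flatten(msg)
text = out.getvalue()
print(text)
```

The text part is printed as-is, while the binary part is replaced by the expanded format string.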
|
||||
|
||||
The default value for \var{fmt} is \code{None}, meaning
|
||||
|
||||
|
|
|
@ -3,7 +3,7 @@
|
|||
|
||||
\rfc{2822} is the base standard that describes the format of email
|
||||
messages. It derives from the older \rfc{822} standard which came
|
||||
into widespread at a time when most email was composed of \ASCII{}
|
||||
into widespread use at a time when most email was composed of \ASCII{}
|
||||
characters only. \rfc{2822} is a specification written assuming email
|
||||
contains only 7-bit \ASCII{} characters.
|
||||
|
||||
|
@ -19,10 +19,9 @@ The \module{email} package supports these standards in its
|
|||
|
||||
If you want to include non-\ASCII{} characters in your email headers,
|
||||
say in the \mailheader{Subject} or \mailheader{To} fields, you should
|
||||
use the \class{Header} class (in module \module{email.Header} and
|
||||
assign the field in the \class{Message} object to an instance of
|
||||
\class{Header} instead of using a string for the header value. For
|
||||
example:
|
||||
use the \class{Header} class and assign the field in the
|
||||
\class{Message} object to an instance of \class{Header} instead of
|
||||
using a string for the header value. For example:
|
||||
|
||||
\begin{verbatim}
|
||||
>>> from email.Message import Message
|
||||
|
@ -50,7 +49,8 @@ Here is the \class{Header} class description:
|
|||
|
||||
\begin{classdesc}{Header}{\optional{s\optional{, charset\optional{,
|
||||
maxlinelen\optional{, header_name\optional{, continuation_ws}}}}}}
|
||||
Create a MIME-compliant header that can contain many character sets.
|
||||
Create a MIME-compliant header that can contain strings in different
|
||||
character sets.
|
||||
|
||||
Optional \var{s} is the initial header value. If \code{None} (the
|
||||
default), the initial header value is not set. You can later append
|
||||
|
@ -74,7 +74,7 @@ e.g. \mailheader{Subject}) pass in the name of the field in
|
|||
default value for \var{header_name} is \code{None}, meaning it is not
|
||||
taken into account for the first line of a long, split header.
|
||||
|
||||
Optional \var{continuation_ws} must be RFC 2822 compliant folding
|
||||
Optional \var{continuation_ws} must be \rfc{2822}-compliant folding
|
||||
whitespace, and is usually either a space or a hard tab character.
|
||||
This character will be prepended to continuation lines.
|
||||
\end{classdesc}
|
||||
|
@ -89,7 +89,7 @@ will be converted to a \class{Charset} instance. A value of
|
|||
constructor is used.
|
||||
|
||||
\var{s} may be a byte string or a Unicode string. If it is a byte
|
||||
string (i.e. \code{isinstance(s, StringType)} is true), then
|
||||
string (i.e. \code{isinstance(s, str)} is true), then
|
||||
\var{charset} is the encoding of that byte string, and a
|
||||
\exception{UnicodeError} will be raised if the string cannot be
|
||||
decoded with that character set.
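The byte string case can be sketched as follows, assuming Python 3, where a byte string is a \code{bytes} object and the class is imported from \module{email.header}. The second construction fails because the byte \code{0xE9} is not valid in \code{us-ascii}.

```python
# Assumption: Python 3, where byte strings are bytes objects.
from email.header import Header, decode_header

h = Header(b"Caf\xe9", charset="iso-8859-1")   # bytes decoded as iso-8859-1
roundtrip = decode_header(h.encode())
print(h.encode(), roundtrip)

try:
    Header(b"Caf\xe9", charset="us-ascii")     # 0xE9 is not ASCII
except UnicodeError:
    failed = True
else:
    failed = False
print(failed)
```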
|
||||
|
@ -113,7 +113,7 @@ standard operators and built-in functions.
|
|||
|
||||
\begin{methoddesc}[Header]{__str__}{}
|
||||
A synonym for \method{Header.encode()}. Useful for
|
||||
\code{str(aHeader)} calls.
|
||||
\code{str(aHeader)}.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Header]{__unicode__}{}
|
||||
|
@ -165,245 +165,3 @@ This function takes one of those sequence of pairs and returns a
|
|||
\var{header_name}, and \var{continuation_ws} are as in the
|
||||
\class{Header} constructor.
|
||||
\end{funcdesc}
|
||||
|
||||
\declaremodule{standard}{email.Charset}
|
||||
\modulesynopsis{Character Sets}
|
||||
|
||||
This module provides a class \class{Charset} for representing
|
||||
character sets and character set conversions in email messages, as
|
||||
well as a character set registry and several convenience methods for
|
||||
manipulating this registry. Instances of \class{Charset} are used in
|
||||
several other modules within the \module{email} package.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
|
||||
\begin{classdesc}{Charset}{\optional{input_charset}}
|
||||
Map character sets to their email properties.
|
||||
|
||||
This class provides information about the requirements imposed on
|
||||
email for a specific character set. It also provides convenience
|
||||
routines for converting between character sets, given the availability
|
||||
of the applicable codecs. Given a character set, it will do its best
|
||||
to provide information on how to use that character set in an email
|
||||
message in an RFC-compliant way.
|
||||
|
||||
Certain character sets must be encoded with quoted-printable or base64
|
||||
when used in email headers or bodies. Certain character sets must be
|
||||
converted outright, and are not allowed in email.
|
||||
|
||||
Optional \var{input_charset} is as described below. After being alias
|
||||
normalized it is also used as a lookup into the registry of character
|
||||
sets to find out the header encoding, body encoding, and output
|
||||
conversion codec to be used for the character set. For example, if
|
||||
\var{input_charset} is \code{iso-8859-1}, then headers and bodies will
|
||||
be encoded using quoted-printable and no output conversion codec is
|
||||
necessary. If \var{input_charset} is \code{euc-jp}, then headers will
|
||||
be encoded with base64, bodies will not be encoded, but output text
|
||||
will be converted from the \code{euc-jp} character set to the
|
||||
\code{iso-2022-jp} character set.
|
||||
\end{classdesc}
|
||||
|
||||
\class{Charset} instances have the following data attributes:
|
||||
|
||||
\begin{datadesc}{input_charset}
|
||||
The initial character set specified. Common aliases are converted to
|
||||
their \emph{official} email names (e.g. \code{latin_1} is converted to
|
||||
\code{iso-8859-1}). Defaults to 7-bit \code{us-ascii}.
|
||||
\end{datadesc}
|
||||
|
||||
\begin{datadesc}{header_encoding}
|
||||
If the character set must be encoded before it can be used in an
|
||||
email header, this attribute will be set to \code{Charset.QP} (for
|
||||
quoted-printable), \code{Charset.BASE64} (for base64 encoding), or
|
||||
\code{Charset.SHORTEST} for the shortest of QP or BASE64 encoding.
|
||||
Otherwise, it will be \code{None}.
|
||||
\end{datadesc}
|
||||
|
||||
\begin{datadesc}{body_encoding}
|
||||
Same as \var{header_encoding}, but describes the encoding for the
|
||||
mail message's body, which indeed may be different than the header
|
||||
encoding. \code{Charset.SHORTEST} is not allowed for
|
||||
\var{body_encoding}.
|
||||
\end{datadesc}
|
||||
|
||||
\begin{datadesc}{output_charset}
|
||||
Some character sets must be converted before the can be used in
|
||||
email headers or bodies. If the \var{input_charset} is one of
|
||||
them, this attribute will contain the name of the character set
|
||||
output will be converted to. Otherwise, it will be \code{None}.
|
||||
\end{datadesc}
|
||||
|
||||
\begin{datadesc}{input_codec}
|
||||
The name of the Python codec used to convert the \var{input_charset} to
|
||||
Unicode. If no conversion codec is necessary, this attribute will be
|
||||
\code{None}.
|
||||
\end{datadesc}
|
||||
|
||||
\begin{datadesc}{output_codec}
|
||||
The name of the Python codec used to convert Unicode to the
|
||||
\var{output_charset}. If no conversion codec is necessary, this
|
||||
attribute will have the same value as the \var{input_codec}.
|
||||
\end{datadesc}
|
||||
|
||||
\class{Charset} instances also have the following methods:
|
||||
|
||||
\begin{methoddesc}[Charset]{get_body_encoding}{}
|
||||
Return the content transfer encoding used for body encoding.
|
||||
|
||||
This is either the string \samp{quoted-printable} or \samp{base64}
|
||||
depending on the encoding used, or it is a function, in which case you
|
||||
should call the function with a single argument, the Message object
|
||||
being encoded. The function should then set the
|
||||
\mailheader{Content-Transfer-Encoding} header itself to whatever is
|
||||
appropriate.
|
||||
|
||||
Returns the string \samp{quoted-printable} if
|
||||
\var{body_encoding} is \code{QP}, returns the string
|
||||
\samp{base64} if \var{body_encoding} is \code{BASE64}, and returns the
|
||||
string \samp{7bit} otherwise.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}{convert}{s}
|
||||
Convert the string \var{s} from the \var{input_codec} to the
|
||||
\var{output_codec}.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}{to_splittable}{s}
|
||||
Convert a possibly multibyte string to a safely splittable format.
|
||||
\var{s} is the string to split.
|
||||
|
||||
Uses the \var{input_codec} to try and convert the string to Unicode,
|
||||
so it can be safely split on character boundaries (even for multibyte
|
||||
characters).
|
||||
|
||||
Returns the string as-is if it isn't known how to convert \var{s} to
|
||||
Unicode with the \var{input_charset}.
|
||||
|
||||
Characters that could not be converted to Unicode will be replaced
|
||||
with the Unicode replacement character \character{U+FFFD}.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}{from_splittable}{ustr\optional{, to_output}}
|
||||
Convert a splittable string back into an encoded string. \var{ustr}
|
||||
is a Unicode string to ``unsplit''.
|
||||
|
||||
This method uses the proper codec to try and convert the string from
|
||||
Unicode back into an encoded format. Return the string as-is if it is
|
||||
not Unicode, or if it could not be converted from Unicode.
|
||||
|
||||
Characters that could not be converted from Unicode will be replaced
|
||||
with an appropriate character (usually \character{?}).
|
||||
|
||||
If \var{to_output} is \code{True} (the default), uses
|
||||
\var{output_codec} to convert to an
|
||||
encoded format. If \var{to_output} is \code{False}, it uses
|
||||
\var{input_codec}.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}{get_output_charset}{}
|
||||
Return the output character set.
|
||||
|
||||
This is the \var{output_charset} attribute if that is not \code{None},
|
||||
otherwise it is \var{input_charset}.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}{encoded_header_len}{}
|
||||
Return the length of the encoded header string, properly calculating
|
||||
for quoted-printable or base64 encoding.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}{header_encode}{s\optional{, convert}}
|
||||
Header-encode the string \var{s}.
|
||||
|
||||
If \var{convert} is \code{True}, the string will be converted from the
|
||||
input charset to the output charset automatically. This is not useful
|
||||
for multibyte character sets, which have line length issues (multibyte
|
||||
characters must be split on a character, not a byte boundary); use the
|
||||
higher-level \class{Header} class to deal with these issues (see
|
||||
\refmodule{email.Header}). \var{convert} defaults to \code{False}.
|
||||
|
||||
The type of encoding (base64 or quoted-printable) will be based on
|
||||
the \var{header_encoding} attribute.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}{body_encode}{s\optional{, convert}}
|
||||
Body-encode the string \var{s}.
|
||||
|
||||
If \var{convert} is \code{True} (the default), the string will be
|
||||
converted from the input charset to output charset automatically.
|
||||
Unlike \method{header_encode()}, there are no issues with byte
|
||||
boundaries and multibyte charsets in email bodies, so this is usually
|
||||
pretty safe.
|
||||
|
||||
The type of encoding (base64 or quoted-printable) will be based on
|
||||
the \var{body_encoding} attribute.
|
||||
\end{methoddesc}
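A small sketch of \method{header_encode()} and \method{body_encode()} follows. It assumes a recent Python 3 release, where the module is spelled \module{email.charset} and both methods take only the string to encode (the \var{convert} argument described above is not used here). The round trips through \function{decode_header()} and \module{base64} are only there to show that the encoded forms carry the original text.

```python
import base64
from email.charset import Charset
from email.header import decode_header

cs = Charset('utf-8')

# Header encoding wraps the text in an RFC 2047 encoded-word,
# using the encoding chosen by the header_encoding attribute.
encoded_header = cs.header_encode('hello')
decoded_bytes, decoded_charset = decode_header(encoded_header)[0]

# The registry entry for utf-8 selects base64 for message bodies.
encoded_body = cs.body_encode('hello')
body_roundtrip = base64.b64decode(encoded_body)
```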
|
||||
|
||||
The \class{Charset} class also provides a number of methods to support
|
||||
standard operations and built-in functions.
|
||||
|
||||
\begin{methoddesc}[Charset]{__str__}{}
|
||||
Returns \var{input_charset} as a string coerced to lower case.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Charset]{__eq__}{other}
|
||||
This method allows you to compare two \class{Charset} instances for equality.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Charset]{__ne__}{other}
|
||||
This method allows you to compare two \class{Charset} instances for inequality.
|
||||
\end{methoddesc}
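The string conversion and comparison methods can be exercised as in the sketch below, which again assumes the Python 3 module name \module{email.charset}. The variable names are chosen only for illustration.

```python
from email.charset import Charset

upper = Charset('ISO-8859-1')
lower = Charset('iso-8859-1')
other = Charset('utf-8')

# str() gives the input charset coerced to lower case.
name = str(upper)

# Equality and inequality compare the character set names.
same = (upper == lower)
different = (upper != other)
```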
|
||||
|
||||
The \module{email.Charset} module also provides the following
|
||||
functions for adding new entries to the global character set, alias,
|
||||
and codec registries:
|
||||
|
||||
\begin{funcdesc}{add_charset}{charset\optional{, header_enc\optional{,
|
||||
body_enc\optional{, output_charset}}}}
|
||||
Add character set properties to the global registry.
|
||||
|
||||
\var{charset} is the input character set, and must be the canonical
|
||||
name of a character set.
|
||||
|
||||
Optional \var{header_enc} and \var{body_enc} are each either
|
||||
\code{Charset.QP} for quoted-printable, \code{Charset.BASE64} for
|
||||
base64 encoding, \code{Charset.SHORTEST} for the shortest of qp or
|
||||
base64 encoding, or \code{None} for no encoding. \code{SHORTEST} is
|
||||
only valid for \var{header_enc}. These arguments describe how message headers and
|
||||
message bodies in the input charset are to be encoded. Default is no
|
||||
encoding.
|
||||
|
||||
Optional \var{output_charset} is the character set that the output
|
||||
should be in. Conversions will proceed from input charset, to
|
||||
Unicode, to the output charset when the method
|
||||
\method{Charset.convert()} is called. The default is to output in the
|
||||
same character set as the input.
|
||||
|
||||
Both \var{input_charset} and \var{output_charset} must have Unicode
|
||||
codec entries in the module's character set-to-codec mapping; use
|
||||
\function{add_codec(charset, codecname)} to add codecs the module does
|
||||
not know about. See the \refmodule{codecs} module's documentation for
|
||||
more information.
|
||||
|
||||
The global character set registry is kept in the module global
|
||||
dictionary \code{CHARSETS}.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{add_alias}{alias, canonical}
|
||||
Add a character set alias. \var{alias} is the alias name,
|
||||
e.g. \code{latin-1}. \var{canonical} is the character set's canonical
|
||||
name, e.g. \code{iso-8859-1}.
|
||||
|
||||
The global charset alias registry is kept in the module global
|
||||
dictionary \code{ALIASES}.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{add_codec}{charset, codecname}
|
||||
Add a codec that maps characters in the given character set to and from
|
||||
Unicode.
|
||||
|
||||
\var{charset} is the canonical name of a character set.
|
||||
\var{codecname} is the name of a Python codec, as appropriate for the
|
||||
second argument to the \function{unicode()} built-in, or to the
|
||||
\method{encode()} method of a Unicode string.
|
||||
\end{funcdesc}
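The three registry functions can be combined as in the following sketch. It assumes the Python 3 module name \module{email.charset}; the names \code{x-demo-charset} and \code{x-demo} are hypothetical and exist only for this illustration. Note that the calls modify the module-global registries.

```python
from email import charset as charset_module

# Register a made-up charset: quoted-printable headers, base64
# bodies, and utf-8 as the output charset.
charset_module.add_charset('x-demo-charset',
                           charset_module.QP,
                           charset_module.BASE64,
                           'utf-8')

# Tell the module which Python codec converts it to and from Unicode.
charset_module.add_codec('x-demo-charset', 'latin-1')

# Make a shorter alias resolve to the canonical name.
charset_module.add_alias('x-demo', 'x-demo-charset')

cs = charset_module.Charset('x-demo')
```

After these calls, \code{cs} reports the canonical name as its input charset and picks up the encodings and codec registered above.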
|
||||
|
|
|
@ -33,9 +33,9 @@ The constructor takes no arguments.
|
|||
\end{classdesc}
|
||||
|
||||
\begin{methoddesc}[Message]{as_string}{\optional{unixfrom}}
|
||||
Return the entire formatted message as a string. Optional
|
||||
\var{unixfrom}, when true, specifies to include the \emph{Unix-From}
|
||||
envelope header; it defaults to \code{False}.
|
||||
Return the entire message flattened as a string. When optional
|
||||
\var{unixfrom} is \code{True}, the envelope header is included in the
|
||||
returned string. \var{unixfrom} defaults to \code{False}.
|
||||
\end{methoddesc}
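The effect of \var{unixfrom} can be seen in a short sketch. It assumes the Python 3 module name \module{email.message}; the addresses and the envelope line are made up for the example.

```python
from email.message import Message

msg = Message()
msg['Subject'] = 'demo'
msg.set_payload('body text\n')
msg.set_unixfrom('From sender@example.com Fri Nov  9 01:08:47 2001')

# Without unixfrom the string starts with the first header.
plain = msg.as_string()

# With unixfrom=True the envelope header comes first.
with_envelope = msg.as_string(unixfrom=True)
```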
|
||||
|
||||
\begin{methoddesc}[Message]{__str__}{}
|
||||
|
@ -59,7 +59,7 @@ envelope header was never set.
|
|||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{attach}{payload}
|
||||
Add the given payload to the current payload, which must be
|
||||
Add the given \var{payload} to the current payload, which must be
|
||||
\code{None} or a list of \class{Message} objects before the call.
|
||||
After the call, the payload will always be a list of \class{Message}
|
||||
objects. If you want to set the payload to a scalar object (e.g. a
|
||||
|
@ -95,7 +95,7 @@ returned. The default for \var{decode} is \code{False}.
|
|||
\begin{methoddesc}[Message]{set_payload}{payload\optional{, charset}}
|
||||
Set the entire message object's payload to \var{payload}. It is the
|
||||
client's responsibility to ensure the payload invariants. Optional
|
||||
\var{charset} sets the message's default character set (see
|
||||
\var{charset} sets the message's default character set; see
|
||||
\method{set_charset()} for details.
|
||||
|
||||
\versionchanged[\var{charset} argument added]{2.2.2}
|
||||
|
@ -103,7 +103,7 @@ client's responsibility to ensure the payload invariants. Optional
|
|||
|
||||
\begin{methoddesc}[Message]{set_charset}{charset}
|
||||
Set the character set of the payload to \var{charset}, which can
|
||||
either be a \class{Charset} instance (see \refmodule{email.Charset}, a
|
||||
either be a \class{Charset} instance (see \refmodule{email.Charset}), a
|
||||
string naming a character set,
|
||||
or \code{None}. If it is a string, it will be converted to a
|
||||
\class{Charset} instance. If \var{charset} is \code{None}, the
|
||||
|
@ -128,14 +128,18 @@ Return the \class{Charset} instance associated with the message's payload.
|
|||
\end{methoddesc}
|
||||
|
||||
The following methods implement a mapping-like interface for accessing
|
||||
the message object's \rfc{2822} headers. Note that there are some
|
||||
the message's \rfc{2822} headers. Note that there are some
|
||||
semantic differences between these methods and a normal mapping
|
||||
(i.e. dictionary) interface. For example, in a dictionary there are
|
||||
no duplicate keys, but here there may be duplicate message headers. Also,
|
||||
in dictionaries there is no guaranteed order to the keys returned by
|
||||
\method{keys()}, but in a \class{Message} object, there is an explicit
|
||||
order. These semantic differences are intentional and are biased
|
||||
toward maximal convenience.
|
||||
\method{keys()}, but in a \class{Message} object, headers are always
|
||||
returned in the order they appeared in the original message, or were
|
||||
added to the message later. Any header deleted and then re-added is
|
||||
always appended to the end of the header list.
|
||||
|
||||
These semantic differences are intentional and are biased toward
|
||||
maximal convenience.
|
||||
|
||||
Note that in all cases, any envelope header present in the message is
|
||||
not included in the mapping interface.
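The ordering and duplicate-header behaviour described above can be sketched like this, assuming the Python 3 module name \module{email.message}. The header values are invented for the example.

```python
from email.message import Message

msg = Message()
msg['Received'] = 'first hop'
msg['Subject'] = 'draft'
msg['Received'] = 'second hop'   # appended, does not replace

# Keys come back in insertion order and may repeat.
keys_before = msg.keys()

# Deleting and re-adding a header moves it to the end.
del msg['Subject']
msg['Subject'] = 'final'
keys_after = msg.keys()
```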
|
||||
|
@ -175,8 +179,7 @@ fields.
|
|||
Note that this does \emph{not} overwrite or delete any existing header
|
||||
with the same name. If you want to ensure that the new header is the
|
||||
only one present in the message with field name
|
||||
\var{name}, first use \method{__delitem__()} to delete all named
|
||||
fields, e.g.:
|
||||
\var{name}, delete the field first, e.g.:
|
||||
|
||||
\begin{verbatim}
|
||||
del msg['subject']
|
||||
|
@ -196,27 +199,16 @@ otherwise return false.
|
|||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{keys}{}
|
||||
Return a list of all the message's header field names. These keys
|
||||
will be sorted in the order in which they appeared in the original
|
||||
message, or were added to the message and may contain
|
||||
duplicates. Any fields deleted and then subsequently re-added are
|
||||
always appended to the end of the header list.
|
||||
Return a list of all the message's header field names.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{values}{}
|
||||
Return a list of all the message's field values. These will be sorted
|
||||
in the order in which they appeared in the original message, or were
|
||||
added to the message, and may contain
|
||||
duplicates. Any fields deleted and then subsequently re-added are
|
||||
always appended to the end of the header list.
|
||||
Return a list of all the message's field values.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{items}{}
|
||||
Return a list of 2-tuples containing all the message's field headers
|
||||
and values. These will be sorted in the order in which they appeared
|
||||
in the original message, or were added to the message, and may contain
|
||||
duplicates. Any fields deleted and then subsequently re-added are
|
||||
always appended to the end of the header list.
|
||||
and values.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{get}{name\optional{, failobj}}
|
||||
|
@ -228,11 +220,7 @@ if the named header is missing (defaults to \code{None}).
|
|||
Here are some additional useful methods:
|
||||
|
||||
\begin{methoddesc}[Message]{get_all}{name\optional{, failobj}}
|
||||
Return a list of all the values for the field named \var{name}. These
|
||||
will be sorted in the order in which they appeared in the original
|
||||
message, or were added to the message. Any fields deleted and then
|
||||
subsequently re-added are always appended to the end of the list.
|
||||
|
||||
Return a list of all the values for the field named \var{name}.
|
||||
If there are no such named headers in the message, \var{failobj} is
|
||||
returned (defaults to \code{None}).
|
||||
\end{methoddesc}
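A minimal sketch of \method{get_all()}, assuming the Python 3 module name \module{email.message}; the header names and values are made up.

```python
from email.message import Message

msg = Message()
msg['Received'] = 'first hop'
msg['Received'] = 'second hop'

# All values for the field, in the order they were added.
# Header names are matched case-insensitively.
hops = msg.get_all('received')

# With no such header, failobj is returned (None by default).
missing = msg.get_all('X-Missing')
fallback = msg.get_all('X-Missing', [])
```

Passing an empty list as \var{failobj} lets the caller loop over the result without a separate check for \code{None}.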
|
||||
|
@ -351,10 +339,10 @@ instead of \mailheader{Content-Type}.
|
|||
Parameter keys are always compared case insensitively. The return
|
||||
value can either be a string, or a 3-tuple if the parameter was
|
||||
\rfc{2231} encoded. When it's a 3-tuple, the elements of the value are of
|
||||
the form \samp{(CHARSET, LANGUAGE, VALUE)}, where \var{LANGUAGE} may
|
||||
the form \code{(CHARSET, LANGUAGE, VALUE)}, where \code{LANGUAGE} may
|
||||
be the empty string. Your application should be prepared to deal with
|
||||
3-tuple return values, which it can convert the parameter to a Unicode
|
||||
string like so:
|
||||
3-tuple return values, which it can convert to a Unicode string like
|
||||
so:
|
||||
|
||||
\begin{verbatim}
|
||||
param = msg.get_param('foo')
|
||||
|
@ -363,7 +351,7 @@ if isinstance(param, tuple):
|
|||
\end{verbatim}
|
||||
|
||||
In any case, the parameter value (either the returned string, or the
|
||||
\var{VALUE} item in the 3-tuple) is always unquoted, unless
|
||||
\code{VALUE} item in the 3-tuple) is always unquoted, unless
|
||||
\var{unquote} is set to \code{False}.
|
||||
|
||||
\versionchanged[\var{unquote} argument added, and 3-tuple return value
|
||||
|
@ -398,7 +386,7 @@ Remove the given parameter completely from the
|
|||
\mailheader{Content-Type} header. The header will be re-written in
|
||||
place without the parameter or its value. All values will be quoted
|
||||
as necessary unless \var{requote} is \code{False} (the default is
|
||||
\code{True}). Optional \var{header} specifies an alterative to
|
||||
\code{True}). Optional \var{header} specifies an alternative to
|
||||
\mailheader{Content-Type}.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
|
@ -417,8 +405,8 @@ leaves the existing header's quoting as is, otherwise the parameters
|
|||
will be quoted (the default).
|
||||
|
||||
An alternative header can be specified in the \var{header} argument.
|
||||
When the \mailheader{Content-Type} header is set, we'll always also
|
||||
add a \mailheader{MIME-Version} header.
|
||||
When the \mailheader{Content-Type} header is set, a
|
||||
\mailheader{MIME-Version} header is also added.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
\end{methoddesc}
|
||||
|
@ -440,11 +428,10 @@ returned string will always be unquoted as per
|
|||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{set_boundary}{boundary}
|
||||
Set the \code{boundary} parameter of the \mailheader{Content-Type} header
|
||||
to \var{boundary}. \method{set_boundary()} will always quote
|
||||
\var{boundary} so you should not quote it yourself. A
|
||||
\exception{HeaderParseError} is raised if the message object has no
|
||||
\mailheader{Content-Type} header.
|
||||
Set the \code{boundary} parameter of the \mailheader{Content-Type}
|
||||
header to \var{boundary}. \method{set_boundary()} will always quote
|
||||
\var{boundary} if necessary. A \exception{HeaderParseError} is raised
|
||||
if the message object has no \mailheader{Content-Type} header.
|
||||
|
||||
Note that using this method is subtly different than deleting the old
|
||||
\mailheader{Content-Type} header and adding a new one with the new boundary
|
||||
|
@ -459,9 +446,9 @@ Return the \code{charset} parameter of the \mailheader{Content-Type}
|
|||
header. If there is no \mailheader{Content-Type} header, or if that
|
||||
header has no \code{charset} parameter, \var{failobj} is returned.
|
||||
|
||||
Note that this method differs from \method{get_charset} which returns
|
||||
the \class{Charset} instance for the default encoding of the message
|
||||
body.
|
||||
Note that this method differs from \method{get_charset()}, which
|
||||
returns the \class{Charset} instance for the default encoding of the
|
||||
message body.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
\end{methoddesc}
|
||||
|
@ -484,15 +471,15 @@ will be \var{failobj}.
|
|||
The \method{walk()} method is an all-purpose generator which can be
|
||||
used to iterate over all the parts and subparts of a message object
|
||||
tree, in depth-first traversal order. You will typically use
|
||||
\method{walk()} as the iterator in a \code{for ... in} loop; each
|
||||
\method{walk()} as the iterator in a \code{for} loop; each
|
||||
iteration returns the next subpart.
|
||||
|
||||
Here's an example that prints the MIME type of every part of a message
|
||||
object tree:
|
||||
Here's an example that prints the MIME type of every part of a
|
||||
multipart message structure:
|
||||
|
||||
\begin{verbatim}
|
||||
>>> for part in msg.walk():
|
||||
>>> print part.get_type('text/plain')
|
||||
...     print part.get_content_type()
|
||||
multipart/report
|
||||
text/plain
|
||||
message/delivery-status
|
||||
|
|
|
@ -1,10 +1,10 @@
|
|||
Ordinarily, you get a message object structure by passing a file or
|
||||
some text to a parser, which parses the text and returns the root of
|
||||
the message object structure. However you can also build a complete
|
||||
object structure from scratch, or even individual \class{Message}
|
||||
objects by hand. In fact, you can also take an existing structure and
|
||||
add new \class{Message} objects, move them around, etc. This makes a
|
||||
very convenient interface for slicing-and-dicing MIME messages.
|
||||
some text to a parser, which parses the text and returns the root
|
||||
message object. However, you can also build a complete message
|
||||
structure from scratch, or even individual \class{Message} objects by
|
||||
hand. In fact, you can also take an existing structure and add new
|
||||
\class{Message} objects, move them around, etc. This makes a very
|
||||
convenient interface for slicing-and-dicing MIME messages.
|
||||
|
||||
You can create a new object structure by creating \class{Message}
|
||||
instances, adding attachments and all the appropriate headers manually.
|
||||
|
@ -99,7 +99,7 @@ callable takes one argument, which is the \class{MIMEAudio} instance.
|
|||
It should use \method{get_payload()} and \method{set_payload()} to
|
||||
change the payload to encoded form. It should also add any
|
||||
\mailheader{Content-Transfer-Encoding} or other headers to the message
|
||||
object as necessary. The default encoding is \emph{Base64}. See the
|
||||
object as necessary. The default encoding is base64. See the
|
||||
\refmodule{email.Encoders} module for a list of the built-in encoders.
|
||||
|
||||
\var{_params} are passed straight through to the base class constructor.
|
||||
|
@ -124,7 +124,7 @@ callable takes one argument, which is the \class{MIMEImage} instance.
|
|||
It should use \method{get_payload()} and \method{set_payload()} to
|
||||
change the payload to encoded form. It should also add any
|
||||
\mailheader{Content-Transfer-Encoding} or other headers to the message
|
||||
object as necessary. The default encoding is \emph{Base64}. See the
|
||||
object as necessary. The default encoding is base64. See the
|
||||
\refmodule{email.Encoders} module for a list of the built-in encoders.
|
||||
|
||||
\var{_params} are passed straight through to the \class{MIMEBase}
|
||||
|
|
|
@ -54,7 +54,7 @@ should be performed. Normally, when things like MIME terminating
|
|||
boundaries are missing, or when messages contain other formatting
|
||||
problems, the \class{Parser} will raise a
|
||||
\exception{MessageParseError}. However, when lax parsing is enabled,
|
||||
the \class{Parser} will attempt to workaround such broken formatting
|
||||
the \class{Parser} will attempt to work around such broken formatting
|
||||
to produce a usable message structure (this doesn't mean
|
||||
\exception{MessageParseError}s are never raised; some ill-formatted
|
||||
messages just can't be parsed). The \var{strict} flag defaults to
|
||||
|
@ -73,14 +73,12 @@ support both the \method{readline()} and the \method{read()} methods
|
|||
on file-like objects.
|
||||
|
||||
The text contained in \var{fp} must be formatted as a block of \rfc{2822}
|
||||
style headers and header continuation lines, optionally preceeded by a
|
||||
style headers and header continuation lines, optionally preceded by an
|
||||
envelope header. The header block is terminated either by the
|
||||
end of the data or by a blank line. Following the header block is the
|
||||
body of the message (which may contain MIME-encoded subparts).
|
||||
|
||||
Optional \var{headersonly} is a flag specifying whether to stop
|
||||
parsing after reading the headers or not. The default is \code{False},
|
||||
meaning it parses the entire contents of the file.
|
||||
Optional \var{headersonly} is as with the \method{parse()} method.
|
||||
|
||||
\versionchanged[The \var{headersonly} flag was added]{2.2.2}
|
||||
\end{methoddesc}
|
||||
|
@ -104,7 +102,7 @@ convenience. They are available in the top-level \module{email}
|
|||
package namespace.
|
||||
|
||||
\begin{funcdesc}{message_from_string}{s\optional{, _class\optional{, strict}}}
|
||||
Return a message object tree from a string. This is exactly
|
||||
Return a message object structure from a string. This is exactly
|
||||
equivalent to \code{Parser().parsestr(s)}. Optional \var{_class} and
|
||||
\var{strict} are interpreted as with the \class{Parser} class constructor.
|
||||
|
||||
|
@ -112,9 +110,10 @@ equivalent to \code{Parser().parsestr(s)}. Optional \var{_class} and
|
|||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{message_from_file}{fp\optional{, _class\optional{, strict}}}
|
||||
Return a message object tree from an open file object. This is exactly
|
||||
equivalent to \code{Parser().parse(fp)}. Optional \var{_class} and
|
||||
\var{strict} are interpreted as with the \class{Parser} class constructor.
|
||||
Return a message object structure from an open file object. This
|
||||
is exactly equivalent to \code{Parser().parse(fp)}. Optional
|
||||
\var{_class} and \var{strict} are interpreted as with the
|
||||
\class{Parser} class constructor.
|
||||
|
||||
\versionchanged[The \var{strict} flag was added]{2.2.2}
|
||||
\end{funcdesc}
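The equivalence between the convenience function and the \class{Parser} class can be sketched as below. It assumes the Python 3 module name \module{email.parser}; the message text, addresses and boundary string are invented for the example.

```python
import email
from email.parser import Parser

raw = (
    "From: a@example.com\n"
    "Subject: demo\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XXXX"\n'
    "\n"
    "--XXXX\n"
    "Content-Type: text/plain\n"
    "\n"
    "hello\n"
    "--XXXX\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>hello</p>\n"
    "--XXXX--\n"
)

# Both calls build the same message object structure.
msg = email.message_from_string(raw)
same = Parser().parsestr(raw)

# A multipart message is a container whose payload is a list of
# subparts; walk() visits the container first, then each subpart.
types = [part.get_content_type() for part in msg.walk()]
subparts = msg.get_payload()
```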
|
||||
|
@ -138,9 +137,10 @@ Here are some notes on the parsing semantics:
|
|||
\method{get_payload()} method will return a string object.
|
||||
\item All \mimetype{multipart} type messages will be parsed as a
|
||||
container message object with a list of sub-message objects for
|
||||
their payload. These messages will return \code{True} for
|
||||
\method{is_multipart()} and their \method{get_payload()} method
|
||||
will return a list of \class{Message} instances.
|
||||
their payload. The outer container message will return
|
||||
\code{True} for \method{is_multipart()} and its
|
||||
\method{get_payload()} method will return the list of
|
||||
\class{Message} subparts.
|
||||
\item Most messages with a content type of \mimetype{message/*}
|
||||
(e.g. \mimetype{message/delivery-status} and
|
||||
\mimetype{message/rfc822}) will also be parsed as container
|
||||
|
|
|
@ -6,7 +6,7 @@ package.
|
|||
|
||||
\begin{funcdesc}{quote}{str}
|
||||
Return a new string with backslashes in \var{str} replaced by two
|
||||
backslashes and double quotes replaced by backslash-double quote.
|
||||
backslashes, and double quotes replaced by backslash-double quote.
|
||||
\end{funcdesc}
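The escaping performed by \function{quote()} can be checked with a short sketch, assuming the Python 3 module name \module{email.utils}. The sample string holds the five characters \code{a}, backslash, \code{b}, double quote and \code{c}.

```python
from email.utils import quote

original = 'a\\b"c'

# The backslash is doubled and the double quote gains a
# leading backslash.
quoted = quote(original)
```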
|
||||
|
||||
\begin{funcdesc}{unquote}{str}
|
||||
|
@ -85,7 +85,7 @@ common use.
|
|||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{formatdate}{\optional{timeval\optional{, localtime}}}
|
||||
Returns a date string as per Internet standard \rfc{2822}, e.g.:
|
||||
Returns a date string as per \rfc{2822}, e.g.:
|
||||
|
||||
\begin{verbatim}
|
||||
Fri, 09 Nov 2001 01:08:47 -0000
|
||||
|
|