
180 commits

Author SHA1 Message Date
Barry Warsaw
057b8428d0 Docstring consistency with the updated .tex files. 2002-09-30 20:07:22 +00:00
Barry Warsaw
42d1d3edc0 __contains__(): Change the second argument to `name' for consistency.
I seriously doubt this will break any deployed code.

Docstring consistency with the updated .tex files.
2002-09-30 18:17:35 +00:00
Barry Warsaw
174aa49a88 With help from Martin v. Loewis, clarification is added for the
semantics of header chunks using byte and Unicode strings.
Specifically,

append(): When the given string is a byte string, charset (whether
specified explicitly in the argument list or implicitly via the
constructor default) is the encoding of the byte string, and a
UnicodeError will be raised if the string cannot be decoded with that
charset.  If s is a Unicode string, then charset is a hint specifying
the character set of the characters in the string.  In this case, when
producing an RFC 2822 compliant header using RFC 2047 rules, the
Unicode string will be encoded using the following charsets in order:
us-ascii, the charset hint, utf-8.

__init__(): Use the global USASCII Charset instance when the charset
argument is None.  Also, clarification in the docstring.

Also, use True/False where appropriate.
2002-09-30 15:51:31 +00:00
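As a rough illustration of the append() semantics described in this commit, the sketch below uses the present-day Python 3 email.header.Header class. That class descends from this code, but treating it as equivalent is an assumption: details such as the exact charset fallback order may differ from the 2002 revision.

```python
from email.header import Header, decode_header

# Byte string: charset names the encoding of the bytes, so input that
# cannot be decoded raises a UnicodeError (UnicodeDecodeError subclass).
h = Header()
try:
    h.append(b"caf\xe9", "us-ascii")
    raised = False
except UnicodeError:
    raised = True

# Text string: charset is a hint used when producing RFC 2047 encoded words.
h2 = Header()
h2.append("caf\xe9", "iso-8859-1")
encoded = h2.encode()

# Decoding the encoded header recovers the raw bytes and the charset name.
decoded = decode_header(encoded)
```

Here the first append() fails because 0xE9 is not valid us-ascii, while the second succeeds and carries the iso-8859-1 hint into the encoded header.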
Barry Warsaw
d20b66537c The ansi_x3.4_1968 encoding is an alias for ascii, but isn't known in
Python 2.1.3.  However it's required by the email test suite, so poke
it into the encodings aliases if it's missing.  This is apparently the
approved API for doing so.

Now we can remove the hexversion shortcircuits in the test suite.
2002-09-30 15:23:17 +00:00
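The alias registration described in this commit can be sketched as follows. This is a minimal sketch against the standard encodings.aliases table, not the code from the commit itself; on modern Python the alias is already present, so setdefault() leaves the table unchanged.

```python
import codecs
import encodings.aliases

# Register the alias only if this Python does not already know it.
encodings.aliases.aliases.setdefault("ansi_x3.4_1968", "ascii")

# The codec registry now resolves the alias to the ascii codec.
codec_name = codecs.lookup("ansi_x3.4_1968").name
```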
Barry Warsaw
d63071b05f Make the tests pass under Python 2.1 but only by cheating. Python 2.1
doesn't know about the ansi-x3.4-1968 charset so skip two tests that
rely on that (msg_32.txt and msg_33.txt).
2002-09-28 21:22:52 +00:00
Barry Warsaw
eecdc742f5 Add a test for SHORTEST encoding of utf-8 headers, and also update
some of the test values which change because of this.
2002-09-28 21:04:19 +00:00
Barry Warsaw
c202d93e0e Use True/False everywhere, and other code cleanups. 2002-09-28 21:02:51 +00:00
Barry Warsaw
f776e6922c Code cleanup and add docstrings. 2002-09-28 20:52:26 +00:00
Barry Warsaw
5bdb2bee37 Use True/False everywhere, and other code cleanups. 2002-09-28 20:49:57 +00:00
Barry Warsaw
e03e8f09eb Use True/False everywhere. 2002-09-28 20:44:58 +00:00
Barry Warsaw
4ece778bbc is_multipart(): Use isinstance() instead of type equality. 2002-09-28 20:41:39 +00:00
Barry Warsaw
c494549566 Docstring and code cleanups, e.g. use True/False everywhere. 2002-09-28 20:40:25 +00:00
Barry Warsaw
bba6b0243e __init__(): Minor code cleanup. 2002-09-28 20:27:28 +00:00
Barry Warsaw
5f253279d6 Add a pychecker suppression. 2002-09-28 20:25:15 +00:00
Barry Warsaw
56835dd961 Use True/False everywhere. 2002-09-28 18:04:55 +00:00
Barry Warsaw
5932c9bedd Added a feature suggested by Martin v Loewis, where a new header
encoding flag SHORTEST means to return the shortest encoding between
base64 and qp.  This is used for the header_enc for utf-8.  SHORTEST
isn't legal for body_enc.

Also some code cleanup:

- use True/False everywhere
- use == instead of `is' in a few places
- added _unicode() and make consistent the "is unicode" checks
- update docstrings
2002-09-28 17:47:56 +00:00
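A small sketch of what SHORTEST means in practice, using the modern email.charset module (assumed here to keep the behaviour this commit introduced): the utf-8 charset picks quoted-printable or base64 for a header depending on which is shorter.

```python
from email.charset import BASE64, SHORTEST, Charset

utf8 = Charset("utf-8")
header_enc = utf8.header_encoding   # SHORTEST for utf-8
body_enc = utf8.body_encoding       # a concrete encoding, never SHORTEST

# Mostly-ASCII text is shorter as quoted-printable ("q"), while
# mostly non-ASCII text is shorter as base64 ("b").
mostly_ascii = utf8.header_encode("hello")
mostly_non_ascii = utf8.header_encode("\u00e9" * 5)
```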
Barry Warsaw
09f7424f3a test_unicode_error(): Comment this test out, since we still have
controversy.
2002-09-26 17:21:53 +00:00
Barry Warsaw
9c74569ec9 Fixing some RFC 2231 related issues as reported in the Spambayes
project, and with assistance from Oleg Broytmann.  Specifically,
added some new tests to make sure we handle RFC 2231 encoded
parameters correctly.  Two new data files were added which contain RFC
2231 encoded parameters.
2002-09-26 17:21:02 +00:00
Barry Warsaw
15aefa94d0 Fixing some RFC 2231 related issues as reported in the Spambayes
project, and with assistance from Oleg Broytmann.  Specifically,

get_param(), get_params(): Document that these methods may return
parameter values that are either strings, or 3-tuples in the case of
RFC 2231 encoded parameters.  The application should be prepared to
deal with such return values.

get_boundary(): Be prepared to deal with RFC 2231 encoded boundary
parameters.  It makes little sense to have boundaries that are
anything but ascii, so if we get back a 3-tuple from get_param() we
will decode it into ascii and let any failures percolate up.

get_content_charset(): New method which treats the charset parameter
just like the boundary parameter in get_boundary().  Note that
"get_charset()" was already taken to return the default Charset
object.

get_charsets(): Rewrite to use get_content_charset().
2002-09-26 17:19:34 +00:00
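To make the "strings or 3-tuples" return values concrete, here is a hedged sketch using the current Python 3 email package. The sample header is invented for illustration, and collapse_rfc2231_value() is a helper from later versions of the package, not part of this commit.

```python
import email
from email.utils import collapse_rfc2231_value

# Hypothetical message with one RFC 2231 encoded parameter (title*).
raw = (
    "Content-Type: text/plain; charset=\"us-ascii\"; "
    "title*=us-ascii'en'This%20is%20it\n"
    "\n"
    "body\n"
)
msg = email.message_from_string(raw)

# RFC 2231 encoded parameters come back as (charset, language, value).
title = msg.get_param("title")
if isinstance(title, tuple):
    title_text = collapse_rfc2231_value(title)
else:
    title_text = title

# get_content_charset() returns the charset parameter as a plain string.
charset = msg.get_content_charset()
```

Application code that calls get_param() should include the isinstance() check shown here, since ordinary parameters still come back as plain strings.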
Barry Warsaw
6f30a8ab62 __version__: Bump to 2.4
Move the imports of Parser and Message inside the
message_from_string() and message_from_file() functions.  This way
just "import email" won't suck in most of the submodules of the
package.

Note: this will break code that relied on "import email" giving you a
bunch of the submodules, but that was never documented and should not
have been relied on.
2002-09-25 22:07:50 +00:00
Barry Warsaw
40363b63f0 Open the test files in binary mode so the \r\n files won't cause
failures on Windows.  Closes SF bug # 609988.
2002-09-18 22:17:57 +00:00
Barry Warsaw
78170048f9 Bump to 2.3.1 to pick up the missing file. 2002-09-12 03:44:50 +00:00
Barry Warsaw
fbcde75c70 get_payload(): Document that calling it with no arguments returns a
reference to the payload.
2002-09-11 14:11:35 +00:00
Barry Warsaw
bc6edac8df test_utils_quote_unquote(): Test for unquote() properly
de-backslash-ifying.
2002-09-11 02:31:24 +00:00
Barry Warsaw
184d55a897 rfc822.unquote() doesn't properly de-backslash-ify in Python prior to
2.3.  This patch (adapted from Quinn Dunkan's SF patch #573204) fixes
the problem and should get ported to rfc822.py.
2002-09-11 02:22:48 +00:00
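The de-backslash-ifying behaviour can be checked with a round trip through email.utils.quote() and unquote(). This sketch assumes the modern Python 3 functions, which carry the fix described in this commit.

```python
from email.utils import quote, unquote

original = 'say "hi" to C:\\temp'

# quote() backslash-escapes backslashes and double quotes; the
# surrounding quotes are added by the caller.
quoted = '"' + quote(original) + '"'

# unquote() strips the surrounding quotes and removes the escapes again.
roundtrip = unquote(quoted)
```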
Barry Warsaw
034b47acfe _parsebody(): Instead of raising a BoundaryError when no start
boundary could be found -- in a lax parser -- the entire body is
assigned to the message payload.
2002-09-10 16:14:56 +00:00
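The fallback described in this commit can be observed in the modern parser. The sketch rests on an assumption: current Python has no separate strict and lax parsers, and instead records a defect on the message while still assigning the body to the payload. The sample message is invented for illustration.

```python
import email
from email import errors

# A message that claims to be multipart but never uses its boundary.
raw = (
    'Content-Type: multipart/mixed; boundary="XXX"\n'
    "\n"
    "There is no boundary line anywhere in this body.\n"
)
msg = email.message_from_string(raw)

# The whole body ends up as a string payload instead of a list of parts.
payload = msg.get_payload()
missing_start = any(
    isinstance(d, errors.StartBoundaryNotFoundDefect) for d in msg.defects
)
```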
Barry Warsaw
b1c1de3805 Import _isstring() from the compatibility layer.
_handle_text(): Use _isstring() for stringiness test.

_handle_multipart(): Add a test before the ListType test, checking for
stringiness of the payload.  String payloads for multitypes means a
message with broken MIME chrome was parsed by a lax parser.  Instead
of raising a BoundaryError in those cases, the entire body is assigned
to the message payload (but since the content type is still
multipart/*, the Generator needs to be updated too).
2002-09-10 16:13:45 +00:00
Barry Warsaw
356afac41f _isstring(): Factor out "stringiness" test, e.g. for StringType or
UnicodeType, which is different between Python 2.1 and 2.2.
2002-09-10 16:09:06 +00:00
Barry Warsaw
45d9bde6c1 _ascii_split(): Don't lstrip continuation lines. Closes SF bug #601392. 2002-09-10 15:57:29 +00:00
Barry Warsaw
24d45df3f2 test_splitting_first_line_only_is_long(): New test for SF bug #601392,
broken wrapping of long ASCII headers.
2002-09-10 15:46:44 +00:00
Barry Warsaw
dad90c202a A sample message with broken MIME boundaries. 2002-09-10 15:43:30 +00:00
Barry Warsaw
e99e2f53e7 test_set_param(), test_del_param(): Test RFC 2231 encoding support by
Oleg Broytmann in SF patch #600096.  Whitespace normalized by Barry.
2002-09-06 03:56:26 +00:00
Barry Warsaw
3c25535dc8 _formatparam(), set_param(): RFC 2231 encoding support by Oleg
Broytmann in SF patch #600096.  Specifically, the former function now
encodes the triplets, while the latter adds optional charset and
language arguments.
2002-09-06 03:55:04 +00:00
Barry Warsaw
470288c54e test_mondo_message(): "binary" is not a legal content type, so with
the previous RFC 2045, section 5.2 repair to get_content_type() this
subpart's type will now be text/plain.
2002-09-06 03:41:27 +00:00
Barry Warsaw
58fb61cce5 test_replace_header(): New test for Message.replace_header(). 2002-09-06 03:39:59 +00:00
Barry Warsaw
229727fa07 replace_header(): New method given by Skip Montanaro in SF patch
#601959.  Modified slightly by Barry (who added the KeyError in case
the header is missing).
2002-09-06 03:38:12 +00:00
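A brief usage sketch of replace_header(), assuming the method behaves in current Python as this commit describes: an existing header is replaced in place, and a missing header raises KeyError.

```python
from email.message import Message

msg = Message()
msg["Subject"] = "first"

# Replaces the existing header rather than appending a second one.
msg.replace_header("Subject", "second")

try:
    msg.replace_header("X-Missing", "value")
    missing_raised = False
except KeyError:
    missing_raised = True
```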
Barry Warsaw
a4ce1cf34c _structure(): Use .get_content_type() 2002-09-01 21:04:43 +00:00
Barry Warsaw
1a1607546c Whitespace normalization. 2002-08-27 22:38:50 +00:00
Barry Warsaw
48b0d36b4d Typo 2002-08-27 22:34:44 +00:00
Tim Peters
280488b9a3 Whitespace normalization. 2002-08-23 18:19:30 +00:00
Barry Warsaw
4d5ef6aed6 Bump version number to 2.3 2002-08-20 14:51:34 +00:00
Barry Warsaw
3328136e3c Added tests for SF patch #597593, syntactically invalid Content-Type: headers. 2002-08-20 14:51:10 +00:00
Barry Warsaw
f36d804b3b get_content_type(), get_content_maintype(), get_content_subtype(): RFC
2045, section 5.2 states that if the Content-Type: header is
syntactically invalid, the default type should be text/plain.
Implement minimal sanity checking of the header -- it must have
exactly one slash in it.  This closes SF patch #597593 by Skip, but in
a different way.

Note that these methods used to raise ValueError for invalid ctypes,
but now they won't.
2002-08-20 14:50:09 +00:00
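The RFC 2045, section 5.2 default can be demonstrated with a syntactically invalid Content-Type: header. This is a sketch against the modern email package, which is assumed to retain the "exactly one slash" check from this commit.

```python
import email

# "binary" has no slash, so it is not a valid type/subtype pair.
msg = email.message_from_string("Content-Type: binary\n\ndata\n")

ctype = msg.get_content_type()
maintype = msg.get_content_maintype()
subtype = msg.get_content_subtype()
```

No exception is raised for the invalid header; all three methods fall back to the text/plain default.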
Barry Warsaw
dfea3b3963 _dispatch(): Use get_content_maintype() and get_content_subtype() to
get the MIME main and sub types, instead of getting the whole ctype
and splitting it here.   The two more specific methods now correctly
implement RFC 2045, section 5.2.
2002-08-20 14:47:30 +00:00
Barry Warsaw
b404bb7813 test_three_lines(): Test case reported by Andrew McNamara. Works in
email 2.2 but fails in email 1.0.
2002-08-20 12:54:07 +00:00
Barry Warsaw
9e4e050c59 Use full package paths in imports. 2002-07-23 20:35:58 +00:00
Barry Warsaw
10d0d595e0 Added a couple of more tests for Header charset handling. 2002-07-23 19:46:35 +00:00
Barry Warsaw
04f357cffe Get rid of relative imports in all unittests. Now anything that
imports e.g. test_support must do so using an absolute package name
such as "import test.test_support" or "from test import test_support".

This also updates the README in Lib/test, and gets rid of the
duplicate data directory in Lib/test/data (replaced by
Lib/email/test/data).

Now Tim and Jack can have at it. :)
2002-07-23 19:04:11 +00:00
Barry Warsaw
92825a9a52 append(): Bite the bullet and let charset be the string name of a
character set, which we'll convert to a Charset instance.  Sigh.
2002-07-23 06:08:10 +00:00
Barry Warsaw
15d3739446 make_header(): Watch out for charset is None, which decode_header()
will return as the charset if implicit us-ascii is used.
2002-07-23 04:29:54 +00:00