Commit graph

127 commits

Author SHA1 Message Date
Walter Dörwald
007f8dfde2 Bug #1245379: Add "unicode-1-1-utf-7" as an alias for "utf-7" as specified
by RFC 1642.
2005-10-09 19:42:27 +00:00
Neal Norwitz
4ce69a5b06 No need to import exceptions, they are builtins 2005-09-01 00:45:28 +00:00
Martin v. Löwis
8b59514e57 Make IDNA return an empty string when the input is empty. Fixes #1163178.
Will backport to 2.4.
2005-08-25 11:03:38 +00:00
Walter Dörwald
729c31f5c3 Reset internal buffers when seek() is called. This fixes SF bug #1156259. 2005-03-14 19:06:30 +00:00
Walter Dörwald
e1a0391b49 Fix wrong variable name. 2004-12-29 13:11:10 +00:00
Marc-André Lemburg
9ab8818c87 Rearranged mappings to value sorting order. 2004-12-10 21:54:35 +00:00
Walter Dörwald
69652035bc SF patch #998993: The UTF-8 and the UTF-16 stateful decoders now support
decoding incomplete input (when the input stream is temporarily exhausted).
codecs.StreamReader now implements buffering, which enables proper
readline support for the UTF-16 decoders. codecs.StreamReader.read()
has a new argument chars which specifies the number of characters to
return. codecs.StreamReader.readline() and codecs.StreamReader.readlines()
have a new argument keepends. Trailing "\n"s will be stripped from the lines
if keepends is false. Added C APIs PyUnicode_DecodeUTF8Stateful and
PyUnicode_DecodeUTF16Stateful.
2004-09-07 20:24:22 +00:00
Tim Peters
d1b7827216 Whitespace normalization. 2004-08-07 06:03:09 +00:00
Marc-André Lemburg
c759f070ef Added new codecs and aliases for ISO_8859-11, ISO_8859-16 and
TIS-620.

Closes SF bug #1001895: Adding missing ISO 8859 codecs, especially Thai.
2004-08-05 12:43:30 +00:00
Tim Peters
c0cbc8611b Whitespace normalization. 2004-07-31 21:17:37 +00:00
Marc-André Lemburg
17b6d28c64 New codec: [ 996067 ] hp-roman8 codec 2004-07-28 15:37:54 +00:00
Marc-André Lemburg
cd8a4cb3d3 Added new codec hp-roman8 submitted as patch [ 996067 ] hp-roman8 codec. 2004-07-28 15:35:29 +00:00
Hye-Shik Chang
2bb146f2f4 Bring CJKCodecs 1.1 into trunk. This completely reorganizes source
and installed layouts to make maintenance simple and easy.  And it
also adds four new codecs; big5hkscs, euc-jis-2004, shift-jis-2004
and iso2022-jp-2004.
2004-07-18 03:06:29 +00:00
Tim Peters
4e0e1b6a54 Whitespace normalization. 2004-07-07 20:54:48 +00:00
Martin v. Löwis
708b4dacf4 Convert input to a string object. Fixes #909230.
Backported to 2.3.
2004-03-23 23:40:36 +00:00
Hye-Shik Chang
5c5316f111 Add a new unicode codec: ptcp154 (Kazakh) 2004-03-19 08:06:07 +00:00
Marc-André Lemburg
361d66de5d Fix wrong character mapping in koi8_u: SF bug #902501. 2004-02-23 09:00:43 +00:00
Marc-André Lemburg
c83dddf7fe Let the default encodings search function look up aliases before trying the codec import. This allows applications to install codecs which override (non-special-cased) builtin codecs. 2004-01-20 09:40:14 +00:00
Marc-André Lemburg
5c94d33077 Add some more code page aliases needed for completeness. 2004-01-20 09:38:52 +00:00
Hye-Shik Chang
b619e4b36c Fix a typo: s/iso_3022/iso2022/ 2004-01-20 09:33:30 +00:00
Hye-Shik Chang
3e2a306920 Add CJK codecs support as discussed on python-dev. (SF #873597)
Several style fixes are suggested by Martin v. Loewis and
Marc-Andre Lemburg. Thanks!
2004-01-17 14:29:29 +00:00
Raymond Hettinger
0ad142aba0 Revert previous change. MAL preferred the old version. 2003-12-01 13:26:46 +00:00
Raymond Hettinger
a45517065a Simplified the code. 2003-12-01 10:41:02 +00:00
Raymond Hettinger
9edae346dd Fix typo in the comments. 2003-09-24 03:57:36 +00:00
Raymond Hettinger
9a80c5dbc4 Added codec for bz2 compression. 2003-09-23 20:21:01 +00:00
Martin v. Löwis
0d8e16c7ad Support trailing dots in DNS names. Fixes #782510. Will backport to 2.3. 2003-08-05 06:19:47 +00:00
Skip Montanaro
5d6ceb4aae more generic reference to python interpreter 2003-07-22 14:37:42 +00:00
Marc-André Lemburg
2820125935 Remove usage of re module from encodings package search function. 2003-05-16 17:07:51 +00:00
Tim Peters
0eadaac7dc Whitespace normalization. 2003-04-24 16:02:54 +00:00
Martin v. Löwis
2548c730c1 Implement IDNA (Internationalized Domain Names in Applications). 2003-04-18 10:39:54 +00:00
Martin v. Löwis
7fb697b5d2 Revert Patch #670715: iconv support. 2003-04-03 04:49:12 +00:00
Neal Norwitz
6156a2d07c Handle iconv initialization errors 2003-02-28 20:00:42 +00:00
Martin v. Löwis
9789aefa61 Patch #670715: Universal Unicode Codec for POSIX iconv. 2003-01-26 11:30:36 +00:00
Tim Peters
6578dc925f Whitespace normalization. 2002-12-24 18:31:27 +00:00
Neal Norwitz
d8407a7031 Add new encoding for Ukrainian Cyrillic 2002-10-17 22:15:33 +00:00
Guido van Rossum
c8c6065231 When looking for an alias, first look for the normalized name (which
still may contain dots), then if that doesn't exist look for the name
with dots replaced by underscores.  This is a little more forgiving.
2002-10-04 20:49:05 +00:00
Marc-André Lemburg
8dc5ff2e5a Undo the removal. Guido mentioned that the encoding name is in active
use by some email headers.
2002-10-04 16:30:42 +00:00
Marc-André Lemburg
68fc27385d Remove unneeded alias. 2002-10-04 15:57:03 +00:00
Marc-André Lemburg
a40ea75625 Fix doc-string. 2002-10-04 11:58:24 +00:00
Marc-André Lemburg
9d158bb66f Adapt lookup names to new more general encoding name normalization
scheme.
2002-10-04 11:51:39 +00:00
Marc-André Lemburg
7012673d67 Extending the encoding name normalization to handle more non-alphanumeric
characters.
2002-10-04 11:45:38 +00:00
Guido van Rossum
479f3d3d2a Oops, must convert hyphens to underscores in keys of aliases dict. 2002-09-26 20:08:23 +00:00
Guido van Rossum
b7a88e533d Add yet another alias for ASCII found in the field. Will backport to
2.2.2.
2002-09-25 16:44:34 +00:00
Tim Peters
280488b9a3 Whitespace normalization. 2002-08-23 18:19:30 +00:00
Martin v. Löwis
8a8da798a5 Patch #505705: Remove eval in pickle and cPickle. 2002-08-14 07:46:28 +00:00
Tim Peters
469cdad822 Whitespace normalization. 2002-08-08 20:19:19 +00:00
Martin v. Löwis
b9e0764d8b Revert #571603 since it is ok to import codecs that are not subdirectories
of encodings. Skip modules that don't have a getregentry function.
2002-07-29 14:05:24 +00:00
Martin v. Löwis
fc4c24c142 Patch #571603: Refer to encodings package explicitly. 2002-07-28 11:31:33 +00:00
Marc-André Lemburg
a83ffa89f2 Palm OS encoding from Sjoerd Mullender 2002-07-12 14:36:22 +00:00
Marc-André Lemburg
3ccb09cba3 Fix for bug #222395: UTF-16 et al. don't handle .readline().
They now raise a NotImplementedError to hint at the truth ;-)
2002-04-05 12:12:00 +00:00