cpython/Lib/encodings/__init__.py
Guido van Rossum d59da4b432 Merged revisions 55407-55513 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/p3yk

................
  r55413 | fred.drake | 2007-05-17 12:30:10 -0700 (Thu, 17 May 2007) | 1 line

  fix argument name in documentation; match the implementation
................
  r55430 | jack.diederich | 2007-05-18 06:39:59 -0700 (Fri, 18 May 2007) | 1 line

  Implements class decorators, PEP 3129.
................
  r55432 | guido.van.rossum | 2007-05-18 08:09:41 -0700 (Fri, 18 May 2007) | 2 lines

  obsubmit.
................
  r55434 | guido.van.rossum | 2007-05-18 09:39:10 -0700 (Fri, 18 May 2007) | 3 lines

  Fix bug in test_inspect.  (I presume this is how it should be fixed;
  Jack Diederich, please verify.)
................
  r55460 | brett.cannon | 2007-05-20 00:31:57 -0700 (Sun, 20 May 2007) | 4 lines

  Remove the imageop module.  With imgfile already removed in Python 3.0 and
  rgbimg gone in Python 2.6 the unit tests themselves were made worthless.  Plus
  third-party libraries perform the same function much better.
................
  r55469 | neal.norwitz | 2007-05-20 11:28:20 -0700 (Sun, 20 May 2007) | 118 lines

  Merged revisions 55324-55467 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r55348 | georg.brandl | 2007-05-15 13:19:34 -0700 (Tue, 15 May 2007) | 4 lines

    HTML-escape the plain traceback in cgitb's HTML output, to prevent
    the traceback inadvertently or maliciously closing the comment and
    injecting HTML into the error page.
  ........
    r55372 | neal.norwitz | 2007-05-15 21:33:50 -0700 (Tue, 15 May 2007) | 6 lines

    Port rev 55353 from Guido:
    Add what looks like a necessary call to PyErr_NoMemory() when PyMem_MALLOC()
    fails.

    Will backport.
  ........
    r55377 | neal.norwitz | 2007-05-15 22:06:33 -0700 (Tue, 15 May 2007) | 1 line

    Mention removal of some directories for obsolete platforms
  ........
    r55380 | brett.cannon | 2007-05-15 22:50:03 -0700 (Tue, 15 May 2007) | 2 lines

    Change the maintainer of the BeOS port.
  ........
    r55383 | georg.brandl | 2007-05-16 06:44:18 -0700 (Wed, 16 May 2007) | 2 lines

    Bug #1719995: don't use deprecated method in sets example.
  ........
    r55386 | neal.norwitz | 2007-05-16 13:05:11 -0700 (Wed, 16 May 2007) | 5 lines

    Fix bug in marshal where bad data would cause a segfault due to
    lack of an infinite recursion check.

    Contributed by Damien Miller at Google.
  ........
    r55389 | brett.cannon | 2007-05-16 15:42:29 -0700 (Wed, 16 May 2007) | 6 lines

    Remove the gopherlib module.  It has been raising a DeprecationWarning since
    Python 2.5.

    Also remove gopher support from urllib/urllib2.  As both imported gopherlib the
    usage of the support would have raised a DeprecationWarning.
  ........
    r55394 | raymond.hettinger | 2007-05-16 18:08:04 -0700 (Wed, 16 May 2007) | 1 line

    calendar.py gets no benefit from xrange() instead of range()
  ........
    r55395 | brett.cannon | 2007-05-16 19:02:56 -0700 (Wed, 16 May 2007) | 3 lines

    Complete deprecation of BaseException.message.  Some subclasses were directly
    accessing the message attribute instead of using the descriptor.
  ........
    r55396 | neal.norwitz | 2007-05-16 23:11:36 -0700 (Wed, 16 May 2007) | 4 lines

    Reduce the max stack depth to see if this fixes the segfaults on
    Windows and some other boxes.  If this is successful, this rev should
    be backported.  I'm not sure how close to the limit we should push this.
  ........
    r55397 | neal.norwitz | 2007-05-16 23:23:50 -0700 (Wed, 16 May 2007) | 4 lines

    Set the depth to something very small to try to determine if the
    crashes on Windows are really due to the stack size or possibly
    some other problem.
  ........
    r55398 | neal.norwitz | 2007-05-17 00:04:46 -0700 (Thu, 17 May 2007) | 4 lines

    Last try for tweaking the max stack depth.  5000 was the original value,
    4000 didn't work either.  1000 does work on Windows.  If 2000 works,
    that will hopefully be a reasonable balance.
  ........
    r55412 | fred.drake | 2007-05-17 12:29:58 -0700 (Thu, 17 May 2007) | 1 line

    fix argument name in documentation; match the implementation
  ........
    r55427 | neal.norwitz | 2007-05-17 22:47:16 -0700 (Thu, 17 May 2007) | 1 line

    Verify neither dumps or loads overflow the stack and segfault.
  ........
    r55446 | collin.winter | 2007-05-18 16:11:24 -0700 (Fri, 18 May 2007) | 1 line

    Backport PEP 3110's new 'except' syntax to 2.6.
  ........
    r55448 | raymond.hettinger | 2007-05-18 18:11:16 -0700 (Fri, 18 May 2007) | 1 line

    Improvements to NamedTuple's implementation, tests, and documentation
  ........
    r55449 | raymond.hettinger | 2007-05-18 18:50:11 -0700 (Fri, 18 May 2007) | 1 line

    Fix beginner mistake -- don't mix spaces and tabs.
  ........
    r55450 | neal.norwitz | 2007-05-18 20:48:47 -0700 (Fri, 18 May 2007) | 1 line

    Clear data so random memory does not get freed.  Will backport.
  ........
    r55452 | neal.norwitz | 2007-05-18 21:34:55 -0700 (Fri, 18 May 2007) | 3 lines

    Whoops, need to pay attention to those test failures.
    Move the clear to *before* the first use, not after.
  ........
    r55453 | neal.norwitz | 2007-05-18 21:35:52 -0700 (Fri, 18 May 2007) | 1 line

    Give some clue as to what happened if the test fails.
  ........
    r55455 | georg.brandl | 2007-05-19 11:09:26 -0700 (Sat, 19 May 2007) | 2 lines

    Fix docstring for add_package in site.py.
  ........
    r55458 | brett.cannon | 2007-05-20 00:09:50 -0700 (Sun, 20 May 2007) | 2 lines

    Remove the rgbimg module.  It has been deprecated since Python 2.5.
  ........
    r55465 | nick.coghlan | 2007-05-20 04:12:49 -0700 (Sun, 20 May 2007) | 1 line

    Fix typo in example (should be backported, but my maintenance branch is woefully out of date)
  ........
................
  r55472 | brett.cannon | 2007-05-20 12:06:18 -0700 (Sun, 20 May 2007) | 2 lines

  Remove imageop from the Windows build process.
................
  r55486 | neal.norwitz | 2007-05-20 23:59:52 -0700 (Sun, 20 May 2007) | 1 line

  Remove callable() builtin
................
  r55506 | neal.norwitz | 2007-05-22 00:43:29 -0700 (Tue, 22 May 2007) | 78 lines

  Merged revisions 55468-55505 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r55468 | neal.norwitz | 2007-05-20 11:06:27 -0700 (Sun, 20 May 2007) | 1 line

    rotor is long gone.
  ........
    r55470 | neal.norwitz | 2007-05-20 11:43:00 -0700 (Sun, 20 May 2007) | 1 line

    Update directories/files at the top-level.
  ........
    r55471 | brett.cannon | 2007-05-20 12:05:06 -0700 (Sun, 20 May 2007) | 2 lines

    Try to remove rgbimg from Windows builds.
  ........
    r55474 | brett.cannon | 2007-05-20 16:17:38 -0700 (Sun, 20 May 2007) | 4 lines

    Remove the macfs module.  This led to the deprecation of macostools.touched();
    it completely relied on macfs and is a no-op on OS X according to code
    comments.
  ........
    r55476 | brett.cannon | 2007-05-20 16:56:18 -0700 (Sun, 20 May 2007) | 3 lines

    Move imgfile import to the global namespace to trigger an import error ASAP to
    prevent creation of a test file.
  ........
    r55477 | brett.cannon | 2007-05-20 16:57:38 -0700 (Sun, 20 May 2007) | 3 lines

    Cause posixfile to raise a DeprecationWarning.  Documented as deprecated since
    Python 1.5.
  ........
    r55479 | andrew.kuchling | 2007-05-20 17:03:15 -0700 (Sun, 20 May 2007) | 1 line

    Note removed modules
  ........
    r55481 | martin.v.loewis | 2007-05-20 21:35:47 -0700 (Sun, 20 May 2007) | 2 lines

    Add Alexandre Vassalotti.
  ........
    r55482 | george.yoshida | 2007-05-20 21:41:21 -0700 (Sun, 20 May 2007) | 4 lines

    fix against r55474 [Remove the macfs module]

    Remove "libmacfs.tex" from Makefile.deps and mac/mac.tex.
  ........
    r55487 | raymond.hettinger | 2007-05-21 01:13:35 -0700 (Mon, 21 May 2007) | 1 line

    Replace assertion with straight error-checking.
  ........
    r55489 | raymond.hettinger | 2007-05-21 09:40:10 -0700 (Mon, 21 May 2007) | 1 line

    Allow all alphanumeric and underscores in type and field names.
  ........
    r55490 | facundo.batista | 2007-05-21 10:32:32 -0700 (Mon, 21 May 2007) | 5 lines


    Added timeout support to HTTPSConnection, through the
    socket.create_connection function. Also added a small
    test for this, and updated NEWS file.
  ........
    r55495 | georg.brandl | 2007-05-21 13:34:16 -0700 (Mon, 21 May 2007) | 2 lines

    Patch #1686487: you can now pass any mapping after '**' in function calls.
  ........
    r55502 | neal.norwitz | 2007-05-21 23:03:36 -0700 (Mon, 21 May 2007) | 1 line

    Document new params to HTTPSConnection
  ........
    r55504 | neal.norwitz | 2007-05-22 00:16:10 -0700 (Tue, 22 May 2007) | 1 line

    Stop using METH_OLDARGS
  ........
    r55505 | neal.norwitz | 2007-05-22 00:16:44 -0700 (Tue, 22 May 2007) | 1 line

    Stop using METH_OLDARGS implicitly
  ........
................
2007-05-22 18:11:13 +00:00

156 lines
5.5 KiB
Python

""" Standard "encodings" Package

    Standard Python encoding modules are stored in this package
    directory.

    Codec modules must have names corresponding to normalized encoding
    names as defined in the normalize_encoding() function below, e.g.
    'utf-8' must be implemented by the module 'utf_8.py'.

    Each codec module must export the following interface:

    * getregentry() -> codecs.CodecInfo object
    The getregentry() API must return a CodecInfo object with encoder,
    decoder, incrementalencoder, incrementaldecoder, streamwriter and
    streamreader attributes which adhere to the Python Codec Interface
    Standard.

    In addition, a module may optionally also define the following
    APIs which are then used by the package's codec search function:

    * getaliases() -> sequence of encoding name strings to use as aliases

    Alias names returned by getaliases() must be normalized encoding
    names as defined by normalize_encoding().

Written by Marc-Andre Lemburg (mal@lemburg.com).

(c) Copyright CNRI, All Rights Reserved. NO WARRANTY.

"""#"

import codecs
from . import aliases

_cache = {}
_unknown = '--unknown--'
_import_tail = ['*']
_norm_encoding_map = ('                                              . '
                      '0123456789       ABCDEFGHIJKLMNOPQRSTUVWXYZ     '
                      ' abcdefghijklmnopqrstuvwxyz                     '
                      '                                                '
                      '                                                '
                      '                ')
_aliases = aliases.aliases

class CodecRegistryError(LookupError, SystemError):
    pass

def normalize_encoding(encoding):

    """ Normalize an encoding name.

        Normalization works as follows: all non-alphanumeric
        characters except the dot used for Python package names are
        collapsed and replaced with a single underscore, e.g. ' -;#'
        becomes '_'. Leading and trailing underscores are removed.

        Note that encoding names should be ASCII only; if they do use
        non-ASCII characters, these must be Latin-1 compatible.

    """
    # Make sure we have an 8-bit string, because .translate() works
    # differently for Unicode strings.
    if isinstance(encoding, str):
        # Note that .encode('latin-1') does *not* use the codec
        # registry, so this call doesn't recurse. (See unicodeobject.c
        # PyUnicode_AsEncodedString() for details)
        encoding = encoding.encode('latin-1')
    return '_'.join(encoding.translate(_norm_encoding_map).split())

def search_function(encoding):

    # Cache lookup
    entry = _cache.get(encoding, _unknown)
    if entry is not _unknown:
        return entry

    # Import the module:
    #
    # First try to find an alias for the normalized encoding
    # name and lookup the module using the aliased name, then try to
    # lookup the module using the standard import scheme, i.e. first
    # try in the encodings package, then at top-level.
    #
    norm_encoding = normalize_encoding(encoding)
    aliased_encoding = _aliases.get(norm_encoding) or \
                       _aliases.get(norm_encoding.replace('.', '_'))
    if aliased_encoding is not None:
        modnames = [aliased_encoding,
                    norm_encoding]
    else:
        modnames = [norm_encoding]
    for modname in modnames:
        if not modname or '.' in modname:
            continue
        try:
            # Import is absolute to prevent the possibly malicious import of a
            # module with side-effects that is not in the 'encodings' package.
            mod = __import__('encodings.' + modname, fromlist=_import_tail,
                             level=0)
        except ImportError:
            pass
        else:
            break
    else:
        mod = None

    try:
        getregentry = mod.getregentry
    except AttributeError:
        # Not a codec module
        mod = None

    if mod is None:
        # Cache misses
        _cache[encoding] = None
        return None

    # Now ask the module for the registry entry
    entry = getregentry()
    if not isinstance(entry, codecs.CodecInfo):
        if not 4 <= len(entry) <= 7:
            raise CodecRegistryError,\
                'module "%s" (%s) failed to register' % \
                (mod.__name__, mod.__file__)
        if not hasattr(entry[0], '__call__') or \
           not hasattr(entry[1], '__call__') or \
           (entry[2] is not None and not hasattr(entry[2], '__call__')) or \
           (entry[3] is not None and not hasattr(entry[3], '__call__')) or \
           (len(entry) > 4 and entry[4] is not None and not hasattr(entry[4], '__call__')) or \
           (len(entry) > 5 and entry[5] is not None and not hasattr(entry[5], '__call__')):
            raise CodecRegistryError,\
                'incompatible codecs in module "%s" (%s)' % \
                (mod.__name__, mod.__file__)
        if len(entry)<7 or entry[6] is None:
            entry += (None,)*(6-len(entry)) + (mod.__name__.split(".", 1)[1],)
        entry = codecs.CodecInfo(*entry)

    # Cache the codec registry entry
    _cache[encoding] = entry

    # Register its aliases (without overwriting previously registered
    # aliases)
    try:
        codecaliases = mod.getaliases()
    except AttributeError:
        pass
    else:
        for alias in codecaliases:
            if alias not in _aliases:
                _aliases[alias] = modname

    # Return the registry entry
    return entry

# Register the search_function in the Python codec registry
codecs.register(search_function)
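
For illustration, the search-function protocol used by this module can be sketched in present-day Python 3. This is a minimal sketch under stated assumptions, not the package's implementation: `normalize_name`, `demo_search`, and the codec name `x-demo-reversed` are hypothetical names invented for the example, and the normalization is written with plain `str` operations instead of the 256-character translation table. The only library calls it relies on are `codecs.register()`, `codecs.lookup()`, and `codecs.CodecInfo`.

```python
import codecs
import string

# Characters that survive normalization unchanged (ASCII alphanumerics and '.').
_KEEP = set(string.ascii_letters + string.digits + '.')


def normalize_name(name):
    """Collapse each run of other characters into a single underscore.

    Hypothetical helper that mirrors the rule described in the
    normalize_encoding() docstring; leading and trailing runs are dropped.
    """
    cleaned = ''.join(ch if ch in _KEEP else ' ' for ch in name)
    return '_'.join(cleaned.split())


def _encode(text, errors='strict'):
    # A codec encoder returns (output bytes, number of input characters consumed).
    return text[::-1].encode('utf-8', errors), len(text)


def _decode(data, errors='strict'):
    # A codec decoder returns (output text, number of input bytes consumed).
    raw = bytes(data)
    return raw.decode('utf-8', errors)[::-1], len(raw)


_demo_cache = {}


def demo_search(encoding):
    """Return a CodecInfo for the demo codec, or None for any other name."""
    if encoding in _demo_cache:
        return _demo_cache[encoding]
    entry = None
    if normalize_name(encoding) == 'x_demo_reversed':
        entry = codecs.CodecInfo(_encode, _decode, name='x-demo-reversed')
    # Cache misses as well as hits, as search_function() does with _cache.
    _demo_cache[encoding] = entry
    return entry


# Search functions are consulted in registration order; returning None
# lets the registry move on to the next one.
codecs.register(demo_search)
```

With the function registered, `'abc'.encode('x-demo-reversed')` yields `b'cba'`, and `demo_search('no-such-codec')` returns `None` so that other registered search functions still get a chance to answer.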