cpython/Lib/encodings/aliases.py
Guido van Rossum 9e896b37c7 Marc-Andre's third try at this bulk patch seems to work (except that
his copy of test_contains.py seems to be broken -- the lines he
deleted were already absent).  Checkin messages:


New Unicode support for int(), float(), complex() and long().

- new APIs PyInt_FromUnicode() and PyLong_FromUnicode()
- added support for Unicode to PyFloat_FromString()
- new encoding API PyUnicode_EncodeDecimal() which converts
  Unicode to a decimal char* string (used in the above new
  APIs)
- shortcuts for calls like int(<int object>) and float(<float obj>)
- tests for all of the above

Unicode compares and contains checks:
- comparing Unicode and non-string types now works; TypeErrors
  are masked, all other errors such as ValueError during
  Unicode coercion are passed through (note that PyUnicode_Compare
  does not implement the masking -- PyObject_Compare does this)
- contains now works for non-string types too; TypeErrors are
  masked and 0 returned; all other errors are passed through

Better testing support for the standard codecs.

Misc minor enhancements, such as an alias dbcs for the mbcs codec.

Changes:
- PyLong_FromString() now applies the same error checks as
  does PyInt_FromString(): trailing garbage is reported
  as an error and no longer silently ignored. The only characters
  which may be trailing the digits are 'L' and 'l' -- these
  are still silently ignored.
- string.ato?() now directly interface to int(), long() and
  float(). The error strings are now a little different, but
  the type still remains the same. These functions are now
  ready to get declared obsolete ;-)
- PyNumber_Int() now also does a check for embedded NULL chars
  in the input string; PyNumber_Long() already did this (and
  still does)

Followed by:

Looks like I've gone a step too far there... (and test_contains.py
seems to have a bug too).

I've changed back to reporting all errors in PyUnicode_Contains()
and added a few more test cases to test_contains.py (plus corrected
the join() NameError).
2000-04-05 20:11:21 +00:00

""" Encoding Aliases Support

    This module is used by the encodings package search function to
    map encoding names to module names.

    Note that the search function converts the encoding names to lower
    case and replaces hyphens with underscores *before* performing the
    lookup.

"""
aliases = {

    # Latin-1
    'latin': 'latin_1',
    'latin1': 'latin_1',

    # UTF-8
    'utf': 'utf_8',
    'utf8': 'utf_8',
    'u8': 'utf_8',

    # UTF-16
    'utf16': 'utf_16',
    'u16': 'utf_16',
    'utf_16be': 'utf_16_be',
    'utf_16le': 'utf_16_le',
    'unicodebigunmarked': 'utf_16_be',
    'unicodelittleunmarked': 'utf_16_le',

    # ASCII
    'us_ascii': 'ascii',

    # ISO
    'iso8859_1': 'latin_1',
    'iso_8859_1': 'latin_1',
    'iso_8859_10': 'iso8859_10',
    'iso_8859_13': 'iso8859_13',
    'iso_8859_14': 'iso8859_14',
    'iso_8859_15': 'iso8859_15',
    'iso_8859_2': 'iso8859_2',
    'iso_8859_3': 'iso8859_3',
    'iso_8859_4': 'iso8859_4',
    'iso_8859_5': 'iso8859_5',
    'iso_8859_6': 'iso8859_6',
    'iso_8859_7': 'iso8859_7',
    'iso_8859_8': 'iso8859_8',
    'iso_8859_9': 'iso8859_9',

    # Mac
    'maccentraleurope': 'mac_latin2',
    'maccyrillic': 'mac_cyrillic',
    'macgreek': 'mac_greek',
    'maciceland': 'mac_iceland',
    'macroman': 'mac_roman',
    'macturkish': 'mac_turkish',

    # MBCS
    'dbcs': 'mbcs',

}
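
# Illustrative sketch only, not part of the original module. It shows one
# way the lookup described in the module docstring could be performed:
# normalize the requested name, then consult an alias table. The helper
# name resolve_alias and its signature are assumptions made for this
# example; the real search function lives in the encodings package and
# may differ in detail.
def resolve_alias(name, table):
    # Normalize as the docstring describes: lower case first, then
    # replace hyphens with underscores.
    key = name.lower().replace('-', '_')
    # Assumption: a name without an alias entry is already a module name,
    # so it is returned unchanged after normalization.
    return table.get(key, key)

# Possible usage against the table above:
#   resolve_alias('ISO-8859-1', aliases)  # key 'iso_8859_1' has an alias entry
#   resolve_alias('UTF-8', aliases)       # key 'utf_8' has no entry, returned as is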