closes bpo-36861: Update Unicode database to 12.1.0. (GH-13214)

Adds ㋿.
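
A quick way to see the effect of this update is to look up the new code point from Python. The sketch below is illustrative only: the helper name `describe` and the `<unassigned>` placeholder are assumptions made for this example, not part of the commit. It assumes the character in question is U+32FF (SQUARE ERA NAME REIWA), the character added in Unicode 12.1.

```python
import unicodedata

# U+32FF is the code point added by Unicode 12.1 (assumed target of "Adds ㋿").
SQUARE_ERA_NAME_REIWA = "\u32ff"


def describe(ch):
    """Return (name, general category) for a single character.

    Falls back to the placeholder "<unassigned>" when the interpreter's
    bundled database has no name for the character.
    """
    return unicodedata.name(ch, "<unassigned>"), unicodedata.category(ch)


# Version of the Unicode database compiled into this interpreter.
print(unicodedata.unidata_version)
print(describe(SQUARE_ERA_NAME_REIWA))
```

On an interpreter built with a database older than 12.1.0, the lookup should report the placeholder name and the unassigned category `Cn`; with 12.1.0 or later it should report the character's name and the category `So`.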
Benjamin Peterson 2019-05-08 20:59:35 -07:00 committed by GitHub
parent 289f1f80ee
commit 3aca40d3cb
10 changed files with 15415 additions and 15411 deletions


@@ -351,7 +351,7 @@ Notes:
 The numeric literals accepted include the digits ``0`` to ``9`` or any
 Unicode equivalent (code points with the ``Nd`` property).
-See http://www.unicode.org/Public/12.0.0/ucd/extracted/DerivedNumericType.txt
+See http://www.unicode.org/Public/12.1.0/ucd/extracted/DerivedNumericType.txt
 for a complete list of code points with the ``Nd`` property.


@@ -17,8 +17,8 @@
 This module provides access to the Unicode Character Database (UCD) which
 defines character properties for all Unicode characters. The data contained in
-this database is compiled from the `UCD version 12.0.0
-<http://www.unicode.org/Public/12.0.0/ucd>`_.
+this database is compiled from the `UCD version 12.1.0
+<http://www.unicode.org/Public/12.1.0/ucd>`_.
 The module uses the same names and symbols as defined by Unicode
 Standard Annex #44, `"Unicode Character Database"
@@ -175,6 +175,6 @@ Examples:
 .. rubric:: Footnotes
-.. [#] http://www.unicode.org/Public/12.0.0/ucd/NameAliases.txt
-.. [#] http://www.unicode.org/Public/12.0.0/ucd/NamedSequences.txt
+.. [#] http://www.unicode.org/Public/12.1.0/ucd/NameAliases.txt
+.. [#] http://www.unicode.org/Public/12.1.0/ucd/NamedSequences.txt


@@ -316,7 +316,7 @@ The Unicode category codes mentioned above stand for:
 * *Nd* - decimal numbers
 * *Pc* - connector punctuations
 * *Other_ID_Start* - explicit list of characters in `PropList.txt
-<http://www.unicode.org/Public/12.0.0/ucd/PropList.txt>`_ to support backwards
+<http://www.unicode.org/Public/12.1.0/ucd/PropList.txt>`_ to support backwards
 compatibility
 * *Other_ID_Continue* - likewise


@@ -510,9 +510,8 @@ Added new clock :data:`~time.CLOCK_UPTIME_RAW` for macOS 10.12.
 unicodedata
 -----------
-* The :mod:`unicodedata` module has been upgraded to use the `Unicode 12.0.0
-<http://blog.unicode.org/2019/03/announcing-unicode-standard-version-120.html>`_
-release.
+* The :mod:`unicodedata` module has been upgraded to use the `Unicode 12.1.0
+<http://blog.unicode.org/2019/05/unicode-12-1-en.html>`_ release.
 * New function :func:`~unicodedata.is_normalized` can be used to verify a string
 is in a specific normal form. (Contributed by Max Belanger and David Euresti in


@@ -80,7 +80,7 @@ class UnicodeFunctionsTest(UnicodeDatabaseTest):
     # Update this if the database changes. Make sure to do a full rebuild
     # (e.g. 'make distclean && make') to get the correct checksum.
-    expectedchecksum = '4cb02a243aed7c251067386dd738189146fddf94'
+    expectedchecksum = 'c44a49ca7c5cb6441640fe174ede604b45028652'
     def test_function_checksum(self):
         data = []
         h = hashlib.sha1()


@@ -0,0 +1 @@
+Update the Unicode database to version 12.1.0.

1974
Modules/unicodedata_db.h generated

File diff suppressed because it is too large

File diff suppressed because it is too large


@@ -2925,7 +2925,7 @@ static const unsigned short index2[] = {
 5, 5, 5, 5, 5, 5, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27,
 27, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
-5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 0, 55, 55, 55, 55, 55,
+5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 55, 55, 55, 55, 55,
 388, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55,
 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55,
 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55,


@@ -41,7 +41,7 @@ VERSION = "3.3"
 # * Doc/library/stdtypes.rst, and
 # * Doc/library/unicodedata.rst
 # * Doc/reference/lexical_analysis.rst (two occurrences)
-UNIDATA_VERSION = "12.0.0"
+UNIDATA_VERSION = "12.1.0"
 UNICODE_DATA = "UnicodeData%s.txt"
 COMPOSITION_EXCLUSIONS = "CompositionExclusions%s.txt"
 EASTASIAN_WIDTH = "EastAsianWidth%s.txt"