116 commits

Author SHA1 Message Date
Benjamin Peterson
279a96206f bpo-30736: upgrade to Unicode 10.0 (#2344)
Straightforward. While we're at it, though, strip trailing whitespace from generated tables.
2017-06-22 22:31:08 -07:00
Zachary Ware
6b6e687766 bpo-27425: Be more explicit in .gitattributes (GH-840)
Updates checked-in line endings on several files.
2017-06-10 14:58:42 -05:00
Jon Dufresne
3972628de3 bpo-30296 Remove unnecessary tuples, lists, sets, and dicts (#1489)
* Replaced list(<generator expression>) with list comprehension
* Replaced dict(<generator expression>) with dict comprehension
* Replaced set(<list literal>) with set literal
* Replaced builtin func(<list comprehension>) with func(<generator
  expression>) when supported (e.g. any(), all(), tuple(), min(), &
  max())
2017-05-18 07:35:54 -07:00
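The four rewrites listed in that commit can be illustrated with before/after pairs. This is a minimal sketch rather than code taken from the patch; the names `words`, `lengths_*`, `index_*`, `vowels_*` and `any_long_*` are hypothetical.

```python
# Hypothetical sample data, not from the commit itself.
words = ["alpha", "beta", "gamma"]

# list(<generator expression>) -> list comprehension
lengths_old = list(len(w) for w in words)
lengths_new = [len(w) for w in words]

# dict(<generator expression>) -> dict comprehension
index_old = dict((w, i) for i, w in enumerate(words))
index_new = {w: i for i, w in enumerate(words)}

# set(<list literal>) -> set literal
vowels_old = set(["a", "e", "i", "o", "u"])
vowels_new = {"a", "e", "i", "o", "u"}

# func(<list comprehension>) -> func(<generator expression>)
# The generator form avoids building a temporary list and lets any() stop early.
any_long_old = any([len(w) > 4 for w in words])
any_long_new = any(len(w) > 4 for w in words)
```

Each pair produces the same value; only the intermediate tuple, list, set, or dict is dropped.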
Benjamin Peterson
6775231597 Unicode 9.0.0
Not completely mechanical since support for East Asian Width changes—emoji
codepoints became Wide—had to be added to unicodedata.
2016-09-14 23:53:47 -07:00
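The East Asian Width change mentioned in that commit can be observed from Python through `unicodedata.east_asian_width()`. The sketch below assumes an interpreter whose bundled database is Unicode 9.0 or later (CPython 3.6 and newer); the sample characters are chosen here for illustration only.

```python
import unicodedata

# Version of the Unicode database bundled with this interpreter.
print(unicodedata.unidata_version)

# U+1F600 GRINNING FACE: an emoji code point, reported as Wide ("W")
# by databases at Unicode 9.0 or later.
emoji_width = unicodedata.east_asian_width("\U0001F600")

# For comparison: an ASCII letter is Narrow ("Na") and a
# fullwidth Latin letter (U+FF21) is Fullwidth ("F").
ascii_width = unicodedata.east_asian_width("A")
fullwidth_width = unicodedata.east_asian_width("\uFF21")
```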
R David Murray
44b548dda8 #27364: fix "incorrect" uses of escape character in the stdlib.
And most of the tools.

Patch by Emanual Barry, reviewed by me, Serhiy Storchaka, and
Martin Panter.
2016-09-08 13:59:53 -04:00
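The kind of fix that commit describes usually concerns backslash sequences such as `\d` written in ordinary (non-raw) string literals, which Python does not recognize as escapes. A minimal sketch of the two conventional spellings, using a hypothetical regular expression:

```python
import re

# Raw string literal: the backslash reaches the regex engine unchanged.
digits_raw = re.compile(r"\d+")

# Ordinary string literal with the backslash escaped explicitly.
digits_escaped = re.compile("\\d+")

# Both patterns are identical from the regex engine's point of view.
sample = "a1b22c333"
matches_raw = digits_raw.findall(sample)
matches_escaped = digits_escaped.findall(sample)
```

Writing `"\d+"` without the `r` prefix or the doubled backslash relies on Python leaving the unrecognized escape in place, which newer interpreters flag with a warning.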
Benjamin Peterson
4801383c29 upgrade to Unicode 8.0.0 2015-06-27 15:45:56 -05:00
Serhiy Storchaka
ba9ac5b5c4 Issue #16261: Converted some bare except statements to except statements
with specified exception type.  Original patch by Ramchandra Apte.
2015-05-20 10:33:40 +03:00
Zachary Ware
774ac377da Closes #17202: Merge with 3.4 2015-04-13 12:11:40 -05:00
Zachary Ware
4c9c848159 Issue #17202: Add .bat to .hgeol to force them to CRLF.
Using LF can cause a script to fail if it tries to use a label that is
split across 512 byte blocks.  Who knows why.
2015-04-13 11:59:54 -05:00
Serhiy Storchaka
82e07b92b3 Issue #23181: More "codepoint" -> "code point". 2015-01-18 11:33:31 +02:00
Serhiy Storchaka
d3faf43f9b Issue #23181: More "codepoint" -> "code point". 2015-01-18 11:28:37 +02:00
R David Murray
2623a5db6f Merge: #18176: Change generic UCD PropList link to version specific link. 2014-10-09 20:47:31 -04:00
R David Murray
5f16f90d1b #18176: Change generic UCD PropList link to version specific link. 2014-10-09 20:45:59 -04:00
R David Murray
532783bd5e Merge: #18176: fix another reference and add it to the makeunicodedata comment. 2014-10-09 17:41:55 -04:00
R David Murray
5bd62420f4 #18176: fix another reference and add it to the makeunicodedata comment. 2014-10-09 17:39:48 -04:00
R David Murray
5ac125cde3 Merge: #18176: updated stdtypes UCD link, added reminder to makeunicodedata. 2014-10-09 17:33:15 -04:00
R David Murray
7445a383a6 #18176: updated stdtypes UCD link, added reminder to makeunicodedata.
Patch by Alexander Belopolsky.
2014-10-09 17:30:33 -04:00
Benjamin Peterson
3032ed7cb1 upgrade to unicode 7.0.0 2014-07-06 13:04:20 -07:00
Serhiy Storchaka
8f8ec92de8 Issue #19936: Added executable bits or shebang lines to Python scripts which
require them.  Disable executable bits and shebang lines in test and
benchmark files in order to prevent using a random system python, and in
source files of modules which don't provide command line interface.  Fixed
shebang lines in the unittestgui and checkpip scripts.
2014-01-16 17:33:23 +02:00
Serhiy Storchaka
b992a0e102 Issue #19936: Added executable bits or shebang lines to Python scripts which
require them.  Disable executable bits and shebang lines in test and
benchmark files in order to prevent using a random system python, and in
source files of modules which don't provide command line interface.  Fixed
shebang line to use python3 executable in the unittestgui script.
2014-01-16 17:15:49 +02:00
Andrew Kuchling
9d5c071060 #1097797: add the original mapping file 2013-11-10 21:46:02 -05:00
Andrew Kuchling
695f07b27b Fix some PEP8-formatting problems in the generated code 2013-11-10 21:45:24 -05:00
Benjamin Peterson
94d08d908b upgrade unicode db to 6.3.0 (closes #19221) 2013-10-10 17:24:45 -04:00
Ezio Melotti
d640fe2af5 #18803: merge with 3.3. 2013-08-26 01:33:30 +03:00
Ezio Melotti
7c4a7e6f3c #18803: fix more typos. Patch by Févry Thibault. 2013-08-26 01:32:56 +03:00
Antoine Pitrou
9ed5f27266 Issue #18722: Remove uses of the "register" keyword in C code. 2013-08-13 20:18:52 +02:00
Serhiy Storchaka
302b8c31ec Issue #15239: Make mkstringprep.py work again on Python 3. 2013-06-09 17:11:48 +03:00
Serhiy Storchaka
e7275ffa4c Issue #15239: Make mkstringprep.py work again on Python 3. 2013-06-09 17:08:00 +03:00
Antoine Pitrou
e9631e5d3a Issue #15378: Fix Tools/unicode/comparecodecs.py. Patch by Serhiy Storchaka. 2012-10-17 16:14:40 +02:00
Antoine Pitrou
31605ace0d Issue #15378: Fix Tools/unicode/comparecodecs.py. Patch by Serhiy Storchaka. 2012-10-17 16:13:55 +02:00
Antoine Pitrou
1eff0fc3cd Issue #15378: Fix Tools/unicode/comparecodecs.py. Patch by Serhiy Storchaka. 2012-10-17 16:12:30 +02:00
Benjamin Peterson
b8350f1c7d upgrade to UCD 6.2 2012-09-29 13:47:39 -04:00
Florent Xicluna
c20740109d Some cleanup in the Tools directory. 2012-07-07 17:03:54 +02:00
Antoine Pitrou
aaefac76dd Issue #14874: Restore charmap decoding speed to pre-PEP 393 levels.
Patch by Serhiy Storchaka.
2012-06-16 22:48:21 +02:00
Benjamin Peterson
71f660e00f update to Unicode 6.1 2012-02-20 22:24:29 -05:00
Benjamin Peterson
ad9c569825 delta encoding of upper/lower/title makes a glorious return (#12736) 2012-01-15 21:19:20 -05:00
Benjamin Peterson
d5890c8db5 add str.casefold() (closes #13752) 2012-01-14 13:23:30 -05:00
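A short sketch of what `str.casefold()` provides compared with `str.lower()`. The sample strings are chosen here for illustration and assume Python 3.3 or later, where the method is available.

```python
# German sharp s: lower() leaves it alone, casefold() expands it to "ss".
word = "Straße"
lowered = word.lower()
folded = word.casefold()

# Caseless comparison: casefold both sides before comparing.
same_caseless = "STRASSE".casefold() == word.casefold()
same_lowered = "STRASSE".lower() == word.lower()
```

Case folding is intended for caseless matching, so it is more aggressive than lowercasing.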
Benjamin Peterson
b2bf01d824 use full unicode mappings for upper/lower/title case (#12736)
Also broaden the category of characters that count as lowercase/uppercase.
2012-01-11 18:17:06 -05:00
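With full case mappings, a single character may map to several characters. The sketch below shows two such cases; the sample characters are picked here for illustration and assume Python 3.3 or later.

```python
# U+00DF LATIN SMALL LETTER SHARP S uppercases to two characters.
sharp_s_upper = "\u00DF".upper()

# U+FB01 LATIN SMALL LIGATURE FI uppercases to the two letters "FI".
ligature_upper = "\uFB01".upper()

# As a consequence, upper() can change the length of a string.
before = "stra\u00DFe"
after = before.upper()
```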
Ezio Melotti
931b8aac80 #12753: Add support for Unicode name aliases and named sequences. 2011-10-21 21:57:36 +03:00
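A name alias can be resolved with `unicodedata.lookup()`. The sketch below covers aliases only, not named sequences. It uses U+01A2, whose formal name is LATIN CAPITAL LETTER OI and whose corrected alias is LATIN CAPITAL LETTER GHA, and assumes Python 3.3 or later.

```python
import unicodedata

# Lookup by the corrected alias and by the formal name gives the same character.
by_alias = unicodedata.lookup("LATIN CAPITAL LETTER GHA")
by_name = unicodedata.lookup("LATIN CAPITAL LETTER OI")

# name() still reports the formal (immutable) character name.
formal_name = unicodedata.name(by_alias)
```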
Ezio Melotti
a9860aeb08 #13054: fix usage of sys.maxunicode after PEP-393. 2011-10-04 19:06:00 +03:00
Ezio Melotti
2a1e926d63 Fix ResourceWarnings in makeunicodedata.py. 2011-09-30 08:46:25 +03:00
Ezio Melotti
3b3499ba69 #11565: Merge with 3.1. 2011-03-16 11:35:38 +02:00
Ezio Melotti
13925008dc #11565: Fix several typos. Patch by Piotr Kasprzyk. 2011-03-16 11:05:33 +02:00
Georg Brandl
49857f8a93 Add updated .hgeol file and fix newlines in the 3.2 branch. 2011-03-05 15:11:35 +01:00
Alexander Belopolsky
827fdaae30 Issue #10552: Partially fixed a sort error in Tools/unicode/gencodec.py 2010-11-30 16:56:15 +00:00
Martin v. Löwis
5cbc71e50a Issue #10459: Update CJK character names to Unicode 6.0. 2010-11-22 09:00:02 +00:00
Martin v. Löwis
baecd7243a Upgrade to Unicode 6.0.0.
makeunicodedata.py: download all data files from unicode.org,
  switch to extracting Unihan data from zip file.
  Read linebreakprops and derivednormalizationprops even for
  old versions, even though they are not used in delta records.
test_unicode.py: U+11000 is now assigned, use U+14000 instead.
2010-10-11 22:42:28 +00:00
Amaury Forgeot d'Arc
feb7307db4 #9210: remove --with-wctype-functions configure option.
The internal unicode database is now always used.

(after 5 years: see
  http://mail.python.org/pipermail/python-dev/2004-December/050193.html
)
2010-09-12 22:42:57 +00:00
Amaury Forgeot d'Arc
324ac65ceb #5127: Even on narrow unicode builds, the C functions that access the Unicode
Database (Py_UNICODE_TOLOWER, Py_UNICODE_ISDECIMAL, and others) now accept
and return characters from the full Unicode range (Py_UCS4).

The differences from Python code are few:
- unicodedata.numeric(), unicodedata.decimal() and unicodedata.digit()
  now return the correct value for large code points
- repr() may consider more characters as printable.
2010-08-18 20:44:58 +00:00
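The "large code points" case in that message refers to characters outside the Basic Multilingual Plane. A minimal sketch, using the MATHEMATICAL BOLD DIGIT block starting at U+1D7CE (chosen here for illustration):

```python
import unicodedata

# U+1D7CE is MATHEMATICAL BOLD DIGIT ZERO; the next nine code points are 1..9.
bold_seven = chr(0x1D7CE + 7)

# All three accessors report the value for a code point above U+FFFF.
as_decimal = unicodedata.decimal(bold_seven)
as_digit = unicodedata.digit(bold_seven)
as_numeric = unicodedata.numeric(bold_seven)
```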
Florent Xicluna
806d8cf0e8 Merged revisions 79494,79496 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r79494 | florent.xicluna | 2010-03-30 10:24:06 +0200 (mar, 30 mar 2010) | 2 lines

  #7643: Unicode codepoints VT (0x0B) and FF (0x0C) are linebreaks according to Unicode Standard Annex #14.
........
  r79496 | florent.xicluna | 2010-03-30 18:29:03 +0200 (mar, 30 mar 2010) | 2 lines

  Highlight the change of behavior related to r79494.  Now VT and FF are linebreaks.
........
2010-03-30 19:34:18 +00:00
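The behavior change for VT and FF is visible through `str.splitlines()`. A short sketch on Python 3, with a sample string made up for illustration:

```python
# VT (0x0B) and FF (0x0C) are treated as line boundaries for text strings.
text = "first\x0bsecond\x0cthird"
text_lines = text.splitlines()

# bytes objects only split on \n, \r and \r\n, so VT and FF are kept.
data = b"first\x0bsecond\x0cthird"
byte_lines = data.splitlines()
```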