cpython/Lib/html
Serhiy Storchaka c555f889c3
Some checks failed
Tests / Change detection (push) Has been cancelled
Lint / lint (push) Has been cancelled
Tests / (push) Has been cancelled
Tests / Docs (push) Has been cancelled
Tests / Check if the ABI has changed (push) Has been cancelled
Tests / All required checks pass (push) Has been cancelled
Tests / Check if Autoconf files are up to date (push) Has been cancelled
Tests / Check if generated files are up to date (push) Has been cancelled
Tests / Windows MSI (push) Has been cancelled
Tests / Ubuntu SSL tests with OpenSSL (push) Has been cancelled
Tests / Hypothesis tests on Ubuntu (push) Has been cancelled
Tests / Address sanitizer (push) Has been cancelled
[3.12] gh-135661: Fix parsing start and end tags in HTMLParser according to the HTML5 standard (GH-135930) (GH-136268)
* Whitespaces no longer accepted between `</` and the tag name.
  E.g. `</ script>` does not end the script section.

* Vertical tabulation (`\v`) and non-ASCII whitespaces no longer recognized
  as whitespaces. The only whitespaces are `\t\n\r\f `.

* Null character (U+0000) no longer ends the tag name.

* Attributes and slashes after the tag name in end tags are now ignored,
  instead of terminating after the first `>` in quoted attribute value.
  E.g. `</script/foo=">"/>`.

* Multiple slashes and whitespaces between the last attribute and closing `>`
  are now ignored in both start and end tags. E.g. `<a foo=bar/ //>`.

* Multiple `=` between attribute name and value are no longer collapsed.
  E.g. `<a foo==bar>` produces attribute "foo" with value "=bar".

* Whitespaces between the `=` separator and attribute name or value are no
  longer ignored. E.g. `<a foo =bar>` produces two attributes "foo" and
  "=bar", both with value None; `<a foo= bar>` produces two attributes:
  "foo" with value "" and "bar" with value None.

* Fix data loss after unclosed script or style tag (gh-86155).

Also backport test.support.subTests() (gh-135120).

---------
(cherry picked from commit 0243f97cba)

Co-authored-by: Ezio Melotti <ezio.melotti@gmail.com>
Co-authored-by: Waylan Limberg <waylan.limberg@icloud.com>
2025-07-04 17:28:00 +02:00
..
__init__.py gh-100210: Correct the comment link for unescaping HTML (#100212) 2023-02-19 11:18:12 +01:00
entities.py gh-97669: Create Tools/build/ directory (#97963) 2022-10-17 12:01:00 +02:00
parser.py [3.12] gh-135661: Fix parsing start and end tags in HTMLParser according to the HTML5 standard (GH-135930) (GH-136268) 2025-07-04 17:28:00 +02:00