mirror of https://github.com/python/cpython.git
synced 2025-07-07 19:35:27 +00:00
gh-131535: Fix stale example in html.parser docs, make examples doctests (GH-131551)
parent 77b14a6d58
commit ee76e36d76
1 changed file with 37 additions and 14 deletions
@@ -43,7 +43,9 @@ Example HTML Parser Application
 
 As a basic example, below is a simple HTML parser that uses the
 :class:`HTMLParser` class to print out start tags, end tags, and data
-as they are encountered::
+as they are encountered:
+
+.. testcode::
 
    from html.parser import HTMLParser
 
@@ -63,7 +65,7 @@ as they are encountered::
 
 The output will then be:
 
-.. code-block:: none
+.. testoutput::
 
    Encountered a start tag: html
    Encountered a start tag: head
@@ -230,7 +232,9 @@ Examples
 --------
 
 The following class implements a parser that will be used to illustrate more
-examples::
+examples:
+
+.. testcode::
 
    from html.parser import HTMLParser
    from html.entities import name2codepoint
@@ -266,13 +270,17 @@ examples::
 
    parser = MyHTMLParser()
 
-Parsing a doctype::
+Parsing a doctype:
+
+.. doctest::
 
    >>> parser.feed('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    ...             '"http://www.w3.org/TR/html4/strict.dtd">')
    Decl     : DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"
 
-Parsing an element with a few attributes and a title::
+Parsing an element with a few attributes and a title:
+
+.. doctest::
 
    >>> parser.feed('<img src="python-logo.png" alt="The Python logo">')
    Start tag: img
@@ -285,7 +293,9 @@ Parsing an element with a few attributes and a title::
    End tag  : h1
 
 The content of ``script`` and ``style`` elements is returned as is, without
-further parsing::
+further parsing:
+
+.. doctest::
 
    >>> parser.feed('<style type="text/css">#python { color: green }</style>')
    Start tag: style
@@ -300,16 +310,25 @@ further parsing::
    Data     : alert("<strong>hello!</strong>");
    End tag  : script
 
-Parsing comments::
+Parsing comments:
 
-   >>> parser.feed('<!-- a comment -->'
+.. doctest::
+
+   >>> parser.feed('<!--a comment-->'
    ...             '<!--[if IE 9]>IE-specific content<![endif]-->')
-   Comment  :  a comment
+   Comment  : a comment
    Comment  : [if IE 9]>IE-specific content<![endif]
 
 Parsing named and numeric character references and converting them to the
-correct char (note: these 3 references are all equivalent to ``'>'``)::
+correct char (note: these 3 references are all equivalent to ``'>'``):
 
+.. doctest::
+
+   >>> parser = MyHTMLParser()
+   >>> parser.feed('&gt;&#62;&#x3E;')
+   Data     : >>>
+
+   >>> parser = MyHTMLParser(convert_charrefs=False)
    >>> parser.feed('&gt;&#62;&#x3E;')
    Named ent: >
    Num ent  : >
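Review note on the character-reference hunk: the behavior it documents can be checked outside the docs build with a small standalone script. This is only a sketch. `EventRecorder` and `record` are hypothetical names introduced here as a stand-in for the `MyHTMLParser` class, whose body is not part of this diff, and the input string assumes the three references are one named and two numeric spellings of `>`.

```python
from html.parser import HTMLParser


class EventRecorder(HTMLParser):
    """Hypothetical stand-in for MyHTMLParser: records handler calls."""

    def __init__(self, *, convert_charrefs=True):
        super().__init__(convert_charrefs=convert_charrefs)
        self.events = []

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_entityref(self, name):
        # Only called when convert_charrefs is False.
        self.events.append(("entityref", name))

    def handle_charref(self, name):
        # Only called when convert_charrefs is False.
        self.events.append(("charref", name))


def record(markup, *, convert_charrefs=True):
    """Feed markup to a fresh parser and return the recorded events."""
    parser = EventRecorder(convert_charrefs=convert_charrefs)
    parser.feed(markup)
    parser.close()
    return parser.events


# Default parser: the references arrive already converted, as plain data.
converted = record("&gt;&#62;&#x3E;")
# convert_charrefs=False: each reference goes to its own handler.
raw = record("&gt;&#62;&#x3E;", convert_charrefs=False)

print(converted)  # [('data', '>>>')]
print(raw)        # [('entityref', 'gt'), ('charref', '62'), ('charref', 'x3E')]
```

With the default `convert_charrefs=True` the entity handlers are never invoked, which appears to be why the old example, run against a default parser, was stale.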
@@ -317,18 +336,22 @@ correct char (note: these 3 references are all equivalent to ``'>'``)::
 
 Feeding incomplete chunks to :meth:`~HTMLParser.feed` works, but
 :meth:`~HTMLParser.handle_data` might be called more than once
-(unless *convert_charrefs* is set to ``True``)::
+(unless *convert_charrefs* is set to ``True``):
 
-   >>> for chunk in ['<sp', 'an>buff', 'ered ', 'text</s', 'pan>']:
+.. doctest::
+
+   >>> for chunk in ['<sp', 'an>buff', 'ered', ' text</s', 'pan>']:
    ...     parser.feed(chunk)
    ...
    Start tag: span
    Data     : buff
    Data     : ered
-   Data     : text
+   Data     :  text
    End tag  : span
 
-Parsing invalid HTML (e.g. unquoted attributes) also works::
+Parsing invalid HTML (e.g. unquoted attributes) also works:
+
+.. doctest::
 
    >>> parser.feed('<p><a class=link href=#main>tag soup</p ></a>')
    Start tag: p
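Review note on the chunked-feed hunk: since `handle_data` can fire once per fed chunk, code that needs whole text runs has to join the pieces itself. The sketch below shows one way to do that under stated assumptions. `TextJoiner` is a hypothetical helper, not part of `html.parser` or of this commit, and it passes `convert_charrefs=False` on the assumption that the doctest reuses the parser created in the preceding example.

```python
from html.parser import HTMLParser


class TextJoiner(HTMLParser):
    """Hypothetical helper: joins handle_data() pieces at tag boundaries."""

    def __init__(self):
        # Assumption: same setting as the parser used by the chunked doctest.
        super().__init__(convert_charrefs=False)
        self.calls = []     # every raw handle_data() argument, in order
        self.texts = []     # text runs joined between tags
        self._pending = []  # pieces seen since the last tag

    def handle_data(self, data):
        self.calls.append(data)
        self._pending.append(data)

    def _flush(self):
        if self._pending:
            self.texts.append("".join(self._pending))
            self._pending.clear()

    def handle_starttag(self, tag, attrs):
        self._flush()

    def handle_endtag(self, tag):
        self._flush()

    def close(self):
        super().close()
        self._flush()  # emit any text left after the last tag


joiner = TextJoiner()
for chunk in ['<sp', 'an>buff', 'ered', ' text</s', 'pan>']:
    joiner.feed(chunk)
joiner.close()

print(joiner.calls)  # ['buff', 'ered', ' text']
print(joiner.texts)  # ['buffered text']
```

Buffering until a tag boundary keeps the handler cheap and makes the result independent of where the input happened to be split.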