Mirror of https://github.com/python/cpython.git
document the exceptions raised by sgmllib, htmllib, and HTMLParser
parent a2544ee7f0
commit 961c2882a9
3 changed files with 34 additions and 6 deletions
@@ -35,8 +35,8 @@ The interface to feed data to an instance is through the \method{feed()}
 method, which takes a string argument. This can be called with as
 little or as much text at a time as desired; \samp{p.feed(a);
 p.feed(b)} has the same effect as \samp{p.feed(a+b)}. When the data
-contains complete HTML tags, these are processed immediately;
-incomplete elements are saved in a buffer. To force processing of all
+contains complete HTML markup constructs, these are processed immediately;
+incomplete constructs are saved in a buffer. To force processing of all
 unprocessed data, call the \method{close()} method.
 
 For example, to parse the entire contents of a file, use:
@@ -60,7 +60,7 @@ should define the \method{do_\var{tag}()} method.
 
 \end{itemize}
 
-The module defines a single class:
+The module defines a parser class and an exception:
 
 \begin{classdesc}{HTMLParser}{formatter}
 This is the basic HTML parser class. It supports all entity names
@@ -68,6 +68,12 @@ required by the XHTML 1.0 Recommendation (\url{http://www.w3.org/TR/xhtml1}).
 It also defines handlers for all HTML 2.0 and many HTML 3.0 and 3.2 elements.
 \end{classdesc}
 
+\begin{excdesc}{HTMLParseError}
+Exception raised by the \class{HTMLParser} class when it encounters an
+error while parsing.
+\versionadded{2.4}
+\end{excdesc}
+
 
 \begin{seealso}
 \seemodule{formatter}{Interface definition for transforming an
@@ -118,7 +124,8 @@ implementation adds a textual footnote marker using an index into the
 list of hyperlinks created by \method{anchor_bgn()}.
 \end{methoddesc}
 
-\begin{methoddesc}{handle_image}{source, alt\optional{, ismap\optional{, align\optional{, width\optional{, height}}}}}
+\begin{methoddesc}{handle_image}{source, alt\optional{, ismap\optional{,
+align\optional{, width\optional{, height}}}}}
 This method is called to handle images. The default implementation
 simply passes the \var{alt} value to the \method{handle_data()}
 method.
@@ -4,6 +4,8 @@
 \declaremodule{standard}{HTMLParser}
 \modulesynopsis{A simple parser that can handle HTML and XHTML.}
 
+\versionadded{2.2}
+
 This module defines a class \class{HTMLParser} which serves as the
 basis for parsing text files formatted in HTML\index{HTML} (HyperText
 Mark-up Language) and XHTML.\index{XHTML} Unlike the parser in
@@ -23,6 +25,17 @@ that end tags match start tags or call the end-tag handler for
 elements which are closed implicitly by closing an outer element.
 \end{classdesc}
 
+An exception is defined as well:
+
+\begin{excdesc}{HTMLParseError}
+Exception raised by the \class{HTMLParser} class when it encounters an
+error while parsing. This exception provides three attributes:
+\member{msg} is a brief message explaining the error, \member{lineno}
+is the number of the line on which the broken construct was detected,
+and \member{offset} is the number of characters into the line at which
+the construct starts.
+\end{excdesc}
+
 
 \class{HTMLParser} instances have the following methods:
 
@@ -14,7 +14,6 @@ only exists as a base for the \refmodule{htmllib} module. Another
 HTML parser which supports XHTML and offers a somewhat different
 interface is available in the \refmodule{HTMLParser} module.
 
-
 \begin{classdesc}{SGMLParser}{}
 The \class{SGMLParser} class is instantiated without arguments.
 The parser is hardcoded to recognize the following
@@ -40,7 +39,16 @@ spaces, tabs, and newlines are allowed between the trailing
 \end{itemize}
 \end{classdesc}
 
-\class{SGMLParser} instances have the following interface methods:
+A single exception is defined as well:
+
+\begin{excdesc}{SGMLParseError}
+Exception raised by the \class{SGMLParser} class when it encounters an
+error while parsing.
+\versionadded{2.1}
+\end{excdesc}
+
+
+\class{SGMLParser} instances have the following methods:
 
 
 \begin{methoddesc}{reset}{}
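
Note on the \method{feed()} paragraph touched by the first hunk: the claim that feeding data in pieces has the same effect as feeding it all at once can be checked with a small sketch. The modules documented in this commit (htmllib, sgmllib, and the Python 2 HTMLParser module with its HTMLParseError) are not importable on current Python 3, so this sketch assumes Python 3's `html.parser.HTMLParser` as a stand-in with the same feed()/close() buffering behavior. `EventCollector` and `parse_in_chunks` are hypothetical names introduced only for illustration.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Hypothetical helper: records tag events and accumulates text."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(("start", tag, attrs))

    def handle_endtag(self, tag):
        self.tags.append(("end", tag))

    def handle_data(self, data):
        # Text may arrive in several pieces, so it is joined later.
        self.text.append(data)


def parse_in_chunks(chunks):
    """Feed each chunk in turn, then close() to flush buffered data."""
    parser = EventCollector()
    for chunk in chunks:
        parser.feed(chunk)
    parser.close()
    return parser.tags, "".join(parser.text)


# The second call splits the input in the middle of a start tag, so the
# incomplete construct has to wait in the buffer until the next feed().
whole = parse_in_chunks(["<p class='x'>Hello</p>"])
split = parse_in_chunks(["<p cla", "ss='x'>Hello</p>"])
print(whole == split)
```

The text is joined before comparing because a parser is free to deliver character data in more than one handle_data() call; only the tag events and the overall text are expected to match.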