mirror of
https://github.com/python/cpython.git
synced 2025-07-24 03:35:53 +00:00
Markup nits.
Make module references hyperlinks.
parent
b7168c3a07
commit
4e28c593ad
2 changed files with 45 additions and 46 deletions
@@ -1,7 +1,7 @@
\section{\module{htmllib} ---
A parser for HTML documents.}
\declaremodule{standard}{htmllib}
A parser for HTML documents}

\declaremodule{standard}{htmllib}
\modulesynopsis{A parser for HTML documents.}

\index{HTML}

@@ -17,15 +17,13 @@ in string form via a method, and makes calls to methods of a
other classes in order to add functionality, and allows most of its
methods to be extended or overridden. In turn, this class is derived
from and extends the \class{SGMLParser} class defined in module
\module{sgmllib}\refstmodindex{sgmllib}. The \class{HTMLParser}
\refmodule{sgmllib}\refstmodindex{sgmllib}. The \class{HTMLParser}
implementation supports the HTML 2.0 language as described in
\rfc{1866}. Two implementations of formatter objects are provided in
the \module{formatter}\refstmodindex{formatter} module; refer to the
the \refmodule{formatter}\refstmodindex{formatter} module; refer to the
documentation for that module for information on the formatter
interface.
\index{SGML}
\withsubitem{(in module sgmllib)}{\ttindex{SGMLParser}}
\index{formatter}

The following is a summary of the interface defined by
\class{sgmllib.SGMLParser}:

@@ -49,16 +47,16 @@ parser.close()

\item
The interface to define semantics for HTML tags is very simple: derive
a class and define methods called \code{start_\var{tag}()},
\code{end_\var{tag}()}, or \code{do_\var{tag}()}. The parser will
call these at appropriate moments: \code{start_\var{tag}} or
\code{do_\var{tag}()} is called when an opening tag of the form
\code{<\var{tag} ...>} is encountered; \code{end_\var{tag}()} is called
a class and define methods called \method{start_\var{tag}()},
\method{end_\var{tag}()}, or \method{do_\var{tag}()}. The parser will
call these at appropriate moments: \method{start_\var{tag}} or
\method{do_\var{tag}()} is called when an opening tag of the form
\code{<\var{tag} ...>} is encountered; \method{end_\var{tag}()} is called
when a closing tag of the form \code{<\var{tag}>} is encountered. If
an opening tag requires a corresponding closing tag, like \code{<H1>}
... \code{</H1>}, the class should define the \code{start_\var{tag}()}
... \code{</H1>}, the class should define the \method{start_\var{tag}()}
method; if a tag requires no closing tag, like \code{<P>}, the class
should define the \code{do_\var{tag}()} method.
should define the \method{do_\var{tag}()} method.

\end{itemize}


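The handler-naming convention described in this hunk can be illustrated with a small stand-alone model. This is only a sketch: the class `TagDispatcher` and its `open_tag()` and `close_tag()` methods are hypothetical names invented for the illustration, and the lookup via `getattr` is an assumption about how such a convention might be implemented, not the code of `sgmllib` or `htmllib` (neither module is imported here).

```python
class TagDispatcher:
    """Toy model of the start_<tag> / do_<tag> / end_<tag> convention.

    Hypothetical helper for illustration; not part of sgmllib or htmllib.
    """

    def __init__(self):
        self.events = []

    def open_tag(self, tag, attributes):
        tag = tag.lower()
        # Prefer start_<tag>; fall back to do_<tag> if only that exists.
        handler = getattr(self, "start_" + tag, None) or getattr(
            self, "do_" + tag, None
        )
        if handler is not None:
            handler(attributes)

    def close_tag(self, tag):
        handler = getattr(self, "end_" + tag.lower(), None)
        if handler is not None:
            handler()


class HeadingAndParagraph(TagDispatcher):
    # <H1> ... </H1> needs a closing tag, so it gets start_/end_ methods.
    def start_h1(self, attributes):
        self.events.append("h1 opened")

    def end_h1(self):
        self.events.append("h1 closed")

    # <P> needs no closing tag, so a do_ method is enough.
    def do_p(self, attributes):
        self.events.append("paragraph break")


demo = HeadingAndParagraph()
demo.open_tag("H1", [])
demo.close_tag("H1")
demo.open_tag("P", [])
print(demo.events)
```

In this model a subclass only adds methods whose names follow the convention; the dispatcher itself never needs a table of known tags.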
@@ -90,8 +88,9 @@ affects the operation of \method{handle_data()} and \method{save_end()}.
This method is called at the start of an anchor region. The arguments
correspond to the attributes of the \code{<A>} tag with the same
names. The default implementation maintains a list of hyperlinks
(defined by the \code{href} attribute) within the document. The list
of hyperlinks is available as the data attribute \code{anchorlist}.
(defined by the \code{HREF} attribute for \code{<A>} tags) within the
document. The list of hyperlinks is available as the data attribute
\member{anchorlist}.
\end{methoddesc}

\begin{methoddesc}{anchor_end}{}

@@ -115,7 +114,7 @@ nested.

\begin{methoddesc}{save_end}{}
Ends buffering character data and returns all data saved since the
preceeding call to \method{save_bgn()}. If the \code{nofill} flag is
preceeding call to \method{save_bgn()}. If the \member{nofill} flag is
false, whitespace is collapsed to single spaces. A call to this
method without a preceeding call to \method{save_bgn()} will raise a
\exception{TypeError} exception.


@@ -1,7 +1,7 @@
\section{\module{sgmllib} ---
Simple SGML parser.}
\declaremodule{standard}{sgmllib}
Simple SGML parser}

\declaremodule{standard}{sgmllib}
\modulesynopsis{Only as much of an SGML parser as needed to parse HTML.}

\index{SGML}

@@ -10,7 +10,7 @@ This module defines a class \class{SGMLParser} which serves as the
basis for parsing text files formatted in SGML (Standard Generalized
Mark-up Language). In fact, it does not provide a full SGML parser
--- it only parses SGML insofar as it is used by HTML, and the module
only exists as a base for the \module{htmllib}\refstmodindex{htmllib}
only exists as a base for the \refmodule{htmllib}\refstmodindex{htmllib}
module.



@@ -49,8 +49,8 @@ implicitly at instantiation time.

\begin{methoddesc}{setnomoretags}{}
Stop processing tags. Treat all following input as literal input
(CDATA). (This is only provided so the HTML tag \code{<PLAINTEXT>}
can be implemented.)
(CDATA). (This is only provided so the HTML tag
\code{<PLAINTEXT>} can be implemented.)
\end{methoddesc}

\begin{methoddesc}{setliteral}{}

@@ -72,15 +72,15 @@ redefined version should always call \method{close()}.

\begin{methoddesc}{handle_starttag}{tag, method, attributes}
This method is called to handle start tags for which either a
\code{start_\var{tag}()} or \code{do_\var{tag}()} method has been
\method{start_\var{tag}()} or \method{do_\var{tag}()} method has been
defined. The \var{tag} argument is the name of the tag converted to
lower case, and the \var{method} argument is the bound method which
should be used to support semantic interpretation of the start tag.
The \var{attributes} argument is a list of \code{(\var{name}, \var{value})}
pairs containing the attributes found inside the tag's \code{<>}
brackets. The \var{name} has been translated to lower case and double
quotes and backslashes in the \var{value} have been interpreted. For
instance, for the tag \code{<A HREF="http://www.cwi.nl/">}, this
The \var{attributes} argument is a list of \code{(\var{name},
\var{value})} pairs containing the attributes found inside the tag's
\code{<>} brackets. The \var{name} has been translated to lower case
and double quotes and backslashes in the \var{value} have been interpreted.
For instance, for the tag \code{<A HREF="http://www.cwi.nl/">}, this
method would be called as \samp{unknown_starttag('a', [('href',
'http://www.cwi.nl/')])}. The base implementation simply calls
\var{method} with \var{attributes} as the only argument.

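As a rough illustration of the calling convention in this hunk, the sketch below passes an attribute list of `(name, value)` pairs to a handler the way the text says the base implementation does. The free functions `handle_starttag()` and `start_a()` and the list `seen_links` are stand-ins written for this example; in the module these would be methods on a parser class, which is not imported here.

```python
def handle_starttag(tag, method, attributes):
    # Mirrors the documented base behaviour: call the handler with the
    # attribute list as its only argument.
    method(attributes)


seen_links = []


def start_a(attributes):
    # attributes is a list of (name, value) pairs with lower-cased names,
    # so converting it to a dict makes lookups by name convenient.
    attrs = dict(attributes)
    if "href" in attrs:
        seen_links.append(attrs["href"])


# The tag <A HREF="http://www.cwi.nl/"> from the text, already parsed.
handle_starttag("a", start_a, [("href", "http://www.cwi.nl/")])
print(seen_links)
```

Converting the pairs to a dict is a convenience chosen for the example; it would discard repeated attribute names, which the list form preserves.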
@@ -88,11 +88,11 @@ method would be called as \samp{unknown_starttag('a', [('href',

\begin{methoddesc}{handle_endtag}{tag, method}
This method is called to handle endtags for which an
\code{end_\var{tag}()} method has been defined. The \var{tag}
argument is the name of the tag converted to lower case, and the
\var{method} argument is the bound method which should be used to
\method{end_\var{tag}()} method has been defined. The
\var{tag} argument is the name of the tag converted to lower case, and
the \var{method} argument is the bound method which should be used to
support semantic interpretation of the end tag. If no
\code{end_\var{tag}()} method is defined for the closing element,
\method{end_\var{tag}()} method is defined for the closing element,
this handler is not called. The base implementation simply calls
\var{method}.
\end{methoddesc}

@@ -120,12 +120,12 @@ This method is called to process a general entity reference of the
form \samp{\&\var{ref};} where \var{ref} is an general entity
reference. It looks for \var{ref} in the instance (or class)
variable \member{entitydefs} which should be a mapping from entity
names to corresponding translations.
If a translation is found, it calls the method \method{handle_data()}
with the translation; otherwise, it calls the method
\code{unknown_entityref(\var{ref})}. The default \member{entitydefs}
defines translations for \code{\&amp;}, \code{\&apos}, \code{\&gt;},
\code{\&lt;}, and \code{\&quot;}.
names to corresponding translations. If a translation is found, it
calls the method \method{handle_data()} with the translation;
otherwise, it calls the method \code{unknown_entityref(\var{ref})}.
The default \member{entitydefs} defines translations for
\code{\&amp;}, \code{\&apos}, \code{\&gt;}, \code{\&lt;}, and
\code{\&quot;}.
\end{methoddesc}

\begin{methoddesc}{handle_comment}{comment}

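The entity lookup described in this hunk amounts to a dictionary lookup with a fallback. The following is a minimal sketch of that logic, assuming a hypothetical class `EntityResolver` that only models the three methods named in the text; it is not the `SGMLParser` class, and the two lists it keeps are added purely so the effect can be inspected.

```python
class EntityResolver:
    """Toy model of the documented entity lookup; not sgmllib itself."""

    # Class-level default mapping from entity name to translation,
    # covering the five entities listed in the text.
    entitydefs = {"amp": "&", "apos": "'", "gt": ">", "lt": "<", "quot": '"'}

    def __init__(self):
        self.data = []      # text passed to handle_data()
        self.unknown = []   # names passed to unknown_entityref()

    def handle_data(self, data):
        self.data.append(data)

    def unknown_entityref(self, ref):
        self.unknown.append(ref)

    def handle_entityref(self, ref):
        # Instance attributes shadow the class default, so a subclass or
        # an instance can supply its own entitydefs mapping.
        table = self.entitydefs
        if ref in table:
            self.handle_data(table[ref])
        else:
            self.unknown_entityref(ref)


resolver = EntityResolver()
resolver.handle_entityref("lt")     # known: translated to "<"
resolver.handle_entityref("nbsp")   # not in the default table
print(resolver.data, resolver.unknown)
```

Extending the table in this model is a matter of assigning a larger mapping to `entitydefs` on a subclass or instance.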
@@ -175,8 +175,8 @@ case:

\begin{methoddescni}{start_\var{tag}}{attributes}
This method is called to process an opening tag \var{tag}. It has
preference over \code{do_\var{tag}()}. The \var{attributes}
argument has the same meaning as described for
preference over \method{do_\var{tag}()}. The
\var{attributes} argument has the same meaning as described for
\method{handle_starttag()} above.
\end{methoddescni}


@@ -192,10 +192,10 @@ This method is called to process a closing tag \var{tag}.

Note that the parser maintains a stack of open elements for which no
end tag has been found yet. Only tags processed by
\code{start_\var{tag}()} are pushed on this stack. Definition of an
\code{end_\var{tag}()} method is optional for these tags. For tags
processed by \code{do_\var{tag}()} or by \method{unknown_tag()}, no
\code{end_\var{tag}()} method must be defined; if defined, it will not
be used. If both \code{start_\var{tag}()} and \code{do_\var{tag}()}
methods exist for a tag, the \code{start_\var{tag}()} method takes
precedence.
\method{start_\var{tag}()} are pushed on this stack. Definition of an
\method{end_\var{tag}()} method is optional for these tags. For tags
processed by \method{do_\var{tag}()} or by \method{unknown_tag()}, no
\method{end_\var{tag}()} method must be defined; if defined, it will
not be used. If both \method{start_\var{tag}()} and
\method{do_\var{tag}()} methods exist for a tag, the
\method{start_\var{tag}()} method takes precedence.

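The stack rules in this last hunk can be modelled in a few lines. The sketch below is a simplification under stated assumptions: `OpenElementStack`, `open_tag()` and `close_tag()` are hypothetical names, and a closing tag is only honoured when it matches the top of the stack, since the text does not describe how mismatched or omitted end tags are handled.

```python
class OpenElementStack:
    """Toy model of the open-element stack; not the sgmllib parser."""

    def __init__(self):
        self.stack = []
        self.log = []

    def open_tag(self, tag):
        start = getattr(self, "start_" + tag, None)
        do = getattr(self, "do_" + tag, None)
        if start is not None:
            # start_<tag> takes precedence, and only these tags are pushed.
            self.stack.append(tag)
            start([])
        elif do is not None:
            do([])

    def close_tag(self, tag):
        # Assumption: only a tag on top of the stack can be closed here.
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
            end = getattr(self, "end_" + tag, None)
            if end is not None:  # defining end_<tag> is optional
                end()


class Demo(OpenElementStack):
    def start_ul(self, attributes):
        self.log.append("start ul")   # no end_ul defined: allowed

    def do_br(self, attributes):
        self.log.append("do br")      # never pushed on the stack

    def start_em(self, attributes):
        self.log.append("start em")

    def do_em(self, attributes):
        self.log.append("do em")      # shadowed by start_em

    def end_em(self):
        self.log.append("end em")


model = Demo()
model.open_tag("ul")
model.open_tag("br")
depth_after_br = len(model.stack)
model.open_tag("em")
model.close_tag("em")
model.close_tag("ul")
print(model.log, model.stack)
```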