:mod:`xml.sax.handler` --- Base classes for SAX handlers
========================================================

.. module:: xml.sax.handler
   :synopsis: Base classes for SAX event handlers.
.. moduleauthor:: Lars Marius Garshol <larsga@garshol.priv.no>
.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>


The SAX API defines four kinds of handlers: content handlers, DTD handlers,
error handlers, and entity resolvers. Applications normally only need to
implement those interfaces whose events they are interested in; they can
implement the interfaces in a single object or in multiple objects. Handler
implementations should inherit from the base classes provided in the module
:mod:`xml.sax.handler`, so that all methods get default implementations.


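As a minimal sketch of the inheritance pattern described above, the handler
below overrides a single callback and leaves every other method at its
inherited default. The class name ``ElementCounter`` and the sample document
are hypothetical and chosen only for illustration.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class ElementCounter(ContentHandler):
    """Hypothetical handler that counts start tags by element name."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Only this event is overridden; all other callbacks keep the
        # do-nothing defaults inherited from ContentHandler.
        self.counts[name] = self.counts.get(name, 0) + 1


handler = ElementCounter()
xml.sax.parseString(b"<root><item/><item/></root>", handler)
print(handler.counts)
```

Because the base class supplies the remaining methods, the parser can deliver
events such as ``characters`` or ``endDocument`` without the subclass having to
define them.
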
.. class:: ContentHandler

   This is the main callback interface in SAX, and the one most important to
   applications. The order of events in this interface mirrors the order of the
   information in the document.


.. class:: DTDHandler

   Handle DTD events.

   This interface specifies only those DTD events required for basic parsing
   (unparsed entities and attributes).


.. class:: EntityResolver

   Basic interface for resolving entities. If you create an object implementing
   this interface and register it with your parser, the parser will call
   the method in your object to resolve all external entities.


.. class:: ErrorHandler

   Interface used by the parser to present error and warning messages to the
   application.  The methods of this object control whether errors are immediately
   converted to exceptions or are handled in some other way.

In addition to these classes, :mod:`xml.sax.handler` provides symbolic constants
for the feature and property names.


.. data:: feature_namespaces

   Value: ``"http://xml.org/sax/features/namespaces"`` --- true: Perform Namespace
   processing. --- false: Optionally do not perform Namespace processing (implies
   namespace-prefixes; default). --- access: (parsing) read-only; (not parsing)
   read/write


.. data:: feature_namespace_prefixes

   Value: ``"http://xml.org/sax/features/namespace-prefixes"`` --- true: Report
   the original prefixed names and attributes used for Namespace
   declarations. --- false: Do not report attributes used for Namespace
   declarations, and optionally do not report original prefixed names
   (default). --- access: (parsing) read-only; (not parsing) read/write


.. data:: feature_string_interning

   Value: ``"http://xml.org/sax/features/string-interning"`` --- true: All element
   names, prefixes, attribute names, Namespace URIs, and local names are interned
   using the built-in intern function. --- false: Names are not necessarily
   interned, although they may be (default). --- access: (parsing) read-only; (not
   parsing) read/write


.. data:: feature_validation

   Value: ``"http://xml.org/sax/features/validation"`` --- true: Report all
   validation errors (implies external-general-entities and
   external-parameter-entities). --- false: Do not report validation errors. ---
   access: (parsing) read-only; (not parsing) read/write


.. data:: feature_external_ges

   Value: ``"http://xml.org/sax/features/external-general-entities"`` --- true:
   Include all external general (text) entities. --- false: Do not include
   external general entities. --- access: (parsing) read-only; (not parsing)
   read/write


.. data:: feature_external_pes

   Value: ``"http://xml.org/sax/features/external-parameter-entities"`` --- true:
   Include all external parameter entities, including the external DTD subset. ---
   false: Do not include any external parameter entities, even the external DTD
   subset. --- access: (parsing) read-only; (not parsing) read/write


.. data:: all_features

   List of all features.


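The feature constants are plain strings that are passed to a reader's
``getFeature`` and ``setFeature`` methods. The sketch below assumes the parser
returned by ``xml.sax.make_parser()`` is the bundled expat-based reader, which
leaves namespace processing off until it is requested and does not support
validation; other readers may behave differently.

```python
import xml.sax
from xml.sax import SAXNotRecognizedException, SAXNotSupportedException
from xml.sax.handler import all_features, feature_namespaces, feature_validation

parser = xml.sax.make_parser()

# Features can be read and written while the parser is not parsing.
namespaces_before = bool(parser.getFeature(feature_namespaces))
parser.setFeature(feature_namespaces, True)
namespaces_after = bool(parser.getFeature(feature_namespaces))

# A reader may refuse a feature it cannot provide (assumption: expat reader).
try:
    parser.setFeature(feature_validation, True)
    validation_supported = True
except (SAXNotSupportedException, SAXNotRecognizedException):
    validation_supported = False

print(namespaces_before, namespaces_after, validation_supported)
print(feature_namespaces in all_features)
```

Using the symbolic names rather than the raw URIs keeps typos from silently
turning into "feature not recognized" errors.
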
.. data:: property_lexical_handler

   Value: ``"http://xml.org/sax/properties/lexical-handler"`` --- data type:
   xml.sax.sax2lib.LexicalHandler (not supported in Python 2) --- description: An
   optional extension handler for lexical events like comments. --- access:
   read/write


.. data:: property_declaration_handler

   Value: ``"http://xml.org/sax/properties/declaration-handler"`` --- data type:
   xml.sax.sax2lib.DeclHandler (not supported in Python 2) --- description: An
   optional extension handler for DTD-related events other than notations and
   unparsed entities. --- access: read/write


.. data:: property_dom_node

   Value: ``"http://xml.org/sax/properties/dom-node"`` --- data type:
   org.w3c.dom.Node (not supported in Python 2) --- description: When parsing,
   the current DOM node being visited if this is a DOM iterator; when not parsing,
   the root DOM node for iteration. --- access: (parsing) read-only; (not parsing)
   read/write


.. data:: property_xml_string

   Value: ``"http://xml.org/sax/properties/xml-string"`` --- data type: String ---
   description: The literal string of characters that was the source for the
   current event. --- access: read-only


.. data:: all_properties

   List of all known property names.


.. _content-handler-objects:

ContentHandler Objects
----------------------

Users are expected to subclass :class:`ContentHandler` to support their
application.  The following methods are called by the parser on the appropriate
events in the input document:


.. method:: ContentHandler.setDocumentLocator(locator)

   Called by the parser to give the application a locator for locating the origin
   of document events.

   SAX parsers are strongly encouraged (though not absolutely required) to supply a
   locator: if a parser does so, it must supply the locator to the application by
   invoking this method before invoking any of the other methods in the
   :class:`ContentHandler` interface.

   The locator allows the application to determine the end position of any
   document-related event, even if the parser is not reporting an error. Typically,
   the application will use this information for reporting its own errors (such as
   character content that does not match an application's business rules). The
   information returned by the locator is probably not sufficient for use with a
   search engine.

   Note that the locator will return correct information only during the invocation
   of the events in this interface. The application should not attempt to use it at
   any other time.


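The following sketch shows one way to keep the locator and query it from
inside another event. ``PositionRecorder`` is a hypothetical name, and the
example assumes the default expat-based reader, which does supply a locator.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class PositionRecorder(ContentHandler):
    """Hypothetical handler that notes the line of each start tag."""

    def __init__(self):
        super().__init__()
        self.locator = None
        self.positions = []

    def setDocumentLocator(self, locator):
        # Called before any other event, if the parser provides a locator.
        self.locator = locator

    def startElement(self, name, attrs):
        # The locator is only meaningful while an event is being delivered.
        if self.locator is not None:
            self.positions.append((name, self.locator.getLineNumber()))


recorder = PositionRecorder()
xml.sax.parseString(b"<a>\n  <b/>\n</a>", recorder)
print(recorder.positions)
```

The guard against ``None`` reflects that a parser is encouraged, but not
required, to provide a locator.
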
.. method:: ContentHandler.startDocument()

   Receive notification of the beginning of a document.

   The SAX parser will invoke this method only once, before any other methods in
   this interface or in :class:`DTDHandler` (except for :meth:`setDocumentLocator`).


.. method:: ContentHandler.endDocument()

   Receive notification of the end of a document.

   The SAX parser will invoke this method only once, and it will be the last method
   invoked during the parse. The parser shall not invoke this method until it has
   either abandoned parsing (because of an unrecoverable error) or reached the end
   of input.


.. method:: ContentHandler.startPrefixMapping(prefix, uri)

   Begin the scope of a prefix-URI Namespace mapping.

   The information from this event is not necessary for normal Namespace
   processing: the SAX XML reader will automatically replace prefixes for element
   and attribute names when the ``feature_namespaces`` feature is enabled.

   There are cases, however, when applications need to use prefixes in character
   data or in attribute values, where they cannot safely be expanded automatically;
   the :meth:`startPrefixMapping` and :meth:`endPrefixMapping` events supply the
   information to the application to expand prefixes in those contexts itself, if
   necessary.

   Note that :meth:`startPrefixMapping` and :meth:`endPrefixMapping` events are not
   guaranteed to be properly nested relative to each other: all
   :meth:`startPrefixMapping` events will occur before the corresponding
   :meth:`startElement` event, and all :meth:`endPrefixMapping` events will occur
   after the corresponding :meth:`endElement` event, but their order is not
   guaranteed.


.. method:: ContentHandler.endPrefixMapping(prefix)

   End the scope of a prefix-URI mapping.

   See :meth:`startPrefixMapping` for details. This event will always occur after
   the corresponding :meth:`endElement` event, but the order of
   :meth:`endPrefixMapping` events is not otherwise guaranteed.


.. method:: ContentHandler.startElement(name, attrs)

   Signals the start of an element in non-namespace mode.

   The *name* parameter contains the raw XML 1.0 name of the element type as a
   string and the *attrs* parameter holds an object of the :class:`Attributes`
   interface (see :ref:`attributes-objects`) containing the attributes of
   the element.  The object passed as *attrs* may be re-used by the parser; holding
   on to a reference to it is not a reliable way to keep a copy of the attributes.
   To keep a copy of the attributes, use the :meth:`copy` method of the *attrs*
   object.


.. method:: ContentHandler.endElement(name)

   Signals the end of an element in non-namespace mode.

   The *name* parameter contains the name of the element type, just as with the
   :meth:`startElement` event.


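A short sketch of the advice about *attrs*: the handler below stores a copy of
each attributes object instead of the object itself. The class name
``AttributeKeeper`` and the sample document are hypothetical.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class AttributeKeeper(ContentHandler):
    """Hypothetical handler that keeps attributes beyond the event."""

    def __init__(self):
        super().__init__()
        self.saved = {}
        self.closed = []

    def startElement(self, name, attrs):
        # The parser may re-use attrs, so store an independent copy.
        self.saved[name] = attrs.copy()

    def endElement(self, name):
        self.closed.append(name)


keeper = AttributeKeeper()
xml.sax.parseString(b'<list><item id="7" kind="x"/></list>', keeper)
print(keeper.saved["item"].getValue("id"), keeper.closed)
```

The copy supports the same lookups as the original object, so it can be
inspected after parsing has finished.
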
.. method:: ContentHandler.startElementNS(name, qname, attrs)

   Signals the start of an element in namespace mode.

   The *name* parameter contains the name of the element type as a ``(uri,
   localname)`` tuple, the *qname* parameter contains the raw XML 1.0 name used in
   the source document, and the *attrs* parameter holds an instance of the
   :class:`AttributesNS` interface (see :ref:`attributes-ns-objects`)
   containing the attributes of the element.  If no namespace is associated with
   the element, the *uri* component of *name* will be ``None``.  The object passed
   as *attrs* may be re-used by the parser; holding on to a reference to it is not
   a reliable way to keep a copy of the attributes.  To keep a copy of the
   attributes, use the :meth:`copy` method of the *attrs* object.

   Parsers may set the *qname* parameter to ``None``, unless the
   ``feature_namespace_prefixes`` feature is activated.


.. method:: ContentHandler.endElementNS(name, qname)

   Signals the end of an element in namespace mode.

   The *name* parameter contains the name of the element type, just as with the
   :meth:`startElementNS` method; the same applies to the *qname* parameter.


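To illustrate namespace mode together with the prefix-mapping events, the
sketch below turns on ``feature_namespaces`` explicitly, since the expat-based
reader returned by ``xml.sax.make_parser()`` is assumed to leave it off
otherwise. ``NamespaceRecorder`` and the ``urn:example`` URI are hypothetical.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NamespaceRecorder(ContentHandler):
    """Hypothetical handler that logs namespace-related events."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startPrefixMapping(self, prefix, uri):
        self.events.append(("map", prefix, uri))

    def startElementNS(self, name, qname, attrs):
        uri, localname = name  # uri is None for un-namespaced elements
        self.events.append(("start", uri, localname))

    def endPrefixMapping(self, prefix):
        self.events.append(("unmap", prefix))


parser = xml.sax.make_parser()
parser.setFeature(feature_namespaces, True)
recorder = NamespaceRecorder()
parser.setContentHandler(recorder)
parser.parse(io.BytesIO(b'<x:doc xmlns:x="urn:example"><plain/></x:doc>'))
print(recorder.events)
```

The handler ignores *qname* because, as noted above, a parser is allowed to
pass ``None`` for it.
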
.. method:: ContentHandler.characters(content)

   Receive notification of character data.

   The parser will call this method to report each chunk of character data. SAX
   parsers may return all contiguous character data in a single chunk, or they may
   split it into several chunks; however, all of the characters in any single event
   must come from the same external entity so that the locator provides useful
   information.

   *content* may be a string or bytes instance; the ``expat`` reader module
   always produces strings.

   .. note::

      The earlier SAX 1 interface provided by the Python XML Special Interest Group
      used a more Java-like interface for this method.  Since most parsers used from
      Python did not take advantage of the older interface, the simpler signature was
      chosen to replace it.  To convert old code to the new interface, use *content*
      instead of slicing content with the old *offset* and *length* parameters.


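Because character data may arrive in one chunk or several, a handler should
not assume that a single ``characters`` call carries the whole text of an
element. A minimal sketch, with the hypothetical class name ``TextCollector``,
that buffers chunks and joins them when the element ends:

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TextCollector(ContentHandler):
    """Hypothetical handler that reassembles the text of leaf elements."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self.texts = []

    def startElement(self, name, attrs):
        self._chunks = []  # start a fresh buffer for this element

    def characters(self, content):
        # May be called once or many times for the same run of text.
        self._chunks.append(content)

    def endElement(self, name):
        self.texts.append((name, "".join(self._chunks)))


collector = TextCollector()
xml.sax.parseString(b"<msg>fish &amp; chips</msg>", collector)
print(collector.texts)
```

Joining the buffer at ``endElement`` gives the same result regardless of how
the parser chose to split the data.
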
.. method:: ContentHandler.ignorableWhitespace(whitespace)

   Receive notification of ignorable whitespace in element content.

   Validating parsers must use this method to report each chunk of ignorable
   whitespace (see the W3C XML 1.0 recommendation, section 2.10); non-validating
   parsers may also use this method if they are capable of parsing and using
   content models.

   SAX parsers may return all contiguous whitespace in a single chunk, or they may
   split it into several chunks; however, all of the characters in any single event
   must come from the same external entity, so that the locator provides useful
   information.


.. method:: ContentHandler.processingInstruction(target, data)

   Receive notification of a processing instruction.

   The parser will invoke this method once for each processing instruction found.
   Note that processing instructions may occur before or after the main document
   element.

   A SAX parser should never report an XML declaration (XML 1.0, section 2.8) or a
   text declaration (XML 1.0, section 4.3.1) using this method.


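The sketch below records processing instructions that appear before and after
the document element, and shows that the XML declaration is not among them.
``PICollector`` and the instruction names are hypothetical.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class PICollector(ContentHandler):
    """Hypothetical handler that gathers processing instructions."""

    def __init__(self):
        super().__init__()
        self.instructions = []

    def processingInstruction(self, target, data):
        self.instructions.append((target, data))


document = (
    b'<?xml version="1.0"?>'
    b'<?xml-stylesheet href="style.css"?>'
    b'<doc/>'
    b'<?audit checked="yes"?>'
)
collector = PICollector()
xml.sax.parseString(document, collector)
print(collector.instructions)
```
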
.. method:: ContentHandler.skippedEntity(name)

   Receive notification of a skipped entity.

   The parser will invoke this method once for each entity skipped. Non-validating
   processors may skip entities if they have not seen the declarations (because,
   for example, the entity was declared in an external DTD subset). All processors
   may skip external entities, depending on the values of the
   ``feature_external_ges`` and the ``feature_external_pes`` features.


.. _dtd-handler-objects:

DTDHandler Objects
------------------

:class:`DTDHandler` instances provide the following methods:


.. method:: DTDHandler.notationDecl(name, publicId, systemId)

   Handle a notation declaration event.


.. method:: DTDHandler.unparsedEntityDecl(name, publicId, systemId, ndata)

   Handle an unparsed entity declaration event.


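A minimal sketch of a DTD handler, assuming the default expat-based reader
reports both kinds of declaration from the internal DTD subset. The class name
``DeclarationRecorder`` and the declarations in the sample document are
hypothetical.

```python
import io
import xml.sax
from xml.sax.handler import DTDHandler


class DeclarationRecorder(DTDHandler):
    """Hypothetical handler that stores the DTD declarations it receives."""

    def __init__(self):
        self.notations = []
        self.entities = []

    def notationDecl(self, name, publicId, systemId):
        self.notations.append((name, publicId, systemId))

    def unparsedEntityDecl(self, name, publicId, systemId, ndata):
        self.entities.append((name, publicId, systemId, ndata))


document = b"""<!DOCTYPE doc [
  <!NOTATION GIF SYSTEM "image/gif">
  <!ENTITY img SYSTEM "logo.gif" NDATA GIF>
]>
<doc></doc>"""

parser = xml.sax.make_parser()
recorder = DeclarationRecorder()
parser.setDTDHandler(recorder)
parser.parse(io.BytesIO(document))
print(recorder.notations, recorder.entities)
```

Identifiers that a declaration does not provide, such as the public identifier
here, are passed as ``None``.
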
.. _entity-resolver-objects:

EntityResolver Objects
----------------------


.. method:: EntityResolver.resolveEntity(publicId, systemId)

   Resolve the system identifier of an entity and return either the system
   identifier to read from as a string, or an InputSource to read from. The default
   implementation returns *systemId*.


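The sketch below returns an ``InputSource`` backed by an in-memory byte stream
instead of letting the parser open the system identifier. It assumes the
expat-based reader and switches on ``feature_external_ges`` explicitly, since
a reader may not process external general entities otherwise; only enable this
for input you trust. ``InMemoryResolver``, ``ElementLog``, and the entity
content are hypothetical.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource


class InMemoryResolver(EntityResolver):
    """Hypothetical resolver that serves entity content from memory."""

    def __init__(self):
        self.requests = []

    def resolveEntity(self, publicId, systemId):
        self.requests.append((publicId, systemId))
        source = InputSource()
        source.setByteStream(io.BytesIO(b"<entity/>"))
        return source


class ElementLog(ContentHandler):
    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


document = b"""<!DOCTYPE doc [
  <!ENTITY chapter SYSTEM "chapter.xml">
]>
<doc>&chapter;</doc>"""

parser = xml.sax.make_parser()
parser.setFeature(feature_external_ges, True)
resolver = InMemoryResolver()
log = ElementLog()
parser.setEntityResolver(resolver)
parser.setContentHandler(log)
parser.parse(io.BytesIO(document))
print(resolver.requests, log.names)
```

Returning a string instead would ask the parser to open that system identifier
itself, which is what the default implementation does with *systemId*.
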
.. _sax-error-handler:

ErrorHandler Objects
--------------------

Objects with this interface are used to receive error and warning information
from the :class:`XMLReader`.  If you create an object that implements this
interface and register it with your :class:`XMLReader`, the parser
will call the methods in your object to report all warnings and errors. There
are three levels of errors available: warnings, (possibly) recoverable errors,
and unrecoverable errors.  All methods take a :exc:`SAXParseException` as the
only parameter.  Errors and warnings may be converted to an exception by raising
the passed-in exception object.


.. method:: ErrorHandler.error(exception)

   Called when the parser encounters a recoverable error.  If this method does not
   raise an exception, parsing may continue, but further document information
   should not be expected by the application.  Allowing the parser to continue may
   allow additional errors to be discovered in the input document.


.. method:: ErrorHandler.fatalError(exception)

   Called when the parser encounters an error it cannot recover from; parsing is
   expected to terminate when this method returns.


.. method:: ErrorHandler.warning(exception)

   Called when the parser presents minor warning information to the application.
   Parsing is expected to continue when this method returns, and document
   information will continue to be passed to the application. Raising an exception
   in this method will cause parsing to end.

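As a sketch of the three levels in practice, the handler below records
warnings and recoverable errors and converts only fatal errors into
exceptions by raising the object it was given. ``CollectingErrorHandler`` is a
hypothetical name, and the sample input is deliberately not well formed.

```python
import xml.sax
from xml.sax.handler import ContentHandler, ErrorHandler


class CollectingErrorHandler(ErrorHandler):
    """Hypothetical handler that logs problems and stops on fatal errors."""

    def __init__(self):
        self.problems = []

    def warning(self, exception):
        self.problems.append(("warning", str(exception)))

    def error(self, exception):
        # Not raising here lets the parser try to continue.
        self.problems.append(("error", str(exception)))

    def fatalError(self, exception):
        self.problems.append(("fatal", str(exception)))
        raise exception  # convert the report into an exception


errors = CollectingErrorHandler()
try:
    xml.sax.parseString(b"<a>\n<b></a>", ContentHandler(), errors)
    parsed = True
except xml.sax.SAXParseException:
    parsed = False

print(parsed, [level for level, _ in errors.problems])
```

Which conditions a given parser reports as warnings or recoverable errors
varies; the mismatched end tag used here is a well-formedness violation and is
reported through ``fatalError``.
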