| :mod:`xml.sax.handler` --- Base classes for SAX handlers
 | |
| ========================================================
 | |
| 
 | |
| .. module:: xml.sax.handler
 | |
|    :synopsis: Base classes for SAX event handlers.
 | |
| .. moduleauthor:: Lars Marius Garshol <larsga@garshol.priv.no>
 | |
| .. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>
 | |
| 
 | |
| 
 | |
| The SAX API defines four kinds of handlers: content handlers, DTD handlers,
 | |
| error handlers, and entity resolvers. Applications normally only need to
 | |
| implement those interfaces whose events they are interested in; they can
 | |
| implement the interfaces in a single object or in multiple objects. Handler
 | |
| implementations should inherit from the base classes provided in the module
 | |
| :mod:`xml.sax.handler`, so that all methods get default implementations.
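As a minimal sketch of the pattern described above, the following handler inherits from :class:`ContentHandler` and overrides only the one event it cares about; every other event falls through to the default no-op implementation. The class name ``ElementCounter`` and the sample document are illustrative assumptions, not part of the module.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class ElementCounter(ContentHandler):
    """Hypothetical handler that counts start tags by element name."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Only this event is overridden; all others keep their defaults.
        self.counts[name] = self.counts.get(name, 0) + 1


counter = ElementCounter()
xml.sax.parseString(b"<root><item/><item/></root>", counter)
print(counter.counts)
```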
 | |
| 
 | |
| 
 | |
| .. class:: ContentHandler
 | |
| 
 | |
|    This is the main callback interface in SAX, and the one most important to
 | |
|    applications. The order of events in this interface mirrors the order of the
 | |
|    information in the document.
 | |
| 
 | |
| 
 | |
| .. class:: DTDHandler
 | |
| 
 | |
|    Handle DTD events.
 | |
| 
 | |
|    This interface specifies only those DTD events required for basic parsing
 | |
|    (unparsed entities and attributes).
 | |
| 
 | |
| 
 | |
| .. class:: EntityResolver
 | |
| 
 | |
|    Basic interface for resolving entities. If you create an object implementing
 | |
|    this interface and register it with your Parser, the parser will call
 | |
|    the method in your object to resolve all external entities.
 | |
| 
 | |
| 
 | |
| .. class:: ErrorHandler
 | |
| 
 | |
|    Interface used by the parser to present error and warning messages to the
 | |
|    application.  The methods of this object control whether errors are immediately
 | |
|    converted to exceptions or are handled in some other way.
 | |
| 
 | |
| In addition to these classes, :mod:`xml.sax.handler` provides symbolic constants
 | |
| for the feature and property names.
 | |
| 
 | |
| 
 | |
| .. data:: feature_namespaces
 | |
| 
 | |
|    Value: ``"http://xml.org/sax/features/namespaces"`` ---  true: Perform Namespace
 | |
|    processing. ---  false: Optionally do not perform Namespace processing (implies
 | |
|    namespace-prefixes; default). ---  access: (parsing) read-only; (not parsing)
 | |
|    read/write
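The feature constants are meant to be passed to the reader's ``getFeature()`` and ``setFeature()`` methods. The sketch below assumes the default reader returned by ``xml.sax.make_parser()`` and toggles Namespace processing before any parsing has started, which is when the feature is writable.

```python
import xml.sax
from xml.sax.handler import feature_namespaces

parser = xml.sax.make_parser()

# Query the current setting, then enable Namespace processing.
before = parser.getFeature(feature_namespaces)
parser.setFeature(feature_namespaces, True)
after = parser.getFeature(feature_namespaces)

print(bool(before), bool(after))
```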
 | |
| 
 | |
| 
 | |
| .. data:: feature_namespace_prefixes
 | |
| 
 | |
|    Value: ``"http://xml.org/sax/features/namespace-prefixes"`` --- true: Report
 | |
|    the original prefixed names and attributes used for Namespace
 | |
|    declarations. --- false: Do not report attributes used for Namespace
 | |
|    declarations, and optionally do not report original prefixed names
 | |
|    (default). --- access: (parsing) read-only; (not parsing) read/write
 | |
| 
 | |
| 
 | |
| .. data:: feature_string_interning
 | |
| 
 | |
|    Value: ``"http://xml.org/sax/features/string-interning"`` ---  true: All element
 | |
|    names, prefixes, attribute names, Namespace URIs, and local names are interned
 | |
|    using the :func:`sys.intern` function. ---  false: Names are not necessarily
 | |
|    interned, although they may be (default). ---  access: (parsing) read-only; (not
 | |
|    parsing) read/write
 | |
| 
 | |
| 
 | |
| .. data:: feature_validation
 | |
| 
 | |
|    Value: ``"http://xml.org/sax/features/validation"`` --- true: Report all
 | |
|    validation errors (implies external-general-entities and
 | |
|    external-parameter-entities). --- false: Do not report validation errors. ---
 | |
|    access: (parsing) read-only; (not parsing) read/write
 | |
| 
 | |
| 
 | |
| .. data:: feature_external_ges
 | |
| 
 | |
|    Value: ``"http://xml.org/sax/features/external-general-entities"`` ---  true:
 | |
|    Include all external general (text) entities. ---  false: Do not include
 | |
|    external general entities. ---  access: (parsing) read-only; (not parsing)
 | |
|    read/write
 | |
| 
 | |
| 
 | |
| .. data:: feature_external_pes
 | |
| 
 | |
|    Value: ``"http://xml.org/sax/features/external-parameter-entities"`` ---  true:
 | |
|    Include all external parameter entities, including the external DTD subset. ---
 | |
|    false: Do not include any external parameter entities, even the external DTD
 | |
|    subset. ---  access: (parsing) read-only; (not parsing) read/write
 | |
| 
 | |
| 
 | |
| .. data:: all_features
 | |
| 
 | |
|    List of all features.
 | |
| 
 | |
| 
 | |
| .. data:: property_lexical_handler
 | |
| 
 | |
|    Value: ``"http://xml.org/sax/properties/lexical-handler"`` ---  data type:
 | |
|    xml.sax.sax2lib.LexicalHandler (not supported in Python 2) ---  description: An
 | |
|    optional extension handler for lexical events like comments. ---  access:
 | |
|    read/write
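Properties are set with the reader's ``setProperty()`` method. The following sketch registers a lexical handler to pick up comments. It assumes the expat-based reader, which is understood to call ``comment``, ``startCDATA``, ``endCDATA``, ``startDTD`` and ``endDTD`` on the registered object; the class ``CommentCollector`` is a hypothetical name used only for illustration.

```python
import io
import xml.sax
from xml.sax.handler import property_lexical_handler


class CommentCollector:
    """Hypothetical lexical handler; only comments are recorded."""

    def __init__(self):
        self.comments = []

    def comment(self, content):
        self.comments.append(content)

    # The remaining lexical events are accepted and ignored.
    def startCDATA(self):
        pass

    def endCDATA(self):
        pass

    def startDTD(self, name, public_id, system_id):
        pass

    def endDTD(self):
        pass


collector = CommentCollector()
parser = xml.sax.make_parser()
parser.setProperty(property_lexical_handler, collector)
parser.parse(io.BytesIO(b"<doc><!-- note --></doc>"))
print(collector.comments)
```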
 | |
| 
 | |
| 
 | |
| .. data:: property_declaration_handler
 | |
| 
 | |
|    Value: ``"http://xml.org/sax/properties/declaration-handler"`` ---  data type:
 | |
|    xml.sax.sax2lib.DeclHandler (not supported in Python 2) ---  description: An
 | |
|    optional extension handler for DTD-related events other than notations and
 | |
|    unparsed entities. ---  access: read/write
 | |
| 
 | |
| 
 | |
| .. data:: property_dom_node
 | |
| 
 | |
|    Value: ``"http://xml.org/sax/properties/dom-node"`` ---  data type:
 | |
|    org.w3c.dom.Node (not supported in Python 2)  ---  description: When parsing,
 | |
|    the current DOM node being visited if this is a DOM iterator; when not parsing,
 | |
|    the root DOM node for iteration. ---  access: (parsing) read-only; (not parsing)
 | |
|    read/write
 | |
| 
 | |
| 
 | |
| .. data:: property_xml_string
 | |
| 
 | |
|    Value: ``"http://xml.org/sax/properties/xml-string"`` ---  data type: String ---
 | |
|    description: The literal string of characters that was the source for the
 | |
|    current event. ---  access: read-only
 | |
| 
 | |
| 
 | |
| .. data:: all_properties
 | |
| 
 | |
|    List of all known property names.
 | |
| 
 | |
| 
 | |
| .. _content-handler-objects:
 | |
| 
 | |
| ContentHandler Objects
 | |
| ----------------------
 | |
| 
 | |
| Users are expected to subclass :class:`ContentHandler` to support their
 | |
| application.  The following methods are called by the parser on the appropriate
 | |
| events in the input document:
 | |
| 
 | |
| 
 | |
| .. method:: ContentHandler.setDocumentLocator(locator)
 | |
| 
 | |
|    Called by the parser to give the application a locator for locating the origin
 | |
|    of document events.
 | |
| 
 | |
|    SAX parsers are strongly encouraged (though not absolutely required) to supply a
 | |
|    locator. If a parser does so, it must supply the locator to the application by
 | |
|    invoking this method before invoking any of the other methods in the
 | |
|    :class:`ContentHandler` interface.
 | |
| 
 | |
|    The locator allows the application to determine the end position of any
 | |
|    document-related event, even if the parser is not reporting an error. Typically,
 | |
|    the application will use this information for reporting its own errors (such as
 | |
|    character content that does not match an application's business rules). The
 | |
|    information returned by the locator is probably not sufficient for use with a
 | |
|    search engine.
 | |
| 
 | |
|    Note that the locator will return correct information only during the invocation
 | |
|    of the events in this interface. The application should not attempt to use it at
 | |
|    any other time.
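A minimal sketch of using the locator follows. The handler stores the locator when it is supplied and queries it only from inside another event callback, as the note above requires. The class name ``PositionLogger`` is an assumption for illustration, and the exact position reported for an event depends on the underlying parser.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class PositionLogger(ContentHandler):
    """Hypothetical handler that records the line of each start tag."""

    def __init__(self):
        super().__init__()
        self.locator = None
        self.positions = []

    def setDocumentLocator(self, locator):
        # Keep the locator; it is only meaningful during later callbacks.
        self.locator = locator

    def startElement(self, name, attrs):
        line = self.locator.getLineNumber() if self.locator else None
        self.positions.append((name, line))


logger = PositionLogger()
xml.sax.parseString(b"<root>\n  <child/>\n</root>", logger)
print(logger.positions)
```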
 | |
| 
 | |
| 
 | |
| .. method:: ContentHandler.startDocument()
 | |
| 
 | |
|    Receive notification of the beginning of a document.
 | |
| 
 | |
|    The SAX parser will invoke this method only once, before any other methods in
 | |
|    this interface or in DTDHandler (except for :meth:`setDocumentLocator`).
 | |
| 
 | |
| 
 | |
| .. method:: ContentHandler.endDocument()
 | |
| 
 | |
|    Receive notification of the end of a document.
 | |
| 
 | |
|    The SAX parser will invoke this method only once, and it will be the last method
 | |
|    invoked during the parse. The parser shall not invoke this method until it has
 | |
|    either abandoned parsing (because of an unrecoverable error) or reached the end
 | |
|    of input.
 | |
| 
 | |
| 
 | |
| .. method:: ContentHandler.startPrefixMapping(prefix, uri)
 | |
| 
 | |
|    Begin the scope of a prefix-URI Namespace mapping.
 | |
| 
 | |
|    The information from this event is not necessary for normal Namespace
 | |
|    processing: the SAX XML reader will automatically replace prefixes for element
 | |
|    and attribute names when the ``feature_namespaces`` feature is enabled. Note
 | |
|    that the expat-based reader shipped with Python leaves this feature disabled
 | |
|    unless the application turns it on.
 | |
| 
 | |
|    There are cases, however, when applications need to use prefixes in character
 | |
|    data or in attribute values, where they cannot safely be expanded automatically;
 | |
|    the :meth:`startPrefixMapping` and :meth:`endPrefixMapping` events supply the
 | |
|    information to the application to expand prefixes in those contexts itself, if
 | |
|    necessary.
 | |
| 
 | |
|    Note that :meth:`startPrefixMapping` and :meth:`endPrefixMapping` events are not
 | |
|    guaranteed to be properly nested relative to each other: all
 | |
|    :meth:`startPrefixMapping` events will occur before the corresponding
 | |
|    :meth:`startElement` event, and all :meth:`endPrefixMapping` events will occur
 | |
|    after the corresponding :meth:`endElement` event, but their order is not
 | |
|    guaranteed.
 | |
| 
 | |
| 
 | |
| .. method:: ContentHandler.endPrefixMapping(prefix)
 | |
| 
 | |
|    End the scope of a prefix-URI mapping.
 | |
| 
 | |
|    See :meth:`startPrefixMapping` for details. This event will always occur after
 | |
|    the corresponding :meth:`endElement` event, but the order of
 | |
|    :meth:`endPrefixMapping` events is not otherwise guaranteed.
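The prefix-mapping events can be observed with a sketch like the one below. It enables ``feature_namespaces`` explicitly and records each mapping as its scope begins and ends. The class name ``PrefixTracker`` and the namespace URI are illustrative assumptions.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class PrefixTracker(ContentHandler):
    """Hypothetical handler that logs prefix-mapping scopes."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startPrefixMapping(self, prefix, uri):
        self.events.append(("start", prefix, uri))

    def endPrefixMapping(self, prefix):
        self.events.append(("end", prefix))


tracker = PrefixTracker()
parser = xml.sax.make_parser()
parser.setFeature(feature_namespaces, True)  # prefix events need namespace mode
parser.setContentHandler(tracker)
parser.parse(io.BytesIO(b'<doc xmlns:ex="http://example.com/ns"><ex:item/></doc>'))
print(tracker.events)
```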
 | |
| 
 | |
| 
 | |
| .. method:: ContentHandler.startElement(name, attrs)
 | |
| 
 | |
|    Signals the start of an element in non-namespace mode.
 | |
| 
 | |
|    The *name* parameter contains the raw XML 1.0 name of the element type as a
 | |
|    string and the *attrs* parameter holds an object of the :class:`Attributes`
 | |
|    interface (see :ref:`attributes-objects`) containing the attributes of
 | |
|    the element.  The object passed as *attrs* may be re-used by the parser; holding
 | |
|    on to a reference to it is not a reliable way to keep a copy of the attributes.
 | |
|    To keep a copy of the attributes, use the :meth:`copy` method of the *attrs*
 | |
|    object.
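Because the *attrs* object may be reused, a handler that needs the attributes after the callback returns should copy them first. The sketch below, with the hypothetical class name ``AttributeKeeper``, stores a copy per element name and reads a value back once parsing has finished.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class AttributeKeeper(ContentHandler):
    """Hypothetical handler that keeps a copy of each element's attributes."""

    def __init__(self):
        super().__init__()
        self.saved = {}

    def startElement(self, name, attrs):
        # copy() yields an object that is safe to keep after the callback.
        self.saved[name] = attrs.copy()


keeper = AttributeKeeper()
xml.sax.parseString(b'<root><item id="1" kind="a"/></root>', keeper)
item_attrs = keeper.saved["item"]
print(sorted(item_attrs.getNames()), item_attrs.getValue("id"))
```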
 | |
| 
 | |
| 
 | |
| .. method:: ContentHandler.endElement(name)
 | |
| 
 | |
|    Signals the end of an element in non-namespace mode.
 | |
| 
 | |
|    The *name* parameter contains the name of the element type, just as with the
 | |
|    :meth:`startElement` event.
 | |
| 
 | |
| 
 | |
| .. method:: ContentHandler.startElementNS(name, qname, attrs)
 | |
| 
 | |
|    Signals the start of an element in namespace mode.
 | |
| 
 | |
|    The *name* parameter contains the name of the element type as a ``(uri,
 | |
|    localname)`` tuple, the *qname* parameter contains the raw XML 1.0 name used in
 | |
|    the source document, and the *attrs* parameter holds an instance of the
 | |
|    :class:`AttributesNS` interface (see :ref:`attributes-ns-objects`)
 | |
|    containing the attributes of the element.  If no namespace is associated with
 | |
|    the element, the *uri* component of *name* will be ``None``.  The object passed
 | |
|    as *attrs* may be re-used by the parser; holding on to a reference to it is not
 | |
|    a reliable way to keep a copy of the attributes.  To keep a copy of the
 | |
|    attributes, use the :meth:`copy` method of the *attrs* object.
 | |
| 
 | |
|    Parsers may set the *qname* parameter to ``None``, unless the
 | |
|    ``feature_namespace_prefixes`` feature is activated.
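In namespace mode the *name* argument arrives as a ``(uri, localname)`` tuple. The following sketch unpacks it for an element with and without a namespace; the class name ``NamespaceLogger`` and the URI are assumptions made for the example.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NamespaceLogger(ContentHandler):
    """Hypothetical handler that records expanded element names."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def startElementNS(self, name, qname, attrs):
        uri, localname = name  # uri is None when no namespace applies
        self.elements.append((uri, localname))


ns_logger = NamespaceLogger()
parser = xml.sax.make_parser()
parser.setFeature(feature_namespaces, True)
parser.setContentHandler(ns_logger)
parser.parse(io.BytesIO(b'<doc><ex:item xmlns:ex="http://example.com/ns"/></doc>'))
print(ns_logger.elements)
```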
 | |
| 
 | |
| 
 | |
| .. method:: ContentHandler.endElementNS(name, qname)
 | |
| 
 | |
|    Signals the end of an element in namespace mode.
 | |
| 
 | |
|    The *name* and *qname* parameters contain the name of the element type, just
 | |
|    as with the :meth:`startElementNS` method.
 | |
| 
 | |
| 
 | |
| .. method:: ContentHandler.characters(content)
 | |
| 
 | |
|    Receive notification of character data.
 | |
| 
 | |
|    The Parser will call this method to report each chunk of character data. SAX
 | |
|    parsers may return all contiguous character data in a single chunk, or they may
 | |
|    split it into several chunks; however, all of the characters in any single event
 | |
|    must come from the same external entity so that the Locator provides useful
 | |
|    information.
 | |
| 
 | |
|    *content* may be a string or bytes instance; the ``expat`` reader module
 | |
|    always produces strings.
 | |
| 
 | |
|    .. note::
 | |
| 
 | |
|       The earlier SAX 1 interface provided by the Python XML Special Interest Group
 | |
|       used a more Java-like interface for this method.  Since most parsers used from
 | |
|       Python did not take advantage of the older interface, the simpler signature was
 | |
|       chosen to replace it.  To convert old code to the new interface, use *content*
 | |
|       instead of slicing content with the old *offset* and *length* parameters.
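Since the parser is free to deliver character data in several chunks, a handler should not assume that one call to ``characters()`` carries the whole text of an element. One common approach, sketched here with the hypothetical class name ``TextCollector``, is to buffer the chunks and join them when the element ends. The sketch only handles elements without child elements.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TextCollector(ContentHandler):
    """Hypothetical handler that reassembles text split across chunks."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self.texts = []

    def startElement(self, name, attrs):
        self._chunks = []

    def characters(self, content):
        # May be called more than once for a single run of text.
        self._chunks.append(content)

    def endElement(self, name):
        self.texts.append((name, "".join(self._chunks)))
        self._chunks = []


text_collector = TextCollector()
xml.sax.parseString(b"<p>fish &amp; chips</p>", text_collector)
print(text_collector.texts)
```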
 | |
| 
 | |
| 
 | |
| .. method:: ContentHandler.ignorableWhitespace(whitespace)
 | |
| 
 | |
|    Receive notification of ignorable whitespace in element content.
 | |
| 
 | |
|    Validating Parsers must use this method to report each chunk of ignorable
 | |
|    whitespace (see the W3C XML 1.0 recommendation, section 2.10); non-validating
 | |
|    parsers may also use this method if they are capable of parsing and using
 | |
|    content models.
 | |
| 
 | |
|    SAX parsers may return all contiguous whitespace in a single chunk, or they may
 | |
|    split it into several chunks; however, all of the characters in any single event
 | |
|    must come from the same external entity, so that the Locator provides useful
 | |
|    information.
 | |
| 
 | |
| 
 | |
| .. method:: ContentHandler.processingInstruction(target, data)
 | |
| 
 | |
|    Receive notification of a processing instruction.
 | |
| 
 | |
|    The Parser will invoke this method once for each processing instruction found.
 | |
|    Note that processing instructions may occur before or after the main document
 | |
|    element.
 | |
| 
 | |
|    A SAX parser should never report an XML declaration (XML 1.0, section 2.8) or a
 | |
|    text declaration (XML 1.0, section 4.3.1) using this method.
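The sketch below illustrates both points: a processing instruction placed before the document element is reported, while the XML declaration is not. The class name ``PICollector`` and the stylesheet instruction are assumptions chosen for the example.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class PICollector(ContentHandler):
    """Hypothetical handler that records processing instructions."""

    def __init__(self):
        super().__init__()
        self.instructions = []

    def processingInstruction(self, target, data):
        self.instructions.append((target, data))


pi_collector = PICollector()
document = (
    b'<?xml version="1.0"?>'
    b'<?xml-stylesheet href="style.css" type="text/css"?>'
    b"<doc/>"
)
xml.sax.parseString(document, pi_collector)
print(pi_collector.instructions)
```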
 | |
| 
 | |
| 
 | |
| .. method:: ContentHandler.skippedEntity(name)
 | |
| 
 | |
|    Receive notification of a skipped entity.
 | |
| 
 | |
|    The Parser will invoke this method once for each entity skipped. Non-validating
 | |
|    processors may skip entities if they have not seen the declarations (because,
 | |
|    for example, the entity was declared in an external DTD subset). All processors
 | |
|    may skip external entities, depending on the values of the
 | |
|    ``feature_external_ges`` and the ``feature_external_pes`` features.
 | |
| 
 | |
| 
 | |
| .. _dtd-handler-objects:
 | |
| 
 | |
| DTDHandler Objects
 | |
| ------------------
 | |
| 
 | |
| :class:`DTDHandler` instances provide the following methods:
 | |
| 
 | |
| 
 | |
| .. method:: DTDHandler.notationDecl(name, publicId, systemId)
 | |
| 
 | |
|    Handle a notation declaration event.
 | |
| 
 | |
| 
 | |
| .. method:: DTDHandler.unparsedEntityDecl(name, publicId, systemId, ndata)
 | |
| 
 | |
|    Handle an unparsed entity declaration event.
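A minimal sketch of a DTD handler follows, assuming the default expat-based reader forwards notation and unparsed entity declarations from the internal DTD subset. The class name ``DeclarationRecorder`` and the sample declarations are illustrative only.

```python
import xml.sax
from xml.sax.handler import DTDHandler


class DeclarationRecorder(DTDHandler):
    """Hypothetical handler that records DTD declarations."""

    def __init__(self):
        self.notations = []
        self.unparsed = []

    def notationDecl(self, name, publicId, systemId):
        self.notations.append((name, publicId, systemId))

    def unparsedEntityDecl(self, name, publicId, systemId, ndata):
        self.unparsed.append((name, publicId, systemId, ndata))


recorder = DeclarationRecorder()
parser = xml.sax.make_parser()
parser.setDTDHandler(recorder)
parser.feed(
    "<!DOCTYPE doc [\n"
    '  <!NOTATION gif SYSTEM "image/gif">\n'
    '  <!ENTITY logo SYSTEM "logo.gif" NDATA gif>\n'
    "]>\n"
    "<doc/>"
)
parser.close()
print(recorder.notations, recorder.unparsed)
```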
 | |
| 
 | |
| 
 | |
| .. _entity-resolver-objects:
 | |
| 
 | |
| EntityResolver Objects
 | |
| ----------------------
 | |
| 
 | |
| 
 | |
| .. method:: EntityResolver.resolveEntity(publicId, systemId)
 | |
| 
 | |
|    Resolve the system identifier of an entity and return either the system
 | |
|    identifier to read from as a string, or an InputSource to read from. The default
 | |
|    implementation returns *systemId*.
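As a sketch of the second option, the resolver below returns an ``InputSource`` backed by an in-memory byte stream instead of letting the parser open the system identifier itself. The class name ``InMemoryResolver``, the entity name and the file name ``chapter.xml`` are assumptions for illustration. The example enables ``feature_external_ges`` explicitly, because the reader may not process external general entities unless that feature is turned on.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource


class InMemoryResolver(EntityResolver):
    """Hypothetical resolver that serves entities from a dictionary."""

    def __init__(self, documents):
        self.documents = documents

    def resolveEntity(self, publicId, systemId):
        source = InputSource()
        source.setByteStream(io.BytesIO(self.documents[systemId]))
        return source


class NameRecorder(ContentHandler):
    """Hypothetical handler that records element names in order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


name_recorder = NameRecorder()
parser = xml.sax.make_parser()
parser.setFeature(feature_external_ges, True)
parser.setEntityResolver(InMemoryResolver({"chapter.xml": b"<chapter/>"}))
parser.setContentHandler(name_recorder)
parser.feed("<!DOCTYPE doc [\n")
parser.feed('  <!ENTITY chapter SYSTEM "chapter.xml">\n')
parser.feed("]>\n")
parser.feed("<doc>&chapter;</doc>")
parser.close()
print(name_recorder.names)
```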
 | |
| 
 | |
| 
 | |
| .. _sax-error-handler:
 | |
| 
 | |
| ErrorHandler Objects
 | |
| --------------------
 | |
| 
 | |
| Objects with this interface are used to receive error and warning information
 | |
| from the :class:`XMLReader`.  If you create an object that implements this
 | |
| interface and register it with your :class:`XMLReader`, the parser
 | |
| will call the methods in your object to report all warnings and errors. There
 | |
| are three levels of errors available: warnings, (possibly) recoverable errors,
 | |
| and unrecoverable errors.  All methods take a :exc:`SAXParseException` as the
 | |
| only parameter.  Errors and warnings may be converted to an exception by raising
 | |
| the passed-in exception object.
 | |
| 
 | |
| 
 | |
| .. method:: ErrorHandler.error(exception)
 | |
| 
 | |
|    Called when the parser encounters a recoverable error.  If this method does not
 | |
|    raise an exception, parsing may continue, but further document information
 | |
|    should not be expected by the application.  Allowing the parser to continue may
 | |
|    allow additional errors to be discovered in the input document.
 | |
| 
 | |
| 
 | |
| .. method:: ErrorHandler.fatalError(exception)
 | |
| 
 | |
|    Called when the parser encounters an error it cannot recover from; parsing is
 | |
|    expected to terminate when this method returns.
 | |
| 
 | |
| 
 | |
| .. method:: ErrorHandler.warning(exception)
 | |
| 
 | |
|    Called when the parser presents minor warning information to the application.
 | |
|    Parsing is expected to continue when this method returns, and document
 | |
|    information will continue to be passed to the application. Raising an exception
 | |
|    in this method will cause parsing to end.
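The sketch below shows one way an application might implement this interface: warnings and recoverable errors are collected, while unrecoverable errors are recorded and then re-raised so that parsing stops. The class name ``CollectingErrorHandler`` is an assumption, and which of the three methods a given parser actually calls depends on the parser.

```python
import xml.sax
from xml.sax import SAXParseException
from xml.sax.handler import ContentHandler, ErrorHandler


class CollectingErrorHandler(ErrorHandler):
    """Hypothetical handler that records problems by severity."""

    def __init__(self):
        self.warnings = []
        self.errors = []
        self.fatal = []

    def warning(self, exception):
        self.warnings.append(exception)

    def error(self, exception):
        # Not raising here lets the parser try to continue.
        self.errors.append(exception)

    def fatalError(self, exception):
        self.fatal.append((exception.getLineNumber(), exception.getMessage()))
        raise exception


error_handler = CollectingErrorHandler()
failed = False
try:
    # The mismatched end tag makes this document not well-formed.
    xml.sax.parseString(b"<root><unclosed></root>", ContentHandler(), error_handler)
except SAXParseException:
    failed = True
print(failed, error_handler.fatal)
```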
 | |
| 
 |