mirror of
https://github.com/python/cpython.git
synced 2025-08-01 15:43:13 +00:00

This changes the main documentation, doc strings, source code comments, and a couple error messages in the test suite. In some cases the word was removed or edited some other way to fix the grammar.
128 lines
4.6 KiB
ReStructuredText
128 lines
4.6 KiB
ReStructuredText
:mod:`xml.dom.pulldom` --- Support for building partial DOM trees
|
|
=================================================================
|
|
|
|
.. module:: xml.dom.pulldom
|
|
:synopsis: Support for building partial DOM trees from SAX events.
|
|
.. moduleauthor:: Paul Prescod <paul@prescod.net>
|
|
|
|
**Source code:** :source:`Lib/xml/dom/pulldom.py`
|
|
|
|
--------------
|
|
|
|
The :mod:`xml.dom.pulldom` module provides a "pull parser" which can also be
|
|
asked to produce DOM-accessible fragments of the document where necessary. The
|
|
basic concept involves pulling "events" from a stream of incoming XML and
|
|
processing them. In contrast to SAX which also employs an event-driven
|
|
processing model together with callbacks, the user of a pull parser is
|
|
responsible for explicitly pulling events from the stream, looping over those
|
|
events until either processing is finished or an error condition occurs.
|
|
|
|
|
|
.. warning::
|
|
|
|
The :mod:`xml.dom.pulldom` module is not secure against
|
|
maliciously constructed data. If you need to parse untrusted or
|
|
unauthenticated data see :ref:`xml-vulnerabilities`.
|
|
|
|
|
|
Example::
|
|
|
|
from xml.dom import pulldom
|
|
|
|
doc = pulldom.parse('sales_items.xml')
|
|
for event, node in doc:
|
|
if event == pulldom.START_ELEMENT and node.tagName == 'item':
|
|
if int(node.getAttribute('price')) > 50:
|
|
doc.expandNode(node)
|
|
print(node.toxml())
|
|
|
|
``event`` is a constant and can be one of:
|
|
|
|
* :data:`START_ELEMENT`
|
|
* :data:`END_ELEMENT`
|
|
* :data:`COMMENT`
|
|
* :data:`START_DOCUMENT`
|
|
* :data:`END_DOCUMENT`
|
|
* :data:`CHARACTERS`
|
|
* :data:`PROCESSING_INSTRUCTION`
|
|
* :data:`IGNORABLE_WHITESPACE`
|
|
|
|
``node`` is an object of type :class:`xml.dom.minidom.Document`,
|
|
:class:`xml.dom.minidom.Element` or :class:`xml.dom.minidom.Text`.
|
|
|
|
Since the document is treated as a "flat" stream of events, the document "tree"
|
|
is implicitly traversed and the desired elements are found regardless of their
|
|
depth in the tree. In other words, one does not need to consider hierarchical
|
|
issues such as recursive searching of the document nodes, although if the
|
|
context of elements were important, one would either need to maintain some
|
|
context-related state (i.e. remembering where one is in the document at any
|
|
given point) or to make use of the :func:`DOMEventStream.expandNode` method
|
|
and switch to DOM-related processing.
|
|
|
|
|
|
.. class:: PullDom(documentFactory=None)
|
|
|
|
Subclass of :class:`xml.sax.handler.ContentHandler`.
|
|
|
|
|
|
.. class:: SAX2DOM(documentFactory=None)
|
|
|
|
Subclass of :class:`xml.sax.handler.ContentHandler`.
|
|
|
|
|
|
.. function:: parse(stream_or_string, parser=None, bufsize=None)
|
|
|
|
Return a :class:`DOMEventStream` from the given input. *stream_or_string* may be
|
|
either a file name, or a file-like object. *parser*, if given, must be a
|
|
:class:`~xml.sax.xmlreader.XMLReader` object. This function will change the
|
|
document handler of the
|
|
parser and activate namespace support; other parser configuration (like
|
|
setting an entity resolver) must have been done in advance.
|
|
|
|
If you have XML in a string, you can use the :func:`parseString` function instead:
|
|
|
|
.. function:: parseString(string, parser=None)
|
|
|
|
Return a :class:`DOMEventStream` that represents the (Unicode) *string*.
|
|
|
|
.. data:: default_bufsize
|
|
|
|
Default value for the *bufsize* parameter to :func:`parse`.
|
|
|
|
The value of this variable can be changed before calling :func:`parse` and
|
|
the new value will take effect.
|
|
|
|
.. _domeventstream-objects:
|
|
|
|
DOMEventStream Objects
|
|
----------------------
|
|
|
|
.. class:: DOMEventStream(stream, parser, bufsize)
|
|
|
|
|
|
.. method:: getEvent()
|
|
|
|
Return a tuple containing *event* and the current *node* as
|
|
:class:`xml.dom.minidom.Document` if event equals :data:`START_DOCUMENT`,
|
|
:class:`xml.dom.minidom.Element` if event equals :data:`START_ELEMENT` or
|
|
:data:`END_ELEMENT` or :class:`xml.dom.minidom.Text` if event equals
|
|
:data:`CHARACTERS`.
|
|
The current node does not contain informations about its children, unless
|
|
:func:`expandNode` is called.
|
|
|
|
.. method:: expandNode(node)
|
|
|
|
Expands all children of *node* into *node*. Example::
|
|
|
|
xml = '<html><title>Foo</title> <p>Some text <div>and more</div></p> </html>'
|
|
doc = pulldom.parseString(xml)
|
|
for event, node in doc:
|
|
if event == pulldom.START_ELEMENT and node.tagName == 'p':
|
|
# Following statement only prints '<p/>'
|
|
print(node.toxml())
|
|
doc.exandNode(node)
|
|
# Following statement prints node with all its children '<p>Some text <div>and more</div></p>'
|
|
print(node.toxml())
|
|
|
|
.. method:: DOMEventStream.reset()
|
|
|