Sax
SAX (the "Simple API for XML") is an event-driven interface for parsing XML, available in Python as the xml.sax package. The parser reads the document sequentially and calls your handler for each element as it is encountered.
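MiniDom sucks up an entire XML file, holds it in memory as a tree, and lets you work with it. SAX, on the other hand, emits events (start of an element, end of an element, character data) as it goes step by step through the file, so it never needs to hold the whole document in memory.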
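For contrast, here is a minimal sketch of the MiniDom approach: the whole document is parsed into a tree first, and only then queried. The SAMPLE string is a hypothetical document made up for illustration.

```python
from xml.dom import minidom

# Hypothetical sample document, used only for illustration.
SAMPLE = '<svg width="100" height="50"><rect id="a"/></svg>'

# The entire document is parsed into an in-memory tree up front.
doc = minidom.parseString(SAMPLE)

# After parsing, any part of the tree can be visited in any order.
root = doc.documentElement
width = root.getAttribute("width")
rect_ids = [r.getAttribute("id") for r in doc.getElementsByTagName("rect")]

print(width, rect_ids)
```

This is convenient for small files, but memory use grows with the size of the document, which is the cost SAX avoids.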
NOTE: A similarly fast but much simpler way to extract information from an XML document in an event-driven, memory-efficient fashion is ElementTree.iterparse() (xml.etree.ElementTree.iterparse() in the standard library).
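A minimal sketch of the iterparse() approach, assuming a small hypothetical document (SAMPLE) and a made-up helper name (rect_ids). Elements are handed over as soon as their closing tag has been read, and can be cleared afterwards to keep memory use low.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
SAMPLE = b"""<svg width="100" height="50">
  <rect id="a"/>
  <rect id="b"/>
</svg>"""

def rect_ids(source):
    """Yield the id attribute of each <rect> once its end tag is parsed."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "rect":
            yield elem.get("id")
            elem.clear()  # drop the element's contents once handled

ids = list(rect_ids(io.BytesIO(SAMPLE)))
print(ids)
```

With a real file, a filename or an open binary file object can be passed in place of the BytesIO wrapper.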
Example
import xml.sax

class InkscapeSvgHandler(xml.sax.ContentHandler):
    def startElement(self, name, attrs):
        # Called once for every opening tag; print the attributes of <svg>.
        if name == "svg":
            for k, v in attrs.items():
                print(k + " " + v)

parser = xml.sax.make_parser()
parser.setContentHandler(InkscapeSvgHandler())
# Open in binary mode so the parser can detect the encoding itself.
with open("svg.xml", "rb") as f:
    parser.parse(f)
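The handler above reacts only to opening tags. To see the order in which SAX delivers events, a handler can also override endElement and characters. The following is a sketch only: the EventLogger class and the SAMPLE document are made up for illustration, and xml.sax.parseString is used so that no file is needed.

```python
import xml.sax

class EventLogger(xml.sax.ContentHandler):
    """Record each SAX event in the order the parser reports it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # Text may arrive in several chunks; skip whitespace-only ones.
        if content.strip():
            self.events.append(("text", content.strip()))

# Hypothetical sample document, used only for illustration.
SAMPLE = b'<svg width="100"><title>Demo</title></svg>'

handler = EventLogger()
xml.sax.parseString(SAMPLE, handler)

for event in handler.events:
    print(event)
```

Events arrive in document order: the start of svg, the start of title, its text, the end of title, and finally the end of svg. Because a parser is allowed to split text across several characters() calls, real handlers usually accumulate text until the matching endElement.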
Links
HtmlParser -- similar module, tailored to HTML interpretation
Python Library Reference, xml.sax -- API documentation
Python XML FAQ and How-to -- describes sax & MiniDom
SAX: The Simple API for XML -- wordy tutorial
Charming Python: Revisiting XML tools for Python -- kind of old