Several years ago, I started working on the XML parser that Gilbert Baumann had written as part of his web browser called "Closure". Since then, "Closure XML" (or cxml for short) has developed into a set of little libraries, with one main goal being completeness and correctness with regard to the various standards they follow.
And standards abound in XML land, which is nice for implementors (thanks to the good test suites!) and nice for users (because the specs partially serve as documentation, and make it easy to transition between different languages implementing them). But I've always tried to release cxml with enough documentation to get users started for all the parts that are implementation-specific, because not all areas are covered by standards. Of course, the document format itself is specified strictly; the same goes for XPath, XSLT, schemas, etc.
SAX is a classic Java API. It defines a protocol of methods that get called by an XML parser, and each method call signifies an event (e.g. that the parser saw an XML tag). In cxml, SAX is one of two fundamental APIs offered (the other being a StAX-like pull-based interface), and it's essential to its inner workings. Yet I had never bothered to document it fully. For one thing, everyone seemed to know SAX from Java anyway. It's also hidden from view for most users. And ultimately, it's just a list of generic functions, right?
Technically it is just that, and yet it's central to communication between cxml's libraries, and it makes parsing and serialization in cxml modular and reusable. Hence some users had long suggested to me that I should explain SAX in full.
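For readers who haven't met SAX before, here is a minimal sketch of what "a method call per event" looks like. It uses Python's standard xml.sax module rather than cxml, so the method names (startElement, endElement, characters) are Python's and not cxml's generic functions; EventLogger and collect_events are names made up for this illustration.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class EventLogger(ContentHandler):
    """Hypothetical handler that records each SAX event it receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Called by the parser when it sees an opening tag.
        self.events.append(("start", name, dict(attrs)))

    def endElement(self, name):
        # Called by the parser when it sees the matching closing tag.
        self.events.append(("end", name))

    def characters(self, content):
        # Called for character data; a parser may split one run of
        # text across several calls, so handlers should not assume
        # they receive it in a single piece.
        if content.strip():
            self.events.append(("text", content))


def collect_events(xml_text):
    """Parse xml_text and return the list of events the handler saw."""
    handler = EventLogger()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


if __name__ == "__main__":
    for event in collect_events('<greeting lang="en">hello</greeting>'):
        print(event)
```

The parser knows nothing about the handler beyond the methods it calls on it. That is roughly why the protocol lends itself to modularity: anything that accepts these events can sit behind a parser, and anything that emits them can sit in front of a serializer.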
So here it is: The SAX overview.
TL;DR: Skip to the link above.