More than a decade ago, Gilbert Baumann started writing the Closure web browser. It includes a great HTML parser, written entirely in Lisp.
Released today, Closure HTML is a stand-alone version of the parser.
It supports HTML 4, understands malformed HTML, and can (optionally) be used in conjunction with Closure XML and its data structures.
An easy way to get started with Closure HTML itself is its LHTML builder, which represents HTML elements as simple Lisp lists.
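As a minimal sketch of the LHTML route, the snippet below parses a small fragment into lists and serializes it back to a string. It assumes the library can be loaded through Quicklisp as `closure-html` (an assumption about your setup, not part of this release) and that its package is reachable under the `chtml` nickname.

```lisp
;; Assumption: Quicklisp is installed and knows the system as :closure-html.
(ql:quickload :closure-html)

;; Parse a deliberately incomplete fragment into LHTML.
;; Each element comes back as a list of the form (name attributes child...).
(defparameter *lhtml*
  (chtml:parse "<p>nada</p>" (chtml:make-lhtml-builder)))

;; Turn the list structure back into an HTML string.
(chtml:serialize-lhtml *lhtml* (chtml:make-string-sink))
```

Because the parser tolerates malformed input, the resulting tree should be a complete document with the missing html, head, and body elements filled in around the paragraph.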
Together, the two parsers can be used to turn HTML into XHTML or vice versa, and in particular to parse HTML into DOM or STP. Even for users who only parse and work with XHTML internally, the new code can be useful to emit normal HTML 4 as the last step of processing.
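The combinations described above can be sketched roughly as follows. The idea is that either parser can feed its events to the other library's builder or serializer. The system names passed to Quicklisp and the sample strings are assumptions made for illustration; check the function names against the documentation of the versions you have installed.

```lisp
;; Assumption: all three systems are available through Quicklisp.
(ql:quickload '(:closure-html :cxml :cxml-stp))

;; HTML in, XHTML out: send the HTML parser's events to Closure XML's sink.
(chtml:parse "<p>nada</p>" (cxml:make-string-sink))

;; HTML in, DOM document out.
(chtml:parse "<p>nada</p>" (cxml-dom:make-dom-builder))

;; HTML in, STP document out.
(chtml:parse "<p>nada</p>" (cxml-stp:make-builder))

;; XHTML in, HTML 4 out: send the XML parser's events to Closure HTML's sink.
(cxml:parse
 "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body><p>nada</p></body></html>"
 (chtml:make-string-sink))
```

The last form is the "emit HTML 4 as the last step" case: everything upstream can stay in XHTML, and only the final serialization changes.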