2007-03-28

Federal trojan horse

(Apologies for non-Lisp content.)

German police are going to install surveillance software on suspects' computers.

They are going to install it online through the Internet without anyone noticing, and of course they will do it without exploiting security vulnerabilities. Here is what the president of the Federal Criminal Police Office has to say about it (in German):

taz: How will the "Online-Search" of a computer work technically then?

Ziercke: Naturally I cannot discuss that publicly.

2007-03-10

clbuild on cygwin

Step 1: SBCL

Install SBCL using its Windows .msi installer.

Step 2: Cygwin and darcs

Since clbuild is a shell script, you need to install cygwin, even though SBCL itself does not depend on it.

Get it from cygwin.org. Make sure to select all packages that clbuild uses to download software. You will need at least cvs, subversion, and wget. In addition, you might want to install emacs (for slime) and X (for CLIM with the CLX backend).

[EDIT: Don't use emacs from cygwin, install the native Windows port of Emacs instead.--2007-06-24]

darcs is not included with cygwin, but a cygwin port exists, so download it manually from darcs.net and add it to your $PATH.

Step 3: clbuild

Cygwin support is new in my clbuild tree, so until another clbuild hacker merges those changes, fetch it from:

$ darcs get http://www.lichteblau.com/blubba/clbuild


Step 4: Bleeding edge

ASDF as included with SBCL 1.0.3 does not work with clbuild, so you need to replace it with a version including my patch for Windows shortcut support.

Download asdf.lisp and asdf.fasl and copy them into the asdf/ subdirectory of your SBCL installation, replacing the original versions. (diff)

[EDIT: the asdf.fasl linked there isn't up-to-date anymore, but asdf.lisp and asdf.diff are still there, including some fixes. Drop them into your source tree and recompile SBCL. --2007-06-24]

Run clbuild

That's it. Now just run clbuild:

$ cd clbuild
clbuild$ chmod +x clbuild
clbuild$ ./clbuild build

To run CLIM applications using the CLX backend, start an X server first and set $DISPLAY accordingly. (It appears to be necessary to specify an IP address in $DISPLAY so that CLX does not attempt a Unix domain socket connection.)

clbuild$ X&
clbuild$ export DISPLAY=127.0.0.1:0
clbuild$ ./clbuild listener


Optional: Gtkairo

To try CLIM's gtkairo backend instead, download GTK+ from gimp-win.sf.net. (For some reason, the installer is wrapped in a zip file.)

Add the bin directory of that GTK+ installation to your PATH and configure clbuild to use gtkairo:

clbuild$ export PATH="/cygdrive/c/Programme/gtk-2.10/bin:$PATH"
clbuild$ export CLIM_BACKEND=gtkairo
clbuild$ ./clbuild listener


ObScreenshot of the listener.

(Expect to find some gtkairo/Windows repainting bugs though.)

2007-03-03

Klacks parsing

Closure XML has been based on a SAX-like API for several years now (in addition to the DOM implementation on top of that). But although the pervasive use of SAX within CXML itself has been a success story, most users seem to prefer DOM usage over SAX handler hacks. Anyone who has ever parsed a non-trivial schema using SAX knows why: Maintaining separate start-element and end-element methods is very inconvenient. Code ends up dispatching on tag names using huge case forms while doing all bookkeeping manually in slots of the handler instance.

Starting with the current release of CXML, there is now a new parser interface called Klacks.

Similar to StAX, the new interface is more convenient than SAX, while still providing the same features as the old one, including validation.

Basically, the klacks parser can be used as a (rather sophisticated) tokenizer, and you get to write a recursive descent parser based on that.
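
As a rough sketch of that idea, here is a small recursive descent parser that turns a document into nested lists. The names parse-element and parse-tree are made up for this illustration, the asdf:load-system call assumes CXML is installed where ASDF can find it, and events other than elements and text are simply skipped.

```lisp
;; Assumption: CXML is installed where ASDF can find it.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (asdf:load-system :cxml))

;; Hypothetical helper: SOURCE must be positioned at a :start-element event.
;; Returns (local-name child...), where each child is a string or a nested list.
(defun parse-element (source)
  (let ((name (klacks:current-lname source))
        (children '()))
    (klacks:consume source)               ; step past the start tag
    (loop
      (multiple-value-bind (key data) (klacks:peek source)
        (case key
          (:start-element
           ;; The recursion mirrors the nesting of the document.
           (push (parse-element source) children))
          (:characters
           (push data children)
           (klacks:consume source))
          (:end-element
           (klacks:consume source)        ; step past the end tag
           (return (cons name (nreverse children))))
          ((nil)
           (error "Unexpected end of document inside <~A>" name))
          (t
           ;; Comments, processing instructions and the like are skipped.
           (klacks:consume source)))))))

;; Hypothetical entry point: parse XML-STRING into nested lists.
(defun parse-tree (xml-string)
  (klacks:with-open-source (source (cxml:make-source xml-string))
    ;; Discard events up to the root element's start tag.
    (klacks:find-event source :start-element)
    (parse-element source)))
```

Calling (parse-tree "<a>x<b>y</b></a>") should give a list whose first item is the root element's name, followed by its children. Note that a long run of text may arrive as several :characters events, so a real parser might want to concatenate adjacent strings.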

SAX and StAX are Java's protocols for XML parsing. They are sometimes described as low-level interfaces for "expert" use only (the suggested alternative being something like DOM), but their purpose is really to parse XML without building an in-memory representation.

Low-level or not, they are the right choice when parsing into application-defined data structures or when performing simple on-the-fly transformation of XML data as it is being read.

In SAX, an XML parser will process the entire document in one go, emitting events as it sees them. User code needs to implement its own handler class, with methods for the events it cares about. The SAX concept is known as "push-based".
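
To make the push model concrete, here is a minimal sketch of a handler that collects element names in document order. The class name-collector and the function element-names are invented for this example, and the asdf:load-system call assumes CXML is installed where ASDF can find it. Note how the state has to live in a slot of the handler instance.

```lisp
;; Assumption: CXML is installed where ASDF can find it.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (asdf:load-system :cxml))

;; Hypothetical handler class: remembers every element name it is told about.
(defclass name-collector (sax:default-handler)
  ((names :initform '() :accessor collected-names)))

;; The parser calls this method once for each start tag.
(defmethod sax:start-element
    ((handler name-collector) namespace-uri local-name qname attributes)
  (declare (ignore namespace-uri qname attributes))
  (push local-name (collected-names handler)))

;; The value returned here becomes the return value of CXML:PARSE.
(defmethod sax:end-document ((handler name-collector))
  (reverse (collected-names handler)))

;; Hypothetical entry point: the parser drives the handler, not the other
;; way around.
(defun element-names (xml-string)
  (cxml:parse xml-string (make-instance 'name-collector)))
```

For example, (element-names "<a><b/><c/></a>") runs the whole parse in one call and hands back whatever the end-document method returned.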

In contrast, the "pull-based" StAX parsing model is similar to working with an input stream. User code starts by creating an input stream object for the XML document, then reads events from that stream one by one. (Klacks uses the term source instead of stream, to avoid confusion with Common Lisp streams.)

API design choices. StAX distinguishes between a high-level API, which creates a Java object for each event, and the low-level API, which just returns an enum indicating the type of event, and has separate methods to access the current event's data.

Klacks has just one set of functions for both purposes, since it seemed more lispy to use multiple values. Instead of returning just a keyword indicating the event type, the main klacks functions always include useful event data as additional return values.
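
A small sketch of what that looks like in user code: the loop below peeks at each event, picks the additional return values apart depending on the event keyword, and then consumes the event. The function name summarize-events and the :open/:text result format are made up for this example, and the asdf:load-system call assumes CXML is installed where ASDF can find it.

```lisp
;; Assumption: CXML is installed where ASDF can find it.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (asdf:load-system :cxml))

;; Hypothetical helper: list the start tags and text chunks of XML-STRING.
(defun summarize-events (xml-string)
  (klacks:with-open-source (source (cxml:make-source xml-string))
    (loop
      ;; PEEK returns the event keyword first. What the remaining values
      ;; mean depends on that keyword.
      for (key value-1 value-2) = (multiple-value-list (klacks:peek source))
      while key
      when (eq key :start-element)
        ;; value-1 is the namespace URI, value-2 the local name.
        collect (list :open value-2)
      when (eq key :characters)
        ;; value-1 is the character data.
        collect (list :text value-1)
      do (klacks:consume source))))
```

Trying it on the document from the example below, as in (summarize-events "<example>text</example>"), yields one entry for the start tag and one for the text.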

Java's StAX also includes classes for XML serialization. No such extension was needed for CXML, since it already supports convenient serialization using SAX events. The with-element macro and related functions make generation of those events easy.
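
For illustration, a minimal serialization sketch using with-element might look like this. The function name example-document and the "lang" attribute are invented for the example, and the asdf:load-system call assumes CXML is installed where ASDF can find it.

```lisp
;; Assumption: CXML is installed where ASDF can find it.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (asdf:load-system :cxml))

;; Hypothetical helper: build a tiny document and return it as a string.
(defun example-document ()
  ;; WITH-XML-OUTPUT sends SAX events to the sink and returns the sink's
  ;; result, which for a string sink is the serialized document.
  (cxml:with-xml-output (cxml:make-string-sink)
    (cxml:with-element "example"
      (cxml:attribute "lang" "en")      ; attributes come before child content
      (cxml:text "text"))))
```

Since the sink is just another SAX handler, the same body could feed a DOM builder or any other handler instead of producing a string.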

Simple klacks parsing example:
* (defparameter *source* (cxml:make-source "<example>text</example>"))
* (klacks:peek-next *source*)
:START-DOCUMENT
* (klacks:peek-next *source*)
:START-ELEMENT
NIL                      ;namespace URI
"example"                ;local name
"example"                ;qualified name
* ...