Big XML files, REXML and learning about stream parsers

After taking the easy route and building some XML-checking test scripts using Ruby and REXML’s DOM access, I decided that I really didn’t want my computer grinding to a halt for a whole day while it parsed a gig and a half of XML. So it was time to try a streaming parser. Unfortunately, the REXML website seemed to be unavailable, which led me to this very nice tutorial on Jan Vereecken’s blog:

http://www.janvereecken.com/2007/4/11/event-driven-xml-parser-in-ruby

I’m pretty sure it’s nicer than the one on the REXML site, but I will have to wait and see.
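For anyone wondering what the switch from DOM to streaming looks like, here is a minimal sketch using REXML's stream listener interface. It is not the code from Jan's tutorial. The `ElementCounter` class and the tiny sample document are made up for illustration, and a real run would pass an open `File` instead of a `StringIO`. The idea is that the parser calls back into the listener as it reads, so the whole tree never has to sit in memory.

```ruby
require 'rexml/document'
require 'rexml/parsers/streamparser'
require 'rexml/streamlistener'
require 'stringio'

# Hypothetical listener: counts how often each element name appears.
# Including REXML::StreamListener supplies no-op versions of every
# callback, so only the ones we care about need to be overridden.
class ElementCounter
  include REXML::StreamListener

  attr_reader :counts

  def initialize
    @counts = Hash.new(0)
  end

  # Called once for every opening (or self-closing) tag.
  def tag_start(name, _attrs)
    @counts[name] += 1
  end
end

# Stand-in for the big file; for real data use File.open(path) instead.
xml = StringIO.new('<records><record id="1"/><record id="2"/></records>')

listener = ElementCounter.new
REXML::Document.parse_stream(xml, listener)

listener.counts.each { |name, n| puts "#{name}: #{n}" }
```

The trade-off is that the listener has to keep track of any context it needs itself (for example, which element it is currently inside), since there is no document tree to query afterwards.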

Anyway, thanks Jan!
