Parser snippets

On the oft-informative IRC channel #code4lib, jrochkind mentioned working with Parslet, a new Ruby parser framework.  It is based on parsing expression grammars (PEGs), essentially a notation for language specification.  You can think of it as the kinder, gentler Extended Backus-Naur Form.  Not as complete, but possibly more useful.
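To give a rough feel for what a PEG looks like in practice, here is a minimal sketch in plain Python.  It is not Parslet's API, and every name in it (`regex`, `seq`, `choice`, `many`, `parse_list`) is made up for illustration.  It implements the toy grammar `list <- item (',' item)*` and `item <- number / word`, where `/` is the PEG ordered choice: alternatives are tried in order and the first match wins.

```python
import re

# A parser here is a function (text, pos) -> (value, new_pos), or None on failure.
# All names below are hypothetical helpers, not part of any real library.

def regex(pattern):
    """Match a regular expression at the current position."""
    compiled = re.compile(pattern)
    def parse(text, pos):
        m = compiled.match(text, pos)
        return (m.group(0), m.end()) if m else None
    return parse

def seq(*parsers):
    """Match every parser in order; fail if any one fails."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def choice(*parsers):
    """PEG ordered choice: return the first alternative that matches."""
    def parse(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse

def many(parser):
    """Match zero or more repetitions; never fails."""
    def parse(text, pos):
        values = []
        while True:
            result = parser(text, pos)
            if result is None or result[1] == pos:  # stop on failure or no progress
                return values, pos
            value, pos = result
            values.append(value)
    return parse

# Grammar:  list <- item (',' item)*
#           item <- number / word
number = regex(r"\d+")
word = regex(r"[a-z]+")
comma = regex(r",")
item = choice(number, word)
list_rule = seq(item, many(seq(comma, item)))

def parse_list(text):
    """Return the list of items, or None if the whole input does not match."""
    result = list_rule(text, 0)
    if result is None or result[1] != len(text):
        return None
    (first, rest), _ = result
    return [first] + [pair[1] for pair in rest]

print(parse_list("12,abc,7"))
print(parse_list("12,,7"))
```

The grammar itself is just the last few assignments; everything above them is reusable plumbing, which is roughly the appeal of frameworks in this style.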

Then the newly core Perl 6 rules were mentioned as another widely available basis for constructing parsers.  It seems commonly used languages have come a long way toward the functionality once available mainly from dedicated tools like ANTLR.

Armed with tools like these and adequately specified, er… specifications, it is possible to imagine that standards bodies could produce new reference implementations for versions of important specs like EDIFACT, SIP2 (or SIP3)… even MARC21.  That is, we could have an additional framework or validator for these otherwise arcane formats, one that is authoritative.  And not one that runs on top of a FoxPro database in Windows NT.  Oh, sweet sweet reference implementation…

If you’re interested in how these tools might be used in libraries soon, hop on the channel and ask about the new parsing in Blacklight.

About Joe Atzberger

Joe Atzberger (atz) is a library hacker in Palo Alto, CA. He worked with Galen at both LibLime and Equinox Software, Inc. as an open source developer on Koha and Evergreen. Joe currently works on Hydra and institutional digital repository infrastructure at Stanford.
