Quick and dirty script for converting from NXT/AMI format to SRI's .ref format (similar to .stm)

This script was originally written in late 2005, when I thought the AMI meetings were going to be part of the training set. It was abandoned when I realized the meetings were not yet public. I've cleaned it up only slightly, so beware of things like hard-coded paths and SRI/ICSI-specific assumptions. I have not included a dictionary: I originally used the SRI dictionary, and I'm not sure whether I can redistribute it. The dictionary is only used to resolve some hyphenation rules, and it's just a simple list of words, so it should be easy to generate one for your own system.
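
Since the dictionary is just a simple list of words, one way to build a substitute is to collect the distinct words from text your own system already uses. The following is a minimal sketch in Python, not part of this distribution. The function name build_vocab, the definition of a "word", and the choice to lowercase everything are all assumptions; check nxt2ref.pl for the case and format it actually expects.

```python
import re

# Assumption: a "word" is a run of letters, optionally joined by internal
# apostrophes or hyphens (e.g. "it's", "hard-coded"). Adjust as needed.
WORD_RE = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")


def build_vocab(lines):
    """Return the sorted, de-duplicated, lowercased words found in lines."""
    words = set()
    for line in lines:
        # Lowercasing is an assumption; drop .lower() for a case-sensitive list.
        words.update(w.lower() for w in WORD_RE.findall(line))
    return sorted(words)
```

Writing the returned list to a file, one word per line, gives something that could be passed with -d, assuming the script reads one word per line. Whether hyphenated compounds should be kept whole or split depends on how your system tokenizes.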

nxt2ref.pl
Perl script to do the conversion. See the file itself for comments and usage. It requires XML::Parser, available from CPAN. Sample usage:
nxt2ref.pl -m ami.map -d meetings-2005.vocab ES2007b.C > ES2007b.C.ref

ami.map
A simple word mapping file, consisting mostly of British to American English conversions and some SRI/ICSI-specific conventions.