MarcEdit and Solr

By reeset / In MarcEdit

With the preconference at Code4lib coming up, folks are looking for ways to get their MARC data into a format that Solr can load.  Andrew Nagy has made an XSLT that converts MARCXML data to a Solr format, and David Bigwood notes that MarcEdit can be used to generate those MARCXML records.  This is true, but you could also generate the Solr records directly.  Basically, you just need to register the crosswalk with MarcEdit, and then you can process items directly from MARC into the Solr format.  I’d left some instructions as a comment on David’s page, but for those who might not see it there and would find it helpful, I’ve reproduced my comment below.



Actually, you could use MarcEdit to go straight from MARC to the Solr syntax, though you’d want to modify the posted stylesheet to include the marc: namespace. This way, the tool could process files with or without that namespace.
The way to make it work is simply to register the crosswalk with MarcEdit. Since some folks aren’t sure how this works, I’ve recorded a quick AVI file of what that looks like. See: Adding and Using MARC=>Solr crosswalk. BTW, the AVI file is ~29 MB.
I also modified the crosswalk that you’d linked to so that it works better in MarcEdit. Since MarcEdit uses the marc namespace by default, XSLT stylesheets work best in MarcEdit if they include the namespace. This way, MarcEdit can process items with or without the namespace. Here’s the stylesheet with the revisions made (BTW, this is the stylesheet I used in my example): Modified MARC21XML=>Solr XSLT
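
For anyone who wants to see the namespace issue outside of XSLT, here is a rough Python sketch of the same idea: match MARCXML elements by their local name so that records with or without the marc: prefix are handled the same way, then write them out in Solr's add/doc/field XML. This is not the stylesheet linked above and not how MarcEdit does it internally. The function names, the sample records, and the Solr field names (id, title) are all made up for illustration; your Solr schema decides the real field names.

```python
import xml.etree.ElementTree as ET

# Sample records (hypothetical data): one without a namespace, one using the
# marc: prefix bound to the standard MARCXML namespace URI.
PLAIN = (
    '<collection><record>'
    '<controlfield tag="001">123</controlfield>'
    '<datafield tag="245" ind1="1" ind2="0">'
    '<subfield code="a">Solr basics /</subfield>'
    '</datafield></record></collection>'
)
NAMESPACED = (
    '<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim">'
    '<marc:record>'
    '<marc:controlfield tag="001">123</marc:controlfield>'
    '<marc:datafield tag="245" ind1="1" ind2="0">'
    '<marc:subfield code="a">Solr basics /</marc:subfield>'
    '</marc:datafield></marc:record></marc:collection>'
)


def local_name(tag):
    """Strip the '{namespace-uri}' prefix ElementTree puts on namespaced tags."""
    return tag.rsplit("}", 1)[-1]


def marc_to_solr(marcxml):
    """Build a Solr <add> document from MARCXML, namespaced or not.

    Only 001 -> id and 245$a/$b -> title are mapped; both target field
    names are assumptions, not taken from any real Solr schema.
    """
    add = ET.Element("add")
    for rec in ET.fromstring(marcxml).iter():
        if local_name(rec.tag) != "record":
            continue
        doc = ET.SubElement(add, "doc")
        for field in rec:
            name, tag = local_name(field.tag), field.get("tag")
            if name == "controlfield" and tag == "001":
                ET.SubElement(doc, "field", name="id").text = (field.text or "").strip()
            elif name == "datafield" and tag == "245":
                parts = [sf.text.strip() for sf in field
                         if sf.text and sf.get("code") in ("a", "b")]
                ET.SubElement(doc, "field", name="title").text = " ".join(parts)
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(marc_to_solr(PLAIN))
    print(marc_to_solr(NAMESPACED))
```

Both sample inputs should produce the same Solr document, which is the point of making the crosswalk namespace-aware: the source file can come either way and the output doesn't change.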