What is MARCCompare/RobertCompare?
Very rarely do I create programs for individuals to meet very specific user needs. I’ve always taken the approach with MarcEdit that tools should be generalizable, and not tied to a specific individual or project. RobertCompare was different. The tool was created to support Mr. Robert (Bob) Ellett’s (ALA Tribute, Link to Dissertation Record in WorldCat) research for his Ph.D. dissertation, and only after the dissertation was complete was the tool generalized for wider use.
When I moved MarcEdit from the 4.x to the 5.x codebase, I dropped this utility because it seemed to have run its course. This was something Bob would periodically give me a hard time about; I think that he liked the idea of RobertCompare kicking around. Of course, the program was terribly complicated, and without folks asking for it, converting the code from assembly to C# just wasn’t a high priority.
Well, that changed last year when Bob suddenly passed away. I liked Bob a lot; he was immeasurably kind and easy to get along with. After his passing, I decided I wanted to bring RobertCompare back. I wanted to do something to remember my friend. It’s taken a lot more time than I’d hoped, in part due to a move, a job change, and the complexity of the code. However, after an extended absence, RobertCompare is being reintroduced into MarcEdit with MarcEdit 6.0.
The original version of RobertCompare was designed to answer a very specific set of questions. The program didn’t just look for differences between records, but rather, utilizing a probability engine, made determinations regarding the types of changes that had been made in the records. Bob’s research centered around the use of PCC records at non-PCC libraries, and he was particularly interested in the types of changes these libraries were making to the records when downloading them for use. The original version of RobertCompare was very good at analyzing record sets and generating a change history based on the current state of the records. But the program was incredibly complicated and slow…really, really slow.
To make this tool more general-purpose, I’ve removed much of the code centered around the probability matrix, and instead created a tool that uses a differencing (diff) algorithm to generate an output file that graphically represents the changes between MARC files. The output file is HTML and, at this point, pretty simple, but it has been created in a way that should let me add functionality if this tool proves to have utility within the community.
So what does it look like? The program is pretty straightforward. There is a home menu where you identify the two files that you want to compare, and then a place to designate a save file path.
Figure 1: MARCCompare/RobertCompare main window
The program can take MARC files and mnemonic files and compare them to determine what changes have been made to each record. At this point, the files to be compared need to be in the same order: the first record in one file is compared against the first record in the other, and so on. This has been done primarily for performance reasons, as it allows the program to chew through very large files very quickly, which was what I was looking for as part of this re-release.
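To make the same-order requirement concrete, here is a minimal Python sketch of positional pairing. This is not MarcEdit’s code (MarcEdit is written in C#); it assumes mnemonic-format text in which each field sits on its own line and records are separated by blank lines, and the function names `split_records` and `pair_records` are made up for illustration.

```python
from itertools import zip_longest


def split_records(mnemonic_text):
    """Split mnemonic-format text into records (each a list of field lines).

    Assumption: records are separated by one or more blank lines.
    """
    records, current = [], []
    for line in mnemonic_text.splitlines():
        if line.strip():
            current.append(line.rstrip())
        elif current:
            # A blank line closes the record being collected.
            records.append(current)
            current = []
    if current:
        records.append(current)
    return records


def pair_records(text_a, text_b):
    """Pair the Nth record of one file with the Nth record of the other.

    No matching on control numbers is attempted, which is why both files
    must be in the same order. A missing partner is an empty record.
    """
    recs_a = split_records(text_a)
    recs_b = split_records(text_b)
    return [(a or [], b or []) for a, b in zip_longest(recs_a, recs_b)]
```

Because the pairing is purely positional, each file only needs to be read once from top to bottom, which is where the speed on large files comes from. The trade-off is that one inserted or missing record shifts every pairing after it.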
As noted above, the output has changed. Rather than breaking changes down into categories in an attempt to determine whether they were updated fields, new fields, or deleted field data, the program now just notes additions/changes and deletions to each record and outputs this as an HTML report. Figure 2 shows a sample of what the report might look like (the format is slightly fluid and still changing).
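As a rough illustration of the additions/deletions approach, the following Python sketch compares two records field by field and writes simple HTML. It is only a sketch under stated assumptions: it uses Python’s standard `difflib` rather than whatever MarcEdit does internally, and the function names and CSS class names (`same`, `added`, `deleted`) are hypothetical, not the actual report format.

```python
import difflib
from html import escape


def diff_record(old_fields, new_fields):
    """Return (status, field) pairs: 'same', 'added', or 'deleted'.

    A changed field is reported as a deletion of the old line plus an
    addition of the new one; no attempt is made to classify the change.
    """
    matcher = difflib.SequenceMatcher(a=old_fields, b=new_fields, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            changes.extend(("same", f) for f in old_fields[i1:i2])
        else:
            # 'delete', 'insert', and 'replace' all reduce to these two lists;
            # for 'insert' the old slice is empty, for 'delete' the new one is.
            changes.extend(("deleted", f) for f in old_fields[i1:i2])
            changes.extend(("added", f) for f in new_fields[j1:j2])
    return changes


def render_html(changes):
    """Render the change list as a bare-bones HTML page (hypothetical layout)."""
    rows = [
        '<div class="{}">{}</div>'.format(status, escape(field))
        for status, field in changes
    ]
    return "<html><body>\n" + "\n".join(rows) + "\n</body></html>"
```

A stylesheet could then color the `added` and `deleted` rows so the differences stand out at a glance.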
Figure 2: MARCCompare/RobertCompare output
While I’m not sure that RobertCompare was ever widely used by the MarcEdit community, I do know that it had its champions. Over the past year, I’ve heard from a handful of users asking about the tool, and letting me know that they still have MarcEdit 4.0 on their systems specifically to utilize this program. Hopefully by adding this tool back into MarcEdit, they will finally be able to send MarcEdit 4.x into retirement and jump to the current version of the application. For me personally, working on this tool again was a chance to remember a very good man, and finish something that I know probably would have given him a good laugh.