MarcEdit PreUpdate Notes

Three significant changes will be coming as part of MarcEdit’s Sunday update. These impact the MARCEngine, Regular Expression Processing, and the Linked Data Platform.

MARCEngine Changes:

In May 2016, I had to change the MARCEngine to remove some of the convenience functions that allowed mnemonic processing on UTF8 data or HTML entity processing on MARC8 data. Neither of these should happen, and issues came up with records whose titles actually included HTML entities. So, this kind of processing had to be removed.

Over the past few months, I’ve been re-evaluating how these functions used to work, and have been bringing them back. Sunday will mark the reintroduction of many of these functions (though not the HTML entity translation when it’s not specifically appropriate). The upside is that, coupled with the new encoding work, the tool will be able to support more mixed-encoding use cases.

Regular Expression Processing

The most significant change to Regular Expression processing is how multi-line processing works in the Replace Function. To protect users, I’d set up the match-any-character token “.” to match every character except the newline character. This was done to keep users from accidentally deleting data. However, it meant that the multi-line processing option really only worked when the fields involved were side by side. By removing this limitation, users working with the multi-line option will now have access to the entire record with Regular Expression processing. And with that, a word of warning…be careful. Multi-line processing is the easiest way to accidentally delete bibliographic data through greedy matches.
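To make the warning concrete, here is a minimal sketch of the greedy-match problem. It uses Python's `re` module, not MarcEdit's own engine, and `re.DOTALL` stands in (as an assumption) for the new behavior where “.” also matches the newline character. The sample record and the function names are invented for illustration.

```python
import re

# A tiny record in mnemonic-style notation (sample data, made up for this sketch).
TITLE = r"=245  10$aExample title" + "\n"
NOTE = r"=500  \\$aFirst note" + "\n"
SUBJECT = r"=650  \0$aTopic" + "\n"
RECORD = TITLE + NOTE + SUBJECT


def delete_note_greedy(record: str) -> str:
    """Greedy '.*' with dot-matches-newline runs to the LAST newline in the record."""
    return re.sub(r"=500.*\n", "", record, flags=re.DOTALL)


def delete_note_lazy(record: str) -> str:
    """Lazy '.*?' stops at the FIRST newline, so only the 500 field is removed."""
    return re.sub(r"=500.*?\n", "", record, flags=re.DOTALL)


def delete_note_single_line(record: str) -> str:
    """Without DOTALL, '.' cannot cross a newline, so the match stays in one field."""
    return re.sub(r"=500.*\n", "", record)


if __name__ == "__main__":
    print(delete_note_greedy(RECORD))       # the 650 field is lost as well
    print(delete_note_lazy(RECORD))         # only the 500 field is removed
    print(delete_note_single_line(RECORD))  # only the 500 field is removed
```

The same expression that safely removed one field under the old behavior takes everything after it once “.” can cross line boundaries. Adding `?` to make the quantifier lazy, or matching `[^\n]*` instead of `.*`, keeps the match inside a single field.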

Additionally, I’ve added an option to the Replace Function dialog that makes the multi-line mode easier to discover. Right now, users have to know to add /m to the end of an expression to initiate multi-line mode. You can still do that, but for users who don’t know about it, a new option has been added to turn on the Multiline option (see below).

The MultiLine Evaluation will be enabled and selectable when the Use regular expressions option is checked.

Linked Data Platform Changes

The Linked Data Platform will be seeing two significant changes. The first change is occurring in the code to support linking fields like the 880. This has meant adding a new special processing instruction “linking”, which can now be used to perform reconciliation against data in these fields. This is particularly important for Asian languages, Arabic languages, Hebrew, etc.
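For readers unfamiliar with how an 880 field relates to its counterpart, the pairing is carried in subfield $6 as defined by MARC 21: a regular field holds something like `$6880-01`, and the matching 880 holds `$6245-01` (optionally followed by a script code such as `/$1`). The sketch below only illustrates that pairing in Python. It says nothing about how MarcEdit's “linking” instruction is implemented, and the function name and sample data are hypothetical.

```python
import re

# Subfield $6: linked tag, a hyphen, then a two-digit occurrence number.
LINK = re.compile(r"\$6(\d{3})-(\d{2})")


def pair_linked_fields(lines):
    """Pair each regular field with its 880 counterpart via subfield $6.

    `lines` are mnemonic-style field strings such as "=245  10$6880-01$a...".
    Returns a list of (regular_field, alternate_880_field) tuples.
    """
    regular = {}    # (own tag, occurrence) -> field line
    alternate = {}  # (linked tag, occurrence) -> 880 field line
    for line in lines:
        tag = line[1:4]
        match = LINK.search(line)
        if match is None:
            continue  # field carries no linkage
        linked_tag, occurrence = match.groups()
        if occurrence == "00":
            continue  # occurrence 00 means there is no associated field
        if tag == "880":
            alternate[(linked_tag, occurrence)] = line
        elif linked_tag == "880":
            regular[(tag, occurrence)] = line
    return [(regular[key], alternate[key]) for key in regular if key in alternate]


if __name__ == "__main__":
    sample = [
        "=245  10$6880-01$aSensō to heiwa",
        "=650  10$aWar",
        "=880  10$6245-01/$1$a戦争と平和",
    ]
    for romanized, original_script in pair_linked_fields(sample):
        print(romanized, "<->", original_script)
```

Once the fields are paired, a reconciliation step can try the original-script heading from the 880 as well as the romanized form.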

The second change is in the rules file itself. I’ve profiled the 880 field, as well as a wide range of other collections.

Finally, I’ve added a note to the Main Window that helps users find their rules file for editing and points to the knowledge-base articles and videos explaining the process. (see below)

Other Changes:

Other changes that will be made in this update:

ISSN Report — tweaked the process due to a bug.
MARCCompare — added a new output type; in addition to the HTML output, there will be a diff file output
Preferences Window/Language — Added a help icon that points to the knowledge-base articles related to fonts and font recommendations.

These changes will be part of the Sunday update and, at this point, look like they will apply to all versions of MarcEdit (though I do have a few UI tweaks that I need to complete on the Mac side).

If you have questions, let me know. Otherwise, these changes will be made available on ~~9/11/2016~~ 9/12/2016 [evening].

–tr