Jun 21 2014

Thought I’d give the folks that are interested an idea of where I am with the MarcEdit update.  In terms of new features, the list is pretty much set at this point.  I have one feature request regarding conditional field editing by field position that I’m still mulling over, but otherwise, I’m pretty much finished.  The last things to do right now are to finish the installers for Linux and Mac systems, and then make sure that the installs work exactly as I want them to.  One thing that I have found with the Linux and Mac systems – the Z39.50 dependencies are one of the biggest hurdles (I can’t really automate their install well) – so I’m working on a method of creating a Z39.50 web service at marcedit.reeset.net that I can then call from non-Windows systems.  This will be functionally equivalent to the locally loaded Z39.50 libraries, though there will likely be a small performance hit (you shouldn’t notice too much), and it will remove a major dependency, making the program much easier to install.  Also, I’m just not sure how often the Z39.50 client actually gets used at this point (when can we finally phase that out?) – so I’m hoping this will be a good enough compromise.

MarcEdit on Linux:

At this point, I’m testing MarcEdit on Suse and it looks to be going well.  The installer is now a single .sh file that will expand itself, create the directory, run the bootloader, and create a shortcut on the desktop.  What it doesn’t do yet is make sure Mono is installed – I’m not certain if that will be part of the first installer, but this should get you to the point that, if you have Mono installed, you just run the installer and an icon will be on the desktop ready to be activated.

Couple of the current screenshots:

Figure 1: Start Screen on Linux

Figure 2: MarcEditor

Figure 3: Delimited Text Translator

Figure 4: RDA Helper


Figure 5: MARC Tools Window

At this point, on Linux, the version looks and feels pretty much like you’d expect the Windows version to work.  I’m testing performance, especially with larger files in the Editor (for rendering) and it looks like all UI issues should be dealt with in MarcEdit 6.  I’m starting the process of doing the same type of testing using the current version of OS X with Mono 3.4 – hopefully I’ll see the same UI smoothness.

Some Stats on the Update

Some stats – refactoring, I’ve removed close to 15,000 lines of code while streamlining older processes (these are things you likely won’t see, outside of maybe a few things being faster).  New code – I’ve written close to 27,000 lines of code developing new features or enhancements for this update.  Much of this has been around multiple platform support, new functions in the editing tools, etc.  Once MARCCompare has been completed, that number will jump to around 37,000 lines – so this will be a significant update in which a number of tools and functions have had significant revisions made to improve code maintenance or performance, add new features, correct bugs, or all of the above.

Below you will find a list of topics that represent some of the areas where changes are being made.  This is a subset of a report from the internal issue tracker I use for tracking issues/enhancement requests.  At this point, it still looks like this update will be scheduled for mid-July.



Tracking Report



Sharing Tasks which refer to other tasks


Add/Delete Field — Find What Field


Refresh the Delimited Text Translator


MARC Tools window refresh


Changing Capital Letters with Diacritics to lower case


RDA Helper


Script Wizard — add/delete field with modifier


Adding multiple fields in the Add/Delete Function


UI Issue — Validation


Task List — Swap Field not saving Process one per field


Allow tasks to have comments


Edit Subfield and Swap Field Edits


Task Manager & Task List tool


Task List: Allowing Description to show in Task Manager and Task Editor


Plugin Updates — make it easier to track plugin updates


Ability to import RIS format


Mac Fonts


Prompt for confirmation of task deletion


Koha Integration – unable to authenticate when using self-signed certificates


merge records tool — customizing match points causes merges to not occur


Validate Temp File Path Before Generating files


 Posted by at 8:53 am
Jun 12 2014

This summer, I have been spending a lot of my free development time preparing MarcEdit to make the transition from 5.9 to 6.0.  This will be a big update, and at the same time, an update where I’m not sure folks will necessarily see many of the changes.  But maybe that will be a good thing, I don’t know.  So, a couple of notes about the update.

User Interface Changes

So, MarcEdit has been around for a very long time – since 1999 to be exact.  Because of that, there are some very old tools in the application…tools whose interfaces haven’t been updated in 6-7 years – even if the underlying functionality has changed.  Well, I’m taking the time to refresh a number of the forms found within the application.  The places where most users will see these changes are in the Delimited Text Translator (the Windows 2000 Wizard interface is going away) and the MARC Tools interface.  There are a few others that will see updates, but these are the two that you’ll see for sure.

Delimited Text Translator (screen 1)


Delimited Text Translator (screen 2)


MARC Tools Revisions (proposed)


These are proposed changes – but are being done to make the look more consistent as well as make some screens easier to work with (the MARC Tools window for instance was getting really long).

Bug fixes

For two months, I’ve been going through my internal ticketing system and closing every open issue.  When I post this update, every bug report that has been saved will be closed.  This has required rewriting a lot of code within the application – which for many older functions is definitely a good thing.  At present, this means that 13 bug tickets have been closed.  All are minor, with general workarounds – but this should have a practical impact for folks.

New features

At present, 7 enhancements/new features have been added to the program.  The remaining change to be completed (if it’s completed) will be the re-inclusion of MARCCompare (or RobertCompare as I fondly referred to it).  This is a specialized diff tool for comparing MARC records that is smarter than just a plain diff because it takes into account MARC structured data.

Cross Platform Work

One of the pseudo-supported options in MarcEdit has been the ability to run the tool on non-Windows platforms.  Being written in C#, the program runs fine under Mono on both Linux and Mac systems.  However, this has traditionally worked much better for Linux users because Mono was created by a bunch of people on Linux, and I work a lot on Linux so I would test there.  One of the big efforts this time around has been to try and identify where UI issues or other issues exist and put solid support in place for folks interested in working with MarcEdit within non-Windows environments.  The tool will likely always run best on Windows – but I’d like to provide a version with installers for both Linux and Mac systems, and 6.0 will represent the first attempt at that.  I’ll admit, I have some trepidation around doing this and providing more formalized support because I don’t own a Mac, so I won’t get to test and do the kind of dedicated development I might like – but I have regular enough access to one that I’m willing to give it a go.  At the very least, users will get a much easier process for installing, and that at least is a win.

Go live date

This is still in the air a little bit.  I’d hoped for early July but my wife and I are buying a house, I’m doing a little travelling for work – so this is definitely pushing things back.  But ideally, these changes will start being rolled out in July 2014.

If you have questions – feel free to give me a holler.


 Posted by at 1:15 am
May 14 2014

Since last December, I’ve had the opportunity to spend a good deal of time working with the OCLC WorldCat Metadata API.  The focus was primarily around kicking the tires, and then eventually developing some integration components with MarcEdit, as well as a C# library (https://github.com/reeset/oclc_api) for those that may have use of such things.

However, most folks in libraries tend to not use C# – they tend to focus on lighter-weight languages like ruby, python, or PHP.  Well, I work in ruby as well, so I decided to take the C# library and port it to Ruby.  The resulting code can be found here: https://github.com/reeset/wc_metadata_api

It’s pretty straightforward – and provides wrappers for all the current functionality found in the API, with the caveat being that the defined transfer format between OCLC and the client is in MARCXML, and response formats are in application/atom+xml.  The API supports a handful of other formats, but I’ve standardized on MARCXML since it’s well understood and there are lots of tools that can work with it.

The code isn’t well commented at this point.  I’ll take some time over the next few days to add some commenting, improve exception handling, and possibly add a few helper functions to make message processing easier – but this should help anyone interested in knowing how the API works get a better understanding.


Code Example:

** Get Local Bib Record

require 'rubygems'
require 'wc_metadata_api'

key = '[your key]'
secret = '[your secret]'
principalid = '[your principal_id]'
principaldns = '[your principal_idns]'
schema = 'LibraryOfCongress'
holdingLibraryCode = '[your holding code]'
instSymbol = '[your oclc symbol]'

client = WC_METADATA_API::Client.new(:wskey => key, :secret => secret, :principalID => principalid, :principalDNS => principaldns, :debug => false)

response = client.WorldCatReadLocalBibRecord(:oclcNumber => '338544583', :schema => schema, :holdingLibraryCode => holdingLibraryCode, :instSymbol => instSymbol)

puts response
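Since the responses come back as application/atom+xml wrapping MARCXML, here is a minimal sketch of pulling the MARC record out of such a response using REXML. The sample envelope, the helper names extract_marc_record and subfield_text, and the note text are all made up for illustration. The real response from OCLC may be shaped differently, and this is not part of wc_metadata_api.

```ruby
require 'rexml/document'

# Hypothetical stand-in for an Atom response; the real envelope returned by
# OCLC may contain different or additional elements.
SAMPLE_RESPONSE = <<~XML
  <entry xmlns="http://www.w3.org/2005/Atom">
    <content type="application/xml">
      <record xmlns="http://www.loc.gov/MARC21/slim">
        <controlfield tag="004">338544583</controlfield>
        <datafield tag="500" ind1=" " ind2=" ">
          <subfield code="a">Local note</subfield>
        </datafield>
      </record>
    </content>
  </entry>
XML

# Prefix-to-URI mapping for the MARCXML ("MARC21 slim") namespace.
MARC_NS = { 'marc' => 'http://www.loc.gov/MARC21/slim' }

# Return the first MARCXML <record> element found anywhere in the document,
# or nil if there is none.
def extract_marc_record(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.first(doc, '//marc:record', MARC_NS)
end

# Return the text of one subfield, or nil if the field or subfield is absent.
def subfield_text(record, tag, code)
  path = "marc:datafield[@tag='#{tag}']/marc:subfield[@code='#{code}']"
  node = REXML::XPath.first(record, path, MARC_NS)
  node && node.text
end

record = extract_marc_record(SAMPLE_RESPONSE)
puts subfield_text(record, '500', 'a')
```

Matching on the MARCXML namespace rather than on the position of the element means the lookup does not depend on how the Atom wrapper is nested.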

 Posted by at 10:31 pm
Apr 04 2014

One of the new functions in MarcEdit is the inclusion of an Edit Field Data function.  For the most part, batch edits on field data have been handled primarily via regular expressions using the traditional replace function.  For example, if I had the following field:

=999  \\$zdata1$ydata2

And I wanted to swap the subfield order, I’d use a regular expression in the Replace function and construct the following:
Find: (=999.{4})(\$z.*[^$])(\$y.*)
Replace: $1$3$2
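For anyone who wants to try the pattern outside of MarcEdit, here is a small Ruby sketch of the same find/replace. It only illustrates how the three groups line up. Ruby writes back-references as \1 rather than $1, and MarcEdit's own regular expression engine may differ in details.

```ruby
# Mnemonic line from the example. In a single-quoted Ruby string each "\\"
# is one literal backslash, so this is: =999  \\$zdata1$ydata2
line = '=999  \\\\$zdata1$ydata2'

# Same pattern as the Find box. Group 1 is the tag plus the four characters
# before the data (two spaces and two indicators), group 2 is the $z
# subfield, and group 3 is the $y subfield.
pattern = /(=999.{4})(\$z.*[^$])(\$y.*)/

# Re-emit the groups in the order 1, 3, 2 to swap the two subfields.
swapped = line.sub(pattern, '\1\3\2')

puts swapped
```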

This works well – when needing to do advanced edits.  The problem was that any field edits that didn’t fit into the Edit Subfield tool needed to be done as a regular expression.  In an effort to simplify this process, I’ve introduced an Edit Field Data tool.


This tool exposes the data, following the indicators, for edit.  So, in the following field:
=999  \\$aTest Data

The Edit Field Data tool could interact with the data: “$aTest Data”.  Ideally, this will potentially simplify the process of doing most field edits.  However, it also opens up the opportunity to do recursive group and replacements.

When harvesting data, often times subjects may be concatenated with a delimiter, so for example, a single 650 field may represent multiple subjects, separated by a delimiter of a semicolon.  The new function will allow users to capture data, and recursively create new fields from the groups.  So for example, if I had the following data:
=650  \7$adata – data; data – data; data – data;

And I wanted the output to look like:
=650  \7$adata – data
=650  \7$adata – data
=650  \7$adata – data

I could now use this function to achieve that result.  Using a simple regular expression, I can create a recursively matching group, and then generate new fields using the “/r” parameter.  So, to do this, I would use the following arguments:
Field: 650
Find: (data – data[; ]?)+
Replace: $$a$+/r
Check Use Regular Expressions.


The important part of the above expression is in the Replacement syntax.  The mnemonic /r at the end of the string tells MarcEdit that each pass of the recursion should result in a new line (i.e., a new field).
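To make the expected result concrete, here is a rough Ruby sketch that produces the same output by splitting on the delimiter. This is not how MarcEdit implements the /r recursion. The function name split_field is made up, and it assumes the mnemonic layout of "=TAG", two spaces, two indicators, then a single subfield.

```ruby
# Split one delimited field into several fields, one per value.
# Assumes MarcEdit's mnemonic layout: "=TAG", two spaces, two indicators,
# then the subfield data (so the subfield code starts at character 8).
def split_field(line, delimiter: ';')
  prefix = line[0, 8]   # e.g. "=650  \7"
  code   = line[8, 2]   # e.g. "$a"
  values = line[10..-1].to_s.split(delimiter).map(&:strip).reject(&:empty?)
  values.map { |value| "#{prefix}#{code}#{value}" }
end

source = '=650  \7$adata – data; data – data; data – data;'
puts split_field(source)
```

Each value is trimmed and empty pieces are dropped, so the trailing semicolon in the source field does not produce an empty 650.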

This new function will be available for use as of 4/7/2014. 


 Posted by at 8:44 pm
Mar 19 2014

After posting the update last night, a user noted that the Delimited Text Translator’s 008 generation utilized some older defaults that were causing some validation issues.  So, I’ve posted a limited update to correct this specific problem.

If you have automatic updating turned on, MarcEdit will prompt you automatically.  Otherwise, you can download from:


 Posted by at 9:38 pm
Mar 18 2014

A new MarcEdit update has been posted.  Here are the notes:

  • RDA Helper: Enhancement — Added option to determine if the 040$e is added, in accordance with the PCC Hybrid records documentation.
  • RDA Helper: Enhancement — Updated the GMD generation to take into account additional field data in an attempt to account for accompanying data (rather than alternative forms)
  • CDS-ISIS: Enhancement — Updated the process to account for some additional conversion options.  Updated the field remapping.  I.E., changing from 003=>999, 003,999.
  • CDS-ISIS: Bug Fix — When using the advanced option, appending an initial subfield was being dropped.  Fixed.
  • MarcEditor: Enhancement — Added a new Edit function; Remove Empty Fields/Subfields.  This is used to fix errors in a file, where empty fields or subfields are present.
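As an illustration of what removing empty fields and subfields means for a mnemonic file, here is a hedged Ruby sketch. The function names clean_field and clean_lines are hypothetical, the rules are a guess at reasonable behaviour, and MarcEdit's own Remove Empty Fields/Subfields function may treat edge cases differently.

```ruby
# Drop empty subfields from one mnemonic line; return nil if nothing is left.
# Assumes "=TAG", two spaces, two indicators, then "$"-delimited subfields.
# Control fields (LDR, 00X) have no subfields and are kept only if non-blank.
def clean_field(line)
  tag = line[1, 3].to_s
  if tag == 'LDR' || tag.start_with?('00')
    return line[6..-1].to_s.strip.empty? ? nil : line
  end

  prefix = line[0, 8]
  # The first piece is whatever precedes the first "$" (normally empty).
  subfields = line[8..-1].to_s.split('$').drop(1)
  # A subfield is empty when nothing but whitespace follows its code.
  kept = subfields.reject { |sf| sf[1..-1].to_s.strip.empty? }
  kept.empty? ? nil : prefix + kept.map { |sf| "$#{sf}" }.join
end

# Clean a list of field lines, dropping the fields that end up empty.
def clean_lines(lines)
  lines.map { |line| clean_field(line) }.compact
end

fields = [
  '=001  ',
  '=245  10$aTitle$b',
  '=500  \\\\$a',
  '=650  \\0$aCats$x $vFiction'
]
puts clean_lines(fields)
```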

The update is available either via the automatic update or via URL at:


 Posted by at 7:29 pm
Mar 02 2014

I’m happy to report that I finally completed work on the latest MarcEdit update.  This change provides updates specifically to the RDA Helper and MarcEdit’s implementation/interaction with the OCLC WorldCat Metadata API.  The full list of changes can be found below. 

  • RDA Helper: Modified the way that the 260/264 conversion occurs, to bring it more in line with the current practice and examples.  This includes retaining the 264$c and creating a second 264\4 when the program can determine that the data defined in the 260$c is a copyright statement.
  • RDA Helper: Added a qualifier option for the 015/020/024/027.  You can turn this on or off (default is off)
  • RDA Helper: I’ve updated some of the abbreviations
  • RDA Helper: Fixed a bug in the 260/264 processing that didn’t always clean up superficial punctuation during the translation process.
  • OCLC Integration: OCLC Search – I’ve added the ability to search by the OCLC number, ISSN, and ISBN indexes as a single or Boolean search.
  • OCLC Integration: In the configuration area, I’ve added some additional error checking.
  • OCLC Integration: In the configuration and batch records update section, I’ve added an option to Lookup your Holdings Codes.  This will show all of your institution’s valid holdings codes (something that folks have sometimes had trouble finding).  This makes use of the OCLC Metadata API, specifically the Holdings retrieval code.
  • OCLC Integration:  I’ve added the holdings retrieval code to the OCLC API Library.  This code has been synced to github.
  • OCLC Integration: Local Bibliographic Records Support.  This is what I’m still working on.  The program will allow you, on search, to now choose the option to download a local bib record (rather than the OCLC bib record) if one is available.  This allows users who are using OCLC WMS or have Local Bibliographic Records, the ability to create, update, and delete these records. 
  • OCLC Integration: I’ve added the local bibliographic data records code to the OCLC API Library.  This code has been updated to github.


If you have MarcEdit, you can download the program via the automated update tool, or you can download directly from:


 Posted by at 1:28 pm
Feb 27 2014

Since OCLC made their Metadata APIs available, I’ve spent a good deal of time putting them through their paces and looking for use cases that these particular APIs might solve, and if/how MarcEdit might be an appropriate platform to take advantage of some of this new functionality.  Looking over the new Metadata API, it’s pretty easy to see where the focus was – providing some capacity to have read/write access to the WorldCat database, primarily the global and local bibliographic data.  In addition, the API provided the ability to set institutional holdings on records, though not to work with Local Holdings Records.

For that reason, the first round of MarcEdit development utilizing these APIs was focused primarily on working with the master bibliographic and institutional holdings records.  These are two areas where I tend to receive regular queries from users in the midst of reclamation projects or weeding projects who need to make changes to hundreds or thousands of records in the WorldCat database.  The new APIs allowed me to provide an answer to the two most common questions: 1) can I batch update/delete holdings on particular OCLC records, and 2) can I batch upload records to OCLC?  While I’d definitely argue (and I think OCLC probably would agree) that these APIs are not designed to be used with batch operations in mind (i.e., they are slow) – they finally offered a way to communicate with the WorldCat database in real-time…and regardless of performance, this feels like (and is) a big win for libraries that choose to work with OCLC.  If you are interested in how this initial work was implemented, you can read about it here: http://blog.reeset.net/archives/1245

After releasing the first MarcEdit builds and making subsequent refinements to the integration work, I’ve had the opportunity to speak with more OCLC WMS users and individuals that make use of the Local Bibliographic Data files and decided that it was time to start working on adding support for this type of data.

The challenge with working with Local Bibliographic Data records is that there is no way (at least through the API) to know if a record has a local bibliographic record attached without making multiple queries and a set of special calls to the API about a specific OCLC number.  This means that, for all intents and purposes, the process of working with local bibliographic data in MarcEdit assumes one of two things:

  1. That the user knows what data has these records
  2. That the user is likely creating new ones.

Secondly, I’ve tried to integrate this new functionality into the existing tools developed for use with the other OCLC Metadata APIs.  So for example, users looking to query a set of records and retrieve the attached local bibliographic records now will see a new option on the OCLC Search box that denotes if the master or local bibliographic record should be extracted.


The problem is that, at this point, MarcEdit doesn’t know if a local bibliographic record actually exists.  Certainly, the tool could pre-query every OCLC number returned as part of a search, but that approach greatly increases the back and forth communication between MarcEdit and OCLC via the API, and remember, this isn’t an API that feels designed for batch operations.  So, rather than query, MarcEdit provides an option to download the local bibliographic record if it is present.  If this checkbox isn’t selected, the program will download the master bibliographic record for edit.  For users looking to create a local bibliographic record, the process is the same.  Local bibliographic records must be attached to a master OCLC number – so a user would query a record, check the option to download the local bibliographic record, and then attempt the download.

When downloading a local bibliographic file, MarcEdit will prompt users, asking if the tool should automatically generate a local bibliographic file for edit in MarcEdit if one doesn’t exist on a master record.


To generate new records, MarcEdit utilizes a new template within the application – local_bib_record_template.mot.  The template contains the following data:

=LDR  00000n   a2200000   4500
=004  [OCLC_NUM]
=940  \\$a[OCLC_CODE]

This template looks slightly different from a normal MarcEdit template, in that it includes a number of data points that MarcEdit will automatically generate as part of the record generation process.  This is necessary because specific data must be present in order for a local bibliographic record to be valid.  For example, a local bibliographic record must have the OCLC number that it’s attached to – data that is found in the 004.  The 935 represents a local system generated number, a number MarcEdit generates as a timestamp indicating record creation time, and finally, the 940 includes the organizational code, or the code that normally would appear in the 040 of a bibliographic record.  This information is stored and used as part of the API profile, so MarcEdit includes that data in the record generation process.
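The placeholder substitution can be sketched in a few lines of Ruby. This is only an illustration of the idea, not MarcEdit's code. The fill_template helper and the institution code XXX are made up, and only the two placeholders visible in the template above are handled.

```ruby
# The template text as shown above. The quoted heredoc keeps the backslashes
# in the 940 indicators exactly as written.
TEMPLATE = <<~'MOT'
  =LDR  00000n   a2200000   4500
  =004  [OCLC_NUM]
  =940  \\$a[OCLC_CODE]
MOT

# Replace each [NAME] placeholder with its value from the hash; placeholders
# with no matching key are left untouched so they are easy to spot.
def fill_template(template, values)
  template.gsub(/\[([A-Z_]+)\]/) do
    key = Regexp.last_match(1)
    values.fetch(key) { "[#{key}]" }
  end
end

# 'XXX' is a placeholder institution code used only for this example.
record = fill_template(TEMPLATE, { 'OCLC_NUM' => '338544583', 'OCLC_CODE' => 'XXX' })
puts record
```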

A local bibliographic record is in many ways like the master bibliographic record, just with data only applicable to an institution.  OCLC has a set of documentation around what kinds of data can be stored in these records.  Using a test record in WorldCat, I extracted the following example:


In this record, the data breaks down in the following way:

  • 001 – this is the unique lbd control number
  • 004 – this is the oclc number for the master bibliographic record that this local record is attached to
  • 005 – this is the transaction date; this value must match when updating records as OCLC uses this as a point of validation.
  • 500, 790 – in this record, these are the local data fields…data that is separate from the master bibliographic record and visible only to my users who see our local bibliographic data.
  • 935 – this is a locally defined system number – when MarcEdit generates this number, it is a system timestamp.
  • 940 – this is the institution code.

As one can see, a local bibliographic record can be fairly brief (or verbose depending on the notes added to the record).

To update/create/delete a local bibliographic holdings file (single or batch), a user would start with a local bibliographic record.  Even delete operations require the record – you cannot just pass the lbd control number or a list of OCLC numbers, at least at this point.  The API requires the record to be sent through the service as a type of validation.

Updating this data requires using the Add/Delete/Update Local Bibliographic Data option:


Selecting this option will open the following window:


When dealing with Local Bibliographic data, the option will be selected, and the option to delete those records is also presented.  Users not working with local bibliographic data will see the same dialog, but the Process as Local Bibliographic Records option will be left unchecked and the Delete Records option will not be visible.  At this point, users can process their records, and MarcEdit will return the response codes provided by the API to determine if the updates were successful or not.

My guess is that, like the first pass through the API, the use of these methods and the tools that make use of them will be refined with time and use, but I think that they provide a good start.  Of course, at this point, I’ve reached the limit in terms of functionality that the Metadata API provides.  In looking at this toolset, it’s pretty clear that these APIs were primarily envisioned for individual, real-time editing of the WorldCat database.  I have a feeling that the batch holdings tools, and now the ability to upload bibliographic and local bibliographic data in batches, probably fall outside of the use cases OCLC identified when they first released these to the public.  But they work, though a little slowly, and provide some capacity to work directly with the WorldCat database.  At the same time, the API is limited and missing key features that folks are currently asking for – most notably the ability to work with local holdings data, the ability to validate records, and a better process for search and discovery (as the present Search APIs are woefully inadequate for all but the most vanilla uses).  Hopefully, by doing some simple integration work in MarcEdit and providing some useful tools around the Metadata API, it can provide a catalyst for additional work, additional functionality, and additional innovation on the side of OCLC – and continue to push the cooperative to provide more transparent and deeper access to the WorldCat resources and holdings.

Finally, the functions discussed here will be made available for download on March 2.


 Posted by at 11:13 pm

MarcEdit 5.9 Update

Feb 17 2014

I’ve posted a new update to MarcEdit.  Change list is as follows:

  • Enhancement: Add_Field_Stream API — Added character encoding typing to ensure compatibility with script languages.
  • Enhancement: Delete_Field_Stream API — Added character encoding typing to ensure compatibility with script languages.
  • Enhancement:  Added enhanced error checking code and debugging options to OCLC functions.
  • Enhancement: Added option to use RDA encoded work forms for users using MarcEdit to create new records.

To utilize the new RDA work forms, you need to check a new option in the MarcEdit Preferences.  Open the Preference window, select MarcEditor Preferences, and Check the Use RDA Templates option (see below).  Then click OK.  When you select a new record template from within the MarcEditor, the program will utilize the RDA template for the material type if available.

The update can be downloaded via the automatic updating tool, or at:



 Posted by at 10:19 pm