Dec 19, 2014

Over the past couple of weeks, I’ve been working on expanding the linking services that MarcEdit can work with in order to create identifiers for controlled terms and headings. One of the services that I’ve been experimenting with is NLM’s beta SPARQL endpoint for MESH headings. MESH has always been a bit foreign to me. While I had been a cataloger in my past, my primary area of expertise was with geographic materials (analog and digital), as well as traditional monographic data. While MESH looks like LCSH, it’s quite different as well. So, I’ve been spending some time trying to learn a little more about it, while working on a process to consistently query the endpoint to retrieve the identifier for a preferred term. It’s been an enlightening process, but also one that has led me to think about how I might create a process that could be used beyond this simple use case, and potentially provide MarcEdit with an RDF engine that could be utilized down the road to make it easier to query, create, and update graphs.

Since MarcEdit is written in .NET, this meant looking to see what components currently exist that provide the type of RDF functionality that I may need down the road. Fortunately, a number of components exist; the one I’m utilizing in MarcEdit is dotnetrdf. The component provides a robust set of functionality that supports everything I want to do now, and should cover what I want to do later.

With a toolkit found, I spent some time integrating it into MarcEdit, which is never a small task. However, the outcome will be a couple of new features to start testing out the toolkit and start providing users with the ability to become more familiar with a key semantic web technology, SPARQL. The first new feature will be the integration of MESH as a known vocabulary that will now be queried and controlled when run through the linked data tool. The second new feature is a SPARQL Browser. The idea here is to give folks a tool to explore SPARQL endpoints and retrieve the data in different formats. The proof of concept supports XML, RDFXML, HTML, CSV, Turtle, NTriple, and JSON as output formats. This means that users can query any SPARQL endpoint and retrieve data back. In the current proof of concept, I haven’t added the ability to save the output – but I likely will prior to releasing the Christmas MarcEdit update.

Proof of Concept

While this is still somewhat conceptual, the current SPARQL Browser looks like the following:


At present, the Browser assumes that data resides at a remote endpoint, but I’ll likely include the ability to load local RDF, JSON, or Turtle data and provide the ability to query that data as a local endpoint. Anyway, right now, the Browser takes a URL to the SPARQL Endpoint, and then the query. The user can then select the format in which the result set should be output.

Using NLM as an example, say a user wanted to query for the specific term: Congenital Abnormalities – utilizing the current proof of concept, the user would enter the following data:

SPARQL Endpoint:


PREFIX rdf: <>
PREFIX rdfs: <>
PREFIX xsd: <>
PREFIX owl: <>
PREFIX meshv: <>
PREFIX mesh: <>

SELECT distinct ?d ?dLabel 
WHERE {
  ?d meshv:preferredConcept ?q .
  ?q rdfs:label 'Congenital Abnormalities' . 
  ?d rdfs:label ?dLabel . 
}
ORDER BY ?dLabel 

Running this query within the SPARQL Browser produces a result set that is formatted internally into a Graph for output purposes.
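For anyone who wants to try the same kind of request outside of MarcEdit, here is a minimal Python sketch of how a query can be sent to an endpoint over HTTP GET, following the standard SPARQL protocol (a `query` parameter plus an `Accept` header to choose the output format). This is not how MarcEdit or dotnetrdf does it internally. The endpoint URL and the function name `build_sparql_request` are hypothetical placeholders, and the query shown is a generic one rather than the MESH query above.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(endpoint, query, accept="application/sparql-results+json"):
    """Build (but do not send) an HTTP GET request for a SPARQL query.

    The query text travels in the `query` URL parameter, and the Accept
    header asks the endpoint for a particular result format.
    """
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": accept})


# Hypothetical endpoint, used only for illustration.
request = build_sparql_request(
    "https://example.org/sparql",
    "SELECT * WHERE { ?s ?p ?o } LIMIT 5",
)
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would actually send it; swapping the `accept` value (for example to `text/csv`) is the usual way to ask for a different output format, though which formats are supported depends on the endpoint.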




The images snapshot a couple of the different output formats.  For example, the full JSON output is the following:

{
  "head": {
    "vars": [
      "d",
      "dLabel"
    ]
  },
  "results": {
    "bindings": [
      {
        "d": {
          "type": "uri",
          "value": ""
        },
        "dLabel": {
          "type": "literal",
          "value": "Congenital Abnormalities"
        }
      }
    ]
  }
}

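The JSON output follows the standard SPARQL results layout (a `head` with the variable names and a `results.bindings` list), so pulling the identifier and label back out is straightforward. A minimal Python sketch follows; the function name `extract_bindings` and the URI in the sample data are placeholders I made up for illustration, not the real MESH identifier.

```python
import json


def extract_bindings(results_json):
    """Flatten SPARQL JSON results into a list of {variable: value} rows."""
    doc = json.loads(results_json)
    variables = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        # A variable may be unbound in a given row, so check before reading it.
        rows.append(
            {name: binding[name]["value"] for name in variables if name in binding}
        )
    return rows


# Sample data shaped like the output above; the URI is a placeholder.
sample = """
{
  "head": {"vars": ["d", "dLabel"]},
  "results": {
    "bindings": [
      {
        "d": {"type": "uri", "value": "http://example.org/placeholder-id"},
        "dLabel": {"type": "literal", "value": "Congenital Abnormalities"}
      }
    ]
  }
}
"""

rows = extract_bindings(sample)
print(rows)
```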
The idea behind creating this as a general purpose tool is that, in theory, it should work for any SPARQL endpoint. For example, the same type of exploration can be done against the Project Gutenberg Metadata endpoint using the Browser.


Future Work

At this point, the SPARQL Browser represents a proof of concept tool, but one that I will make available as part of the MARCNext research toolset in the next update. Going forward, I will likely refine the Browser based on feedback, but more importantly, start looking at how the new RDF toolkit might allow for the development of dynamic form generation for editing RDF/BibFrame data…at least somewhere down the road.



Aug 16, 2013

**A number of members of the MarcEdit community provided feedback while I was working on these changes. Specifically, thanks to Heidi Frank (NYU) and Jim Taylor for contributing their time and some artistic skill in creating some of the new functional icons.

In addition to a handful of bug fixes and enhancements, one of the big changes coming to the next MarcEdit update will be around the UI. I’ve been taking some time and collapsing menus to try and shorten them a bit (they are getting long) and refreshing a few of the screens and tools. The first screen to be refreshed, with some significant enhancements, is the start screen.

Current Start Screen

The MarcEdit start screen has been largely unchanged for close to 5 years. It has provided access to a handful of tools and utilities that I have heard are fairly commonly used.

Current MarcEdit Start Screen

Over the years, I have periodically changed the tools and utilities available on the start page, but by and large, it has stayed static.

Updated Start Screen

The next update will reflect a shift in the start screen design. First, the page will move from a more textual/information screen to one that is more reliant on both graphics and text to help users find the right tool. Secondly, the start screen will be customizable. The screen will provide the ability for users to define which tools they want to have quick access to.

Updated Default Screen

The Updated Start Screen will include larger images, with text, to help users quickly locate the tool that they are looking for on the start screen. However, unlike past versions, users can change the tools available from this screen. By clicking on the lower right hand configurations icon, or selecting Preferences from the Tools menu, users will be presented with the following new configuration option:

New Configuration Options

The new configuration options pull out the 12 most commonly used tools/add-ins. Users can select up to 4 of these items and place them on their start screen. By selecting new items and clicking OK, the user will find that their application layout changes:

User Configured Interface

Here, I changed the default options to select the Delimited Text Translator, the Merge Records Tool, the RDA Helper, and the Call Number Generator. These will now be available to me on the front screen whenever I open MarcEdit. And since these configuration changes are linked to a user’s profile, multiple users on the same computer could have different Start Screens depending on how they utilize the program.

Selecting 3 items

As noted above, you can select up to 4 user tools to display on the front page.  But users have the option to select as few options as they want as well.  In this example, I removed an option and only selected the most common 3 tools for the Start Screen.



These UI changes are the first of what will be a handful of changes that I’ll be making to the tool over the next couple of months as I refresh the interface, clean up some old code and look to improve some of the workflows in the application. I’ll be posting wireframes through the MarcEdit listserv when I’m planning major changes, so if you are interested in having a voice on upcoming changes, keep an eye open on the MarcEdit list.


Nov 22, 2012

One of the often requested enhancements to the MarcEdit Delimited Text Translator is the ability to auto generate the arguments list.  For many users, their spreadsheets or delimited text documents include a line at the beginning of the document defining the data found in the file.  I’ve often had folks wonder if I could do anything with that data to help auto generate the arguments list used by MarcEdit to translate the data. 

Well, in anticipation of Thanksgiving, I finished working on what will be the next MarcEdit update. I won’t post it till the weekend, but this new version will include an Arguments Auto Generation button that will allow MarcEdit to capture the first line of a data file and, if properly formatted, auto configure the Arguments List.


The format supported by the Auto Generation feature is pretty straightforward.  It essentially is the following:  Field$Subfield[ind1ind2punct].

Let me break down the format definition:

  • Field – represents the field to be mapped to, i.e.: 245.  This is a required value.
  • $Subfield – represents the subfield to be mapped to, i.e.: $a.  This is a required value.
  • ind1 – represents the first indicator.  This is an optional value, but if defined, indicator 2 must be defined.
  • ind2 – represents the second indicator.  This is an optional value, but if indicator 1 is defined, indicator 2 must also be defined.
  • punct – represents the trailing punctuation of the field.  This is an optional value.  However, if you wish to define the punctuation, you must define the indicator 1 and indicator 2 values as well.

Some examples of the syntax:

  • 245$a – no indicators are defined, the default indicators, 2 blanks, will be used.
  • 245$a10 – defines the field, subfield and indicators 1 and 2.
  • 245$a10. – defines the field, subfield, indicators 1 and 2, and defines a period as the trailing punctuation.

In MarcEdit, you can join fields together.  This allows users to join data in multiple columns into a single subfield.  In MarcEdit, joined fields are represented by an asterisk “*”.  If I wanted to join two or more fields, I can add an asterisk group to the field.  For example:

  • Field0:  *100$a10,
  • Field1:  *100$a10.

MarcEdit will interpret field 0 and field 1 as being joined fields because the asterisk marks them as joined.
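To make the header syntax a little more concrete, here is a rough Python sketch of how a single header value such as `245$a10.` or `*100$a10,` could be broken into its parts. This is only an illustration of the format described above, not MarcEdit’s actual parser; the names `Argument` and `parse_argument` are hypothetical, and I’m assuming a three-character field, a one-character subfield, and one character per indicator.

```python
import re
from typing import NamedTuple


class Argument(NamedTuple):
    field: str
    subfield: str
    ind1: str
    ind2: str
    punct: str
    joined: bool


# [*]Field$Subfield[ind1ind2[punct]] - the indicators must appear as a pair.
_PATTERN = re.compile(
    r"^(?P<join>\*)?"
    r"(?P<field>[0-9A-Za-z]{3})"
    r"\$(?P<subfield>[0-9A-Za-z])"
    r"(?:(?P<ind1>.)(?P<ind2>.)(?P<punct>.+)?)?$"
)


def parse_argument(token):
    """Split one header value into field, subfield, indicators and punctuation."""
    match = _PATTERN.match(token.strip())
    if match is None:
        raise ValueError(f"not a valid argument definition: {token!r}")
    return Argument(
        field=match["field"],
        subfield=match["subfield"],
        ind1=match["ind1"] or " ",  # default indicators are blanks
        ind2=match["ind2"] or " ",
        punct=match["punct"] or "",
        joined=match["join"] is not None,
    )


for token in ["245$a", "245$a10", "245$a10.", "*100$a10,"]:
    print(token, parse_argument(token))
```

A value like `245$a1` (only one indicator) is rejected by this sketch, which mirrors the rule that if one indicator is defined, the other must be as well.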

I’ve placed a video on YouTube to demonstrate the upcoming functionality.  You can find out more about it here:

If you have questions about this new function or suggestions, let me know.


MarcEdit 5.2 Update

Mar 08, 2010

Hi all,

I have just uploaded a new version of MarcEdit to the website. This version specifically addresses four bugs and introduces the Task Automation function into the application.

Bug Fixes:

  • Changes not saved in the MarcEditor:
    Under certain conditions, MarcEdit would lose track of changes made within the program.  This occurred primarily after changes had been made and then the Find All Function was used.  This has been corrected as of this version.
  • Save As…file not found error:
    When using Save As to Save data, MarcEdit would throw an error if the file being saved did not previously exist.  This has been corrected as of this version.
  • Invalid prompts to save a file before closing:
    MarcEdit tended to err on the side of always asking users to save their data before closing. However, even if a user had saved their data, the message may still have popped up. This has been corrected as of this version.
  • Find/Replace – replacing without using Match Case:
    While it is always recommended when doing global replacements to do them using the Match Case option – prior to this version, unchecking the match case option would cause the replacement to take too many characters.  This has been corrected as of this version.


  • Task Automation tool:
    The task automation tool is a recorder that allows users to chain together replacement functions. Sadly, I haven’t been able to add information to the help file; however, I did record the following video tutorial to provide some initial background to get started using the function. One quick note – this is a new tool, so when using it, please verify changes. Moreover, feedback is definitely appreciated.
    Video Tutorial:

You can download the updated version of MarcEdit at: MarcEdit_Setup.msi or the Linux/Mac/Other version build at:


Aug 04, 2009

I was asked the other day at AALL (American Association of Law Libraries) if MarcEdit could be used to move specific data from one field and replace the data currently present in another. So, an example – the ability to move data from a 260$c to the 008 position 7:4. You actually can, though it’s sadly not documented (one of those few hidden gems that have been created either for specific projects I or others have been working on). So how do we do it?

Open the Edit Subfield Function. In the Edit Subfield function, there is an option called Move Subfield data. That’s the one we want to check. Then, we enter the following (using the 260$c-008 example).

Field: 260 [Enter the field with the data that you wish to move]
Subfield: c
Find: [leave blank – though you can enter data here if you want to find something specific to move]
Replace: 008|7||

Ok, the Replace looks funny and it is.  There are essentially a handful of options you can set here (4) – I’m going to explain two for now (and will update this post when I update the official documentation). 

Each pipe “|” represents a delimiter.  The first two pipes are the most important:

  1. Field to move to
  2. Where to move (replace) the data

In the above example, we are moving data to the 008 and placing the data in position 7.  If I was placing the data into a subfield, I would have entered a subfield (example: c) here.  So, the edit form would look like the following:



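To illustrate what moving data into a fixed position means for a control field, here is a small Python sketch that overwrites four characters of an 008 starting at position 7 with a year pulled from a 260$c value. This is my own illustration of the idea, not MarcEdit’s code: the function names are hypothetical, the sample 008 is made up, and how MarcEdit actually trims the subfield data before placing it is not covered here.

```python
import re


def first_year(subfield_data):
    """Return the first 4-digit run in the subfield data (e.g. 'c2009.' -> '2009')."""
    match = re.search(r"\d{4}", subfield_data)
    if match is None:
        raise ValueError(f"no 4-digit year found in {subfield_data!r}")
    return match.group(0)


def place_in_control_field(control_data, position, value):
    """Overwrite len(value) characters of a control field, starting at position."""
    if position < 0 or position + len(value) > len(control_data):
        raise ValueError("value does not fit in the control field at that position")
    return control_data[:position] + value + control_data[position + len(value):]


# A made-up 40-character 008 with blanks in positions 7-10.
field_008 = "090804s" + " " * 8 + "xx " + " " * 17 + "eng d"

updated = place_in_control_field(field_008, 7, first_year("c2009."))
print(repr(updated))
```

The key point is that the control field keeps its length; the data is written over the existing characters rather than inserted.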

Aug 01, 2009

I need to send this out to the 15 or so folks that have agreed to be my first guinea pigs testing out a MarcEdit build on ‘Nix and Mac systems, but I wanted to document it here as well so I don’t forget to add it to the installation instructions later. (By the way, Mac UI rendering isn’t good. That’s not surprising, because Mono’s UI changes tend to show up correctly on ‘Nix first, then Mac, so I’m hopeful the planned 2.6 update in Sept. will correct many of the errors.)

In Windows, MarcEdit includes the yaz install as part of the application installation.  This means that when people install MarcEdit, all the dependencies that they need are installed as well.  With Linux, that won’t be the case.  On Linux, you will need to make sure that you install the yaz and yaz-devel packages.  Once installed, you need to make one more change (and here’s the trick).

In MarcEdit on Windows, the yaz dll has been renamed to yaz3.dll. The reason for this is that I don’t want to accidentally overwrite a previous installation of the software (in case other programs on the system are relying on older or newer versions of the library). This works fine in Windows, but on Linux, the problem is that the yaz components are installed as yaz (not yaz2, yaz3, etc). In Mono, the way that the framework makes calls to native libraries through PInvoke is to look for the linked file and then start checking the following locations for the following file names (using yaz3.dll as the example):

  • Application Path/yaz3.dll
  • Application Path/
  • Application Path/
  • Application Path/lib/yaz3.dll
  • Application Path/lib/
  • Application Path/lib/
  • System/lib/yaz3.dll
  • System/lib/
  • System/lib/

The problem is a simple one – when yaz is built either by source or package manager, you end up with a shared object called:  So, the simple solution is to set up a symlink from System/lib/ to System/lib/  So, on my Ubuntu install, that would be creating a symlink in the following path (/usr/lib/) using the following command:

  • ln

And that’s it.  Once I made that change, the Z39.50 client started working as expected, and now this information has been documented so I can make sure it makes it into the INSTALL.txt file.

Cheers everyone,


MarcEdit 5.1 update

Mar 22, 2009

Couple of quick changes. 

  1. Delete Fields – you can delete multiple fields using “x” syntax, where the ‘x’ is no longer case sensitive.  For example:
    a) 900 – deletes the 900 field
    b) 90x – deletes fields 900, 901, 902, 903, etc.
    c) 9xX – deletes fields 900, 901, 902, 903…910, 911…920, 921, etc.
    Originally, x’s had to be lower case.  I’ve modified the code so that this is no longer case sensitive.
  2. Edit Subfield Function:  When working with Control fields (00x fields) – MarcEdit used to require data in the Position element.  Now, if the position element is empty, the program will do a usual find/replace – including prepending and concatenating data.
  3. MarcEdit Preview Function (Bug Fix):  If a user loaded a MARC file directly into the MarcEditor – the program would lose the temp file – so if you clicked on the link to load the entire file into the MarcEditor, it would get lost.  This only occurred when users opened a .mrc (MARC) file directly into the MarcEditor.  In all other contexts, the preview link works as documented.
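The “x” wildcard from the Delete Fields item above can be sketched in a few lines of Python. This is just an illustration of the matching rule, not the code MarcEdit uses; the function names are hypothetical, and I’m assuming each x (upper or lower case) stands for exactly one digit.

```python
import re


def tag_pattern_to_regex(pattern):
    """Turn a tag pattern such as '90x' or '9xX' into a compiled regular expression."""
    if not re.fullmatch(r"[0-9A-Za-z]{3}", pattern):
        raise ValueError(f"expected a 3-character tag pattern, got {pattern!r}")
    # Each x/X becomes a single-digit wildcard; everything else must match literally.
    parts = ["[0-9]" if ch in "xX" else re.escape(ch) for ch in pattern]
    return re.compile("".join(parts))


def should_delete(pattern, tag):
    """Return True when the field tag is covered by the delete pattern."""
    return tag_pattern_to_regex(pattern).fullmatch(tag) is not None


for pattern in ["900", "90x", "9xX"]:
    matched = [tag for tag in ["856", "900", "903", "910", "921"] if should_delete(pattern, tag)]
    print(pattern, matched)
```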

If any of these affect your workflow, you can download the update here: MarcEdit_Setup.msi


MarcEdit Update

Mar 16, 2009

I just uploaded a new version of MarcEdit 5.1.  Most of the updates are related to changes in the MarcEditor.  So here’s the list of changes:

  1. MarcEdit Find (specifically when the regular expression option is selected) – Previously, when searches were done, items were located but the window didn’t scroll to the located item.  That’s been corrected.
  2. MarcEdit Replace All (Regular Expressions): One of the changes made during the last update of MarcEdit was to change the MarcEditor’s replace all function (when using regular expressions) from a single line evaluation to evaluating whole records.  This allows for the ability to perform replacement actions by evaluating multiple fields – but I had neglected to consider how this might break current workflows that relied on the previous functionality.  So, I’ve returned the functionality to evaluating single lines and added a switch for users that want to process data across multiple lines.  So for example:
    (=6.*)([^.]$) – this would evaluate field by field (line by line)
    (=6.*)([^.]$)/m – the /m tells MarcEdit to evaluate multiple lines – so this would run this expression against the entire record.
  3. MarcEdit Replace All (Regular Expressions):  The expression evaluator was too greedy – causing matches to blank records.  This should no longer happen.
  4. MarcEditor – when opening a blank .mrk file using the Open button – the window would hang.  That’s been corrected.
  5. MarcEditor – when opening a .mrk file by double clicking on it, then opening a new window – closing either window would close both windows.  This has been corrected.
  6. XSLT file updates – I’ve added Creative Commons Zero license headers to the FGDC stylesheets distributed in MarcEdit.
  7. Help File has been updated to cover some of the noted changes.
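For the Replace All change above, the difference between line-by-line and whole-record evaluation can be sketched in Python. Keep in mind this is only an approximation: MarcEdit uses the .NET regular expression engine, and treating a trailing `/m` as “run against the whole record with multiline anchors” is my assumption about the behavior for the purposes of this sketch. The function name `replace_all` and the sample record are made up.

```python
import re


def replace_all(record, expression, replacement):
    """Apply a replacement line by line, or across the whole record when the
    expression ends with the /m switch."""
    if expression.endswith("/m"):
        # Whole-record mode: one pass over all lines, $ matches at each line end.
        pattern = re.compile(expression[:-2], re.MULTILINE)
        return pattern.sub(replacement, record)
    # Default mode: each field (line) is evaluated on its own.
    pattern = re.compile(expression)
    return "\n".join(pattern.sub(replacement, line) for line in record.split("\n"))


record = "\n".join([
    r"=245  10$aCats",
    r"=650  \0$aCats",
    r"=650  \0$aDogs.",
])

# Add a trailing period to 6xx fields that lack one (line-by-line).
with_periods = replace_all(record, r"(=6.*)([^.]$)", r"\1\2.")
print(with_periods)

# An expression that spans two fields only matches in whole-record mode.
cross_field = r"(=245.*)\n(=650.*)"
unchanged = replace_all(record, cross_field, r"\2\n\1")
swapped = replace_all(record, cross_field + "/m", r"\2\n\1")
print(swapped)
```

The last two calls show why the switch matters: an expression that needs to see more than one field can never match when each line is evaluated separately.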

Couple of notes – I’m currently writing up some new notes on using MarcEdit on Linux.  Mono 2.0+ essentially has added all the functionality necessary to run MarcEdit on Linux.  I’ll be creating a handful of Youtube videos for folks interested in giving this a try.  As for running on a Mac – well, I’ll look at that next. 

You can download the new version of MarcEdit from here: MarcEdit_Setup.msi


Mar 03, 2009

Just an FYI, since some people ask how we go about generating MARC records from our EAD records using MarcEdit…I’ve posted a short video tutorial.  What I didn’t include was the EAD translation (it’s somewhat specific to OSU), but I’m happy to add a link to it if anyone is interested.

Anyway, you can find the video here:

And just for reference, I think going forward, as I create tutorials or clarify information in the documentation – I’ll likely upload a video to youtube and will use MarcEdit as a common tag.


Mar 02, 2009

Getting this last update out has taken a little more time than I would have liked, but I really wanted to think through some of the issues that this update raised so that the update process would be seamless. Realistically, were I versioning MarcEdit using any formal versioning process, this would likely be, at the very least, a new point release. However, I’ve already planned out my 5.5 release, and this request, while major, fell into a gray area – so I decided to keep this in the 5.1 branch. Anyway, March 2nd is a good day to officially make this release. March 2nd is my 32nd birthday, and this version of MarcEdit will be my present to the MarcEdit user community. Cheers.

Major Changes:

  1. In the continued work towards helping enterprise users, I’ve finally finished the installer update process that began with the last update.  In the previous update, MarcEdit’s installer was modified (as was MarcEdit itself) to make it more user aware.  What do I mean here?  The program was changed so that users running MarcEdit no longer needed to be administrators when running the program; instead, all configuration and mutable files were moved into the user’s Application Data directories.  This had a number of unintended benefits, like supporting multiple MarcEdit users on the same machines (using custom user profiles) and making it easier to copy configuration settings from one computer to another. 

    This update takes this one step further.  As Libraries continue to move to more sophisticated application management, I’ve been running into more users that have their software managed through a central IT source.  The IT groups manage the software by automating a process to do distributed installation.  In the past, MarcEdit’s installer really didn’t do this well.  Well, about 2 months ago, I was contacted by a large IT group on the west coast wanting to know if this would be possible.  This required finishing the migration from MarcEdit’s custom installer to the Microsoft Windows Installer – while at the same time, making sure that the program cleans up the previous install while still keeping the user’s previous settings throughout the upgrade.  After a lot of testing (over the last month, spanning multiple institutions and users in different development environments) – I feel like this is ready to go. 

    When you install the new msi installer, here is what will happen:
    1.  MarcEdit will evaluate your current installation – if you have never installed MarcEdit, it simply installs the application.
    2.  If a previous version of MarcEdit is present, the program will copy, in order: config. data in the User Application directory, data in the MarcEdit Program directory – and then silently uninstall the previous version of MarcEdit.  Once the previous version has been removed, the installer will then install the new version of the Application. 
    3.  Part of the clean up process of the new install is to move the copied user data back into scope of the application.

    So what does this mean to you?  Well, if you are an individual user, who manages MarcEdit on your own machine, very little.  The one benefit that you will likely see from this migration is the eventual development of an automated updater.  The msi installer provides a number of very powerful and integrated functions that I should be able to leverage to potentially create an unmediated upgrade process for users.  For enterprise users however, this change will for the first time give your IT administrators the ability to install MarcEdit on multiple machines simply by using their enterprise software management system.  For colleges and universities that manage thousands of users, this should be a really big win.

  2. Z39.50 Changes:  This will be an incremental process, but for the first time, MarcEdit will allow users to query multiple databases during a Z39.50 lookup.  This will allow users to query multiple Z39.50 targets to return data about a search.  This initial implementation allows multiple searching to be done in the Single Search mode.  In a planned future update, this will be extended to the batch search mode, with rules regarding how to disambiguate duplicate records (for example, the ability to accept records from one target over another, etc.).  So how does this work?  Essentially, when you enter the single search, you select Select database and then select multiple items.  Up front, this is limited to 3 databases, but that limit will eventually be removed (especially as I get UI feedback).  When you select multiple databases, the Single search screen changes to look like the below…

    Do you see the data in the red box?  This is how you can see what resources MarcEdit will be querying.  Also, see the green box.  You can see here that MarcEdit’s Z39.50 results list has changed slightly to let users see what institution each record is from.

  3. Help File is now local again:  Sadly, some topics are already slightly out of date (the Z39.50 info, for example, doesn’t represent the multiple querying functionality) – but this makes the help available both online and offline.  The online help will always be more up to date, but the local help will be updated on each build.
  4. Youtube tutorials.  If you go to Youtube and look for marcedit, you will find a series of tutorials related to MarcEdit topics.  At present, you will find topics for:
       1.  Breaking your file
       2.  Making your file
       3.  Editing a MARC file
       4.  Converting a file’s character set
       5.  Adding a new XML Function
       6.  Updating a current XML Function
       7.  Using the Delimited Text Wizard
       8.  Extracting a subset of records from a larger set
       9.  Using the Z39.50 Client
      10.  Harvesting OAI data into MARC
      11.  Managing Plug-ins through the plug-in editor.
  5. OCLC Connexion Client Plugin has been updated (this was needed due to updates in a few other components).  If you use the Connexion Client plugin, you will need to update this plugin once you update. 
  6. MarcEditor update to better support UTF-8 data loading.  In layman’s terms, here’s what has changed.  In previous versions of MarcEdit, loading UTF8 data into the MarcEditor would sometimes cause the process to load slowly.  The reason this occurred had to do with the way that the specific Windows component that I was using handled text.  I’ve updated MarcEdit so that data loaded into the MarcEditor now uses a new editing component, one that natively handles UTF8 data.  The lag time that users previously experienced should no longer occur.  In addition to this fix, the MarcEditor’s memory footprint has been reduced.  Not drastically, but a bit.  One thing to remember when loading data into the MarcEditor: there is roughly a 4-to-1 memory usage when loading bytes into a visual interface in Windows.  So, for example, load a 20 MB file, and Windows will allocate ~80 MB of memory to view the file.  Open a 120 MB file, and Windows will need to allocate ~480 MB to render the file.  The new Editor is able to reclaim some of this memory on the high end, but this is, in part, how visual interfaces work. 

    Also, note that there is a 1/2 GB limit on data loaded into the Editor, but data of any size can still be edited in the editor if one makes use of the Preview mode.

  7. Yaz Update – Previous versions of MarcEdit used Yaz 1.+ because it was small and fast.  I had need for some of the enhanced functionality, so I’ve updated the version of Yaz used to 3.+.
  8. ‡ Proof of concept Plug-in:  As noted in a recent post, while attending code4lib this week, the folks at LibLime demonstrated their new ‡ platform.  For those that haven’t heard, the ‡ platform is an attempt to create a large, shared, Open Data repository of bibliographic metadata.  What I find most interesting about LibLime’s effort has been the development of an open API to provide push/pull functionality into the database.  In theory, this allows library developers the ability to develop tools around the ‡ platform.  The plug-in demonstrates how this works, as well as providing folks that want to work with the ‡ platform a way to integrate their workflow with MarcEdit.  You can see the Youtube video talking about how it works, here:
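As a quick back-of-the-envelope helper for the memory numbers in the MarcEditor item above, here is a small Python sketch of the rough 4-to-1 rule. The function names are made up, the ratio is only the approximation given above, and reading the 1/2 GB loading limit as 512 MB is my assumption.

```python
MEMORY_RATIO = 4          # roughly 4 bytes of memory per byte of file displayed
EDITOR_LIMIT_MB = 512     # the 1/2 GB loading limit, taken here as 512 MB (assumption)


def estimated_display_memory_mb(file_size_mb):
    """Rough estimate of the memory Windows allocates to display a file."""
    return file_size_mb * MEMORY_RATIO


def needs_preview_mode(file_size_mb):
    """Files above the loading limit would have to be edited through Preview mode."""
    return file_size_mb > EDITOR_LIMIT_MB


for size in (20, 120, 600):
    print(size, estimated_display_memory_mb(size), needs_preview_mode(size))
```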

Minor Changes:

  1. UI changes to the Z39.50 (to accommodate the changes in functionality)
  2. Extended regular expression support in the Replace function so that regular expressions can be run over multiple lines.
  3. Updated workflow – when converting data from UTF-8 to MARC8 using the MarcBreaker, the 9th byte in the leader wasn’t set correctly.  This was partly because the previous workflow assumed moving in the other direction.  Since this caused problems with some loaders, it’s been corrected.
  4. Updated the Marc21XML xslt function to accommodate the following:
         a.  Up to 9 indicators (per UniMARC)
         b.  Ability for indicators to be mixed.  The current version assumes indicator order, the update allows indicators to appear in whatever order.
  5. Other minor changes
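For the leader fix in item 3 above, the idea can be sketched in Python. In MARC 21, leader position 09 (the character coding scheme) is “a” for UCS/Unicode records and blank for MARC-8 records. I’m assuming that “the 9th byte” refers to that position; the function name and the sample leader are made up for illustration, and this is not MarcEdit’s code.

```python
def set_leader_encoding(leader, utf8):
    """Return the leader with position 09 set to 'a' (UTF-8) or blank (MARC-8)."""
    if len(leader) != 24:
        raise ValueError("a MARC leader must be exactly 24 characters long")
    flag = "a" if utf8 else " "
    return leader[:9] + flag + leader[10:]


# Made-up leader for a record currently flagged as Unicode.
utf8_leader = "00000nam a2200000 a 4500"

marc8_leader = set_leader_encoding(utf8_leader, utf8=False)
print(repr(marc8_leader))
```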

You can pick up the update at: MarcEdit_Setup.msi.

If you run into any problems, please give me a holler.