Apr 11, 2015
 

This all started with a conversation on Twitter (https://twitter.com/_whitni/status/583603374320410626) about a week ago: a discussion about why the current version of MarcEdit is so fragile when run on a Mac.  The short answer is that MarcEdit utilizes a cross-platform toolset when building the UI, which works well on Linux and Windows systems but tends to be less refined on Mac systems.  I’ve known this for a while, but to really do it right, I’d need to develop a version of MarcEdit that uses native Mac APIs, which would mean building a new version of MarcEdit for the Mac (at least, the UI components).  And I’ve considered it – mapped out a road map – but what’s constantly stopped me has been a lack of interest from the MarcEdit community and the lack of a Mac system.  On the community side, I can count on two hands the number of times I’ve had someone request a version of MarcEdit specifically for a Mac.  And since I’ve been making a Mac App version of MarcEdit available, its use has been fairly low (though this could be due to the struggles noted above).  With an active community of over 20,000, I try to put my time where it will make the most impact, and up until a week ago, better support for Mac systems didn’t seem to be high on the list.  The second reason is that I don’t own a Mac.  My technology stack is made up of about a dozen Windows and Linux systems embedded around my house because they play surprisingly well together, whereas Apple’s walled garden just doesn’t thrive within my ecosystem.  So, I’ve been waiting and hoping that the cross-platform toolset would get better and that, in time, this problem would eventually go away.

I’m giving that background because apparently I’ve been misreading the MarcEdit community.  As I said, this all started with this conversation on Twitter (https://twitter.com/_whitni/status/583603374320410626) – and out of that, two co-conspirators, Whitni Watkins and Francis Kayiwa, set out to see just how much interest there actually was in having a dedicated version of MarcEdit for the Mac.  The two set out to see if they could raise funds to acquire a Mac to do this development and, indirectly, demonstrate that there was actually a much larger slice of the community interested in seeing this work done.  And, so, off they went – and I sat back and watched.  I made a conscious decision that if this was going to happen, it was going to be because the community wanted it, and in that, my voice wasn’t necessary.  And after 8 days, it’s done.  In all, 40 individuals contributed to the campaign, but more importantly to me, I heard directly from more than 200 individuals who were hopeful that this project would proceed.

Development Roadmap

Now the hard work starts.  MarcEdit is a program that has been under constant development since 1999 – so even just rewriting the UI components of the application will be a significant undertaking.  So, I’m breaking up this work into chunks.  I figure it would take approximately 8-12 months to completely port the UI, which is a long time.  Too long…so I’m breaking the development into 3-month “sprints”.  The first sprint will target the 80%: the functionality that would make MarcEdit productive when doing MARC editing.  This means porting the functionality for all the resources found in the MARC Tools and much of the functionality found in the MarcEditor components.  My guess is these two components are the most important functional areas for catalogers – so finishing those would allow the tool to be immediately useful for doing production cataloging and editing.  After that – I’ll be able to evaluate the remainder of the program and begin working on functional parity between all versions of the application.

But I’ll admit, at this point, the road map is still somewhat cloudy, even to me.  See, I’ve written up the following document (http://1drv.ms/1ake4gO) and shared it with Whitni and asked her to work with other Mac users to refine the list and let me know what falls into that 80%.  So, I’ll be interested to see where their list differs from my own.  In the meantime, I’ll be starting work on the port – creating wireframes and spending time over the next week hitting the books and familiarizing myself with Apple’s API docs and the UI best practices (though I will be trying to keep the program looking very familiar to the current application – best practices be damned).  Coding on the new UI will start in earnest around May 1 – and by August 1, 2015, I hope to have the first version built specifically for a Mac available.  For those interested in following the development process – I’ll be creating a build page on the MarcEdit website (http://marcedit.reeset.net) and will be posting regular builds as new areas of the application are ported so that folks can try them and give feedback.

So, that’s where this stands at this point.  For those interested in providing feedback, feel free to contact me directly at reeset@gmail.com.  And for those of you that reached out or participated in the campaign to make this happen, my sincere thanks.

–TR

Mar 31, 2015
 

The MarcEdit 101 Webinar Series was created over the course of multiple months for the CARLI (http://www.carli.illinois.edu/) consortium in Spring 2015.  In late March 2015, CARLI reached out to me and requested that these webinars be made available to the larger MarcEdit community, so if you find these webinars useful, please reach out and thank the folks at CARLI.

A couple of notes: these webinars are being made available as is, save for the following modifications:

  1. Attendee names have been anonymized.  While I’m certain most attendees would have no problem with their names showing up in these webinar lists, the original intended audience was locally scoped to CARLI and its members.  Masking attendees was done primarily because of this change of scope.
  2. The Q/A at the end of the sessions has generally been removed from the webinars.  Again, these are localized webinars, and questions asked during the webinars tend to be within the scope of this consortium.

I’ll be making these videos available over the next couple of months.  Again, if you find these webinars useful, please make sure you let the folks at CARLI know.

Series URL: http://marcedit.reeset.net/marcedit-101-workshop

–TR

Mar 17, 2015
 

List of changes below:

** Bug Fix: Delimited Text Translator: Constant data, when used on a field that doesn’t exist, is not applied.  This has been corrected.
** Bug Fix: Swap Field Function: Swapping control field data (fields below 010) using position + length syntax (example 35:3 to take 3 bytes, starting at position 35) was not functioning.  This has been corrected.
** Enhancement: RDA Helper: Checked options are now remembered.
** Bug Fix: RDA Helper: Abbreviation Expansion timing was moved in the last update.  Moved back to ensure expansion happens prior to data being converted.
** Enhancement: Validator error message refinement.
** Enhancement: RDA Helper Abbreviation Mapping table was updated.
** Enhancement: MarcEditor Print Records Per Page — program will print one bib record (or bib record + holdings records + authority records) per page.
** Bug Fix: Preferences Window: If MarcEdit attempts to process a font that isn’t compatible, then an error may be thrown.  A new error trap has been added to prevent this error.
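
For anyone unsure how the position + length syntax from the Swap Field fix reads, here is a minimal sketch in Python.  It is not MarcEdit’s code; the function name and the sample 008 value are made up for illustration, and it assumes positions are counted from zero, as they are in MARC 008 documentation.

```python
def extract_bytes(control_data, spec):
    """Return the slice described by a 'position:length' spec such as '35:3'."""
    position_text, length_text = spec.split(":")
    position, length = int(position_text), int(length_text)
    # Assumption: positions are zero-based, so 35:3 covers positions 35-37.
    return control_data[position:position + length]


# Hypothetical 40-character 008 field with a language code at positions 35-37.
field_008 = "150317s2015    xxu".ljust(35) + "eng d"

print(extract_bytes(field_008, "35:3"))
```

With this sample value, 35:3 picks out the three-character language code.  A position past the end of the data simply yields an empty string in this sketch.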

You can get the new download either through MarcEdit’s automatic update tool, or by downloading the program directly from: http://marcedit.reeset.net/downloads

–tr

 Posted at 7:20 pm
Mar 11, 2015
 

A question that comes up occasionally is the need to be able to conditionally add or replace a set of character data within a MARC Field.  For example, consider this use case:

I’d like to add a period to the end of a field (like say, the 650 field), but only under the following conditions:

  1. The field ends with a word (a-z) character.
  2. The field doesn’t already end in a period or parenthesis.
  3. If the field ends with any other punctuation, that value is replaced with a period.

Meeting conditions 1 and 2 is easy and straightforward.  For those, I’d probably do something like this:

Find: (=650.*[^.)])$
Replace With: $1.

This allows MarcEdit to match any 650 line that doesn’t end in a period or a parenthesis.  However, the third condition (replacing any other trailing punctuation) makes this more difficult.  In C#’s implementation of regular expressions, you can use substitutions and conditional matching to achieve the above result.  Consider the following data:

=650  \6$aMusique populaire$zQuébec (Province)$y1951-1960.
=650  \6$aMusique populaire$zQuébec (Province)$y1961-1970,
=650  \6$aMusique populaire$zQuébec (Province)$y1961-1970)
=650  \6$aMusique populaire$zQuébec (Province)$y1961-1970;
=650  \6$aMusique populaire$zQuébec (Province)$y1961-1970
=650  \6$aMusique populaire$zQuébec

Using the above criteria, I’d like to be able to run a process that will turn the comma in line 2 into a period, turn the semicolon in line 4 into a period, and add a period to the end of lines 5 and 6.  To do this, you’d set up a substitution.

Find: ((?<one>=650.*[\w])|(?<one>=650.*)(?<two>[^.)]))$
Replace With: ${one}.

So what exactly is happening here?  In .NET regular expressions, you can name capture groups and then refer to them by name in the replacement.  In this case, we create a conditional using an ‘or’ clause, using the same group name (one) in each branch of the clause for the data we want to keep.  In the second branch, the trailing punctuation that should be replaced is pushed out into its own group (two), so it is matched but left out of the replacement.  Now we have isolated the data we want to keep, and a single replacement statement can append the period to it in either case.  Using the above, you will receive the following output:

=650  \6$aMusique populaire$zQuébec (Province)$y1951-1960.
=650  \6$aMusique populaire$zQuébec (Province)$y1961-1970.
=650  \6$aMusique populaire$zQuébec (Province)$y1961-1970)
=650  \6$aMusique populaire$zQuébec (Province)$y1961-1970.
=650  \6$aMusique populaire$zQuébec (Province)$y1961-1970.
=650  \6$aMusique populaire$zQuébec.
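
If you want to try the same idea outside of MarcEdit, here is a rough equivalent in Python.  This is only a sketch, not what MarcEdit runs: Python’s re module does not accept the same group name twice the way .NET does, so it uses two hypothetical names (keep_a, keep_b) and a small replacement function that picks whichever branch matched.

```python
import re

# Two branches, mirroring the 'or' clause in the .NET expression.
PATTERN = re.compile(
    r"^(?P<keep_a>=650.*\w)$"       # line ends in a word character: keep all of it
    r"|^(?P<keep_b>=650.*)[^.)]$",  # line ends in other punctuation: drop that character
    re.MULTILINE,
)


def add_terminal_period(text):
    """Append a period to 650 lines unless they already end in '.' or ')'."""
    return PATTERN.sub(
        lambda m: (m.group("keep_a") or m.group("keep_b")) + ".", text
    )


records = "\n".join([
    r"=650  \6$aMusique populaire$zQuébec (Province)$y1951-1960.",
    r"=650  \6$aMusique populaire$zQuébec (Province)$y1961-1970,",
    r"=650  \6$aMusique populaire$zQuébec (Province)$y1961-1970)",
    r"=650  \6$aMusique populaire$zQuébec (Province)$y1961-1970;",
    r"=650  \6$aMusique populaire$zQuébec (Province)$y1961-1970",
    r"=650  \6$aMusique populaire$zQuébec",
])

print(add_terminal_period(records))
```

The first branch is tried first, so a line ending in a letter or digit is kept whole; only when that fails does the second branch strip the final punctuation character.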

Obviously, the above is a fairly simple example – but the concept can be applied to much more complicated workflows.  If you are interested in reading more about the Regular Expression implementation used in MarcEdit, please see: https://msdn.microsoft.com/en-us/library/vstudio/az24scfc(v=vs.100).aspx.

Questions, let me know.

–tr

 Posted at 9:19 am
Feb 23, 2015
 

A new version of MarcEdit has been made available.  The update includes the following changes:

  • Bug Fix: Export Tab Delimited Records: When working with control data, if a position is requested that doesn’t exist, the process crashes.  This behavior has been changed so that a missing position results in a blank delimited field (as is the case if a field or field/subfield isn’t present).
  • Bug Fix: Task List — Corrected a couple reported issues related to display and editing of tasks.
  • Enhancement: RDA Helper: Abbreviations have been updated so that users can select the fields in which abbreviation expansion occurs.
  • Enhancement: Linked Data Tool — I’ve vastly improved the process by which items are linked. 
  • Enhancement: Improved VIAF Linking: thanks to Ralph LeVan for pointing me in the right direction to get more precise matching.
  • Enhancement: Linked Data Tool — I’ve added the ability to select the index from VIAF to link to.  By default, LC (NACO) is selected.
  • Enhancement: Task Lists — Added the Linked Data Tool to the Task Lists
  • Enhancement: MarcEditor — Added the Linked Data Tool as a new function.
  • Improvements: Validate ISBNs: Added some performance enhancements and finished working on some code that should make it easier to begin checking remote services to see if an ISBN is not just valid (structurally) but actually assigned.
  • Enhancement: Linked Data Component — I’ve separated out the linked data logic into a new MarcEdit component.  This is being done so that I can work on exposing the API for anyone interested in using it.
  • Informational: Current version of MarcEdit has been tested against MONO 3.12.0 for Linux and Mac.

Linked Data Tool Improvements:

A couple specific notes of interest around the linked data tool.  First, over the past few weeks, I’ve been collecting instances where id.loc.gov and VIAF have been returning results that were not optimal.  On the VIAF side, some of that was related to the indexes being queried, and some of it relates to how queries are made and executed.  I’ve done a fair bit of work and added some additional data checks to ensure that links occur correctly.  At the same time, there is one known issue that I wasn’t able to correct while working with id.loc.gov, and that is around deprecated headings.  id.loc.gov currently provides no information within any metadata provided through the service that relates a deprecated item to the current preferred heading.  This is something I’m waiting for LC to correct.

To improve the Linked Data Tool, I’ve added the ability to query by specific index.  By default, the tool queries LC (NACO), but users can select from a wide range of vocabularies (including querying all the vocabularies at once).  The new screen for the Linked Data tool looks like the following:

image

In addition to the changes to the Linked Data Tool – I’ve also integrated the Linked Data Tool with the MarcEditor:

image

And within the Task Manager:

image

The idea behind these improvements is to allow users the ability to integrate data linking into normal cataloging workflows – or at least start testing how these changes might impact local workflows.

Downloads:

You can download the current version by utilizing MarcEdit’s automatic update within the Help menu, or by going to: http://marcedit.reeset.net/downloads.html and downloading the current version.

–tr

 Posted at 9:38 pm
Feb 02, 2015
 

This MarcEdit update includes a couple fixes and an enhancement to one of the new validation components.  Updates include:

** Bug Fix: Task Manager: When selecting the Edit Subfield function, once the delete subfield checkbox is selected and saved, you cannot reopen the task to edit.  This has been corrected.
** Bug Fix: Validate ISBNs: When processing ISBNs, validation appeared to be working incorrectly.  This has been corrected.  The ISBN validator now automatically validates $a and $z of any field specified.
** Enhancement: Validate ISBNs: When selecting the field to validate — if just the field is entered, the program automatically examines the $a and $z.  However, you can specify a specific field and subfield for validation. 

 

Validate ISBNs

This is a new function (as of the last update) that utilizes the ISBN check digit formula to examine the ISBN and determine if the number is mathematically correct.  As I work into the future, I’ll add functionality to enable users to ensure that the ISBN is actually in use and linked to the item referenced in the record.  To use the function, open the MarcEditor, select the Reports menu, and then Validate ISBNs.

image

Once selected, you will be asked to specify a field or field and subfield to process.  If just the field is selected, the program will automatically evaluate the $a and $z if present.  If the field and subfield are specified, the program will only evaluate the specified subfield.

image

When run, the program will output any ISBN fields that cannot be mathematically validated.

image
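
For those curious about what “mathematically validated” means here, the following is a minimal Python sketch of the standard ISBN-10 and ISBN-13 check digit tests.  It is not MarcEdit’s code, and the function name is made up for illustration.  It assumes the value has already been reduced to the number itself (hyphens and spaces are tolerated, but qualifiers such as “(pbk.)” are not handled).

```python
DIGITS = "0123456789"


def is_valid_isbn(raw):
    """Check only the check digit; this says nothing about whether the ISBN is assigned."""
    isbn = raw.replace("-", "").replace(" ", "").upper()

    if len(isbn) == 10:
        body, check = isbn[:9], isbn[9]
        if not all(c in DIGITS for c in body) or check not in DIGITS + "X":
            return False
        values = [int(c) for c in body] + [10 if check == "X" else int(check)]
        # ISBN-10: weights run 10 down to 1; the total must be divisible by 11.
        total = sum(w * v for w, v in zip(range(10, 0, -1), values))
        return total % 11 == 0

    if len(isbn) == 13:
        if not all(c in DIGITS for c in isbn):
            return False
        # ISBN-13: weights alternate 1, 3; the total must be divisible by 10.
        total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(isbn))
        return total % 10 == 0

    return False


for candidate in ["0306406152", "9780306406157", "080442957X", "0306406153"]:
    print(candidate, is_valid_isbn(candidate))
```

The first three sample numbers pass the check and the last one fails, since its final digit has been changed.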

 

To get the update, utilize the automated update utility or go to http://marcedit.reeset.net/downloads to get the current download.

–tr

 Posted at 10:04 pm
Dec 24, 2014
 

Happy Christmas! It is my sincere wish that everyone reading this is having, or has had, a wonderful time with family and good friends over the holiday season. This year marks the second year that I’ve been away from my family – my parents, my brothers, my in-laws – all still in Oregon. It’s the hardest time to be away from family; we have always been close and while my wife and kids have our own Christmas traditions, we’ve always found time around the holidays to be together as an extended family. But this second year in Ohio has been very different than the first. Last year, we were still trying to settle into our new community, make new friends, and absorb the Midwest culture (which is very different than the west coast). Ohio had become home, and yet, it wasn’t.

This year has been different. We’ve made good friends, our kids have found a place to fit in; we’ve bought a house and are putting down roots. Ohio State University continues to be a place with challenges and opportunities to learn and grow – but more importantly, it has become a place not just with colleagues that I respect and continue to learn from, but a place where friendships have been made. When my older son had a bit of a health scare, it was my community at Ohio State and the friends we’ve made in our neighborhood that helped to provide immediate support, and continue to support us. As I look back on 2014 and all the wonderful friends and adventures that we’ve had in our new adopted state, I realize just how fortunate and blessed my family has been to find a community, job, and friends that have just fit.

This last year has also seen the continued growth of both MarcEdit and its user community. On the application side, this year saw the release of MarcEdit 6, the MARCNext tool kit, integration with OCLC’s WorldCat, new language tools, automation tools, etc.  The user community…well, I’m consistently amazed by the large and diverse user community that has grown up around something that I really made available with the hope that maybe just one other person might find it useful. This is a great community, and I’m always humbled by the kindness and helpfulness displayed. I’m told often how much people appreciate this work. Well, I appreciate you as well. I have always appreciated the opportunity to work with so many interesting people on projects and problems that potentially can have lasting impacts. It has been, and always will be, one of my great pleasures.

On to the update….in what has become a tradition, I’m releasing the MarcEdit Christmas update. I’d already provided a little bit of information related to what was changing in a previous blog post: http://blog.reeset.net/archives/1632 – but I’m including the full list below:

 Changes:

  • Enhancement: MARCCompare: Added options to allow users to define colors for added and deleted content.
  • Enhancement: MARCCompare: Added options to support automatic sorting of data prior to comparison. Users can define the field for sorting (default is the 001)
  • Enhancement: MARCEngine: Improved support for automated conversion of NRC notation in UTF8 data to ensure proper representation of UTF8 characters.
  • Modified Behavior: Automated Update: Previously, MarcEdit would check for an update every time the application was run. If an update had occurred, the program would prompt the user for action. If the user cancelled the action, the program would re-prompt the user each time the program was started. Because many users work in environments where their updates are managed by central IT staff, this constant re-prompting was problematic. Often, it would lead to users simply disabling update notification.   To make this more user friendly, the new behavior works as follows: When the program determines an update has been made, the program will prompt the user. If the user takes no action, the program will no longer prompt for action, but instead will provide an icon denoting the presence of an update in the lower right corner, next to the Preferences shortcut.
  • Enhancement: Link Identifiers Tool: I’ve added support for MESH headings through the use of their beta SPARQL end-point. Records run through the linking tool with identified MESH headings will automatically be resolved against the NLM database.
  • Enhancement: SPARQL Browser: This was described in blog post: http://blog.reeset.net/archives/1632, but this is a new tool added to the MARCNext toolkit.
  • Enhancement: RDF Toolkit: In building the SPARQL Browser, I integrated a new RDF toolkit into MarcEdit. At this point, the SPARQL Browser is the only resource making use of its functionality – but I anticipate that this new functionality will be utilized for a variety of other functions as the cataloging community continues to explore new metadata models.
  • Bug Fix: Diacritic insertion via intellisense:  When typing a diacritic, selecting it by double clicking on the value would result in the file scrolling to the top.  The program now resets to the cursor position.  When the user just clicked enter to select the value, a new line was inserted behind the diacritic mnemonic — both of these have been fixed.
  • Bug Fix: Mac Threading issues: One of the things that came to my attention on the last update is that Mono has some issues when generating system dialog boxes within separate threads. It appears that the new garbage collector in Mono may be sweeping object pointers prematurely. The easy solution is to remove the need to generate these system messages or move them when necessary. This has been done. Immediately, this corrects the issue related to MarcEdit crashing when the update message was generated.
  • Bug Fix: Mac Fonts: I was having some trouble with Mac systems not failing gracefully when the system requested a font not found on the system. In Windows and Linux implementations of Mono, the default process is to fall through the requested font family until an appropriate font is found. Under OSX, Mono’s behavior is different, with fonts not returning any value, defaulting to undefined blocks. I’ve reworked the font selection class to ensure that a fallback font is always selected on all systems – which has corrected this problem on the Mac.

The MarcEdit update is available for download from: http://marcedit.reeset.net/downloads for all systems. You may also download and update the application via the automatic update utility from within MarcEdit itself.

Again, Happy Christmas!

–tr

 Posted at 1:03 pm
Dec 19, 2014
 

Over the past couple of weeks, I’ve been working on expanding the linking services that MarcEdit can work with in order to create identifiers for controlled terms and headings.  One of the services that I’ve been experimenting with is NLM’s beta SPARQL endpoint for MESH headings.  MESH has always been something that is a bit foreign to me.  While I had been a cataloger in my past, my primary area of expertise was with geographic materials (analog and digital), as well as traditional monographic data.  While MESH looks like LCSH, it’s quite different as well.  So, I’ve been spending some time trying to learn a little more about it, while working on a process to consistently query the endpoint to retrieve the identifier for a preferred Term.  It’s been a process that’s been enlightening, but also one that has led me to think about how I might create a process that could be used beyond this simple use case, and potentially provide MarcEdit with an RDF engine that could be utilized down the road to make it easier to query, create, and update graphs.

Since MarcEdit is written in .NET, this meant looking to see what components currently exist that provide the type of RDF functionality that I may be needing down the road.  Fortunately, a number of components exist; the one I’m utilizing in MarcEdit is dotnetrdf (https://bitbucket.org/dotnetrdf/dotnetrdf/wiki/browse/).  The component provides a robust set of functionality that supports everything I want to do now, and what I expect to want to do later.

With a tool kit found, I spent some time integrating it into MarcEdit, which is never a small task.  However, the outcome will be a couple of new features to start testing out the toolkit and start providing users with the ability to become more familiar with a key semantic web technology, SPARQL.  The first new feature will be the integration of MESH as a known vocabulary that will now be queried and controlled when run through the linked data tool.  The second new feature is a SPARQL Browser.  The idea here is to give folks a tool to explore SPARQL endpoints and retrieve the data in different formats.  The proof of concept supports XML, RDFXML, HTML, CSV, Turtle, NTriple, and JSON as output formats.  This means that users can query any SPARQL endpoint and retrieve data back.  In the current proof of concept, I haven’t added the ability to save the output – but I likely will prior to releasing the Christmas MarcEdit update.

Proof of Concept

While this is still somewhat conceptual, the current SPARQL Browser looks like the following:

image

At present, the Browser assumes that data resides at a remote endpoint, but I’ll likely include the ability to load local RDF, JSON, or Turtle data and provide the ability to query that data as a local endpoint.  Anyway, right now, the Browser takes a URL to the SPARQL Endpoint, and then the query.  The user can then select the format in which the result set should be output.

Using NLM as an example, say a user wanted to query for the specific term: Congenital Abnormalities – utilizing the current proof of concept, the user would enter the following data:

SPARQL Endpoint: http://id.nlm.nih.gov/mesh/sparql

SPARQL Query:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX meshv: <http://id.nlm.nih.gov/mesh/vocab#>
PREFIX mesh: <http://id.nlm.nih.gov/mesh/>

SELECT distinct ?d ?dLabel 
FROM <http://id.nlm.nih.gov/mesh2014>
WHERE {
  ?d meshv:preferredConcept ?q .
  ?q rdfs:label 'Congenital Abnormalities' . 
  ?d rdfs:label ?dLabel . 
} 
ORDER BY ?dLabel 

Running this query within the SPARQL Browser produces a result set that is formatted internally into a Graph for output purposes.

image

image

image

The images snapshot a couple of the different output formats.  For example, the full JSON output is the following:

{
  "head": {
    "vars": [
      "d",
      "dLabel"
    ]
  },
  "results": {
    "bindings": [
      {
        "d": {
          "type": "uri",
          "value": "http://id.nlm.nih.gov/mesh/D000013"
        },
        "dLabel": {
          "type": "literal",
          "value": "Congenital Abnormalities"
        }
      }
    ]
  }
}
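
If you’d like to work with this kind of result outside of the Browser, here is a small Python sketch.  It is not how MarcEdit or dotnetrdf handle results; the function names are made up for illustration.  It builds a standard SPARQL Protocol GET request (a query parameter plus an Accept header) and flattens the JSON result format shown above into simple rows.  Whether a particular endpoint honors the Accept header is an assumption you’d need to check.

```python
import json
import urllib.parse
import urllib.request


def build_request(endpoint, query):
    """Build a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def rows_from_results(payload):
    """Flatten SPARQL JSON results into one {variable: value} dict per row."""
    document = json.loads(payload)
    variables = document["head"]["vars"]
    return [
        # A variable may be unbound in a given row, so skip missing names.
        {name: binding[name]["value"] for name in variables if name in binding}
        for binding in document["results"]["bindings"]
    ]


# The same JSON output shown above, used here as sample input.
sample = """
{"head": {"vars": ["d", "dLabel"]},
 "results": {"bindings": [
   {"d": {"type": "uri", "value": "http://id.nlm.nih.gov/mesh/D000013"},
    "dLabel": {"type": "literal", "value": "Congenital Abnormalities"}}]}}
"""

for row in rows_from_results(sample):
    print(row["d"], row["dLabel"])
```

To run this against a live endpoint, you would pass the result of build_request to urllib.request.urlopen and feed the decoded response body to rows_from_results.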

The idea behind creating this as a general purpose tool is that, in theory, it should work with any SPARQL endpoint.  For example, the same type of exploration can be done against the Project Gutenberg Metadata endpoint utilizing the Browser.

image

Future Work

At this point, the SPARQL Browser represents a proof of concept tool, but one that I will make available as part of the MARCNext research toolset:

image

The Browser will ship as part of the next update.  Going forward, I will likely refine the Browser based on feedback, but more importantly, start looking at how the new RDF toolkit might allow for the development of dynamic form generation for editing RDF/BibFrame data…at least somewhere down the road.

–TR

[1] SPARQL (W3C): http://www.w3.org/TR/rdf-sparql-query/
[2] SPARQL (Wikipedia): http://en.wikipedia.org/wiki/SPARQL
[3] SPARQL Endpoints: http://www.w3.org/wiki/SparqlEndpoints
[4] MarcEdit: http://marcedit.reeset.net
[5] MARCNext: http://blog.reeset.net/archives/1359

Nov 29, 2014
 

While experimenting with doing automatic language translation using the Microsoft Translation API, I got a couple of questions from users asking if this same process could be applied to doing automatic field translation to create localized searching indexes of subject terms.  The specific use case proposed was the generation of a single 653 that included automated translations of the 650$a.  Since this is likely a pretty specific use case with a limited audience, I’ve created this process as a plug-in.  If you are interested in seeing how this works, please see the following video:

If you have questions, let me know.

 

–tr

 Posted at 9:27 am
Oct 16, 2014
 

As libraries begin to join and participate in systems to test Bibframe principles, my hope is that, when possible, I can provide support through MarcEdit to give these communities a conduit that simplifies publishing information into those systems.  The first of these test systems is the Libhub Initiative, and working with Eric Miller and the really smart folks at Zepheira (http://zepheira.com/), I have created a plug-in specifically for libraries and partners working with the LibHub initiative.  The plug-in provides a mechanism to publish a variety of metadata formats into the system, including MARC, MARCXML, EAD, and MODS data.  The process will hopefully help users contribute content and help spur discussion around the data model Zepheira is employing with this initiative.

For the time being, the plug-in is private, and available to any library currently participating in the LibHub project.  However, my understanding is that as they continue to ramp up the system, the plugin will be made available to the general community at large.

For now, I’ve published a video talking about the plug-in and demonstrating how it works.  If you are interested, you can view the video on YouTube.

 

–tr

 Posted at 8:19 pm