MarcEdit 5.9 – OCLC Integration: Working with Local Bibliographic Data Records

By reeset / On / In MarcEdit

Since OCLC made their Metadata APIs available, I’ve spent a good deal of time putting them through their paces, looking for use cases that these particular APIs might solve and whether (and how) MarcEdit might be an appropriate platform for taking advantage of some of this new functionality.  Looking over the new Metadata API, it’s pretty easy to see where the focus was: providing some capacity for read/write access to the WorldCat database, primarily the global and local bibliographic data.  In addition, the API provides the ability to set institutional holdings on records, though not to work with Local Holdings Records.

For that reason, the first round of MarcEdit development utilizing these APIs was focused primarily on working with the master bibliographic and institutional holdings records.  These are two areas where I tend to receive regular queries from users who are in the midst of reclamation or weeding projects and need to make changes to hundreds or thousands of records in the WorldCat database.  The new APIs allowed me to provide an answer to the two most common questions: 1) can I batch update/delete holdings on particular OCLC records, and 2) can I batch upload records to OCLC?  While I’d definitely argue (and I think OCLC probably would agree) that these APIs are not designed with batch operations in mind (i.e., they are slow), they finally offer a way to communicate with the WorldCat database in real time, and regardless of performance, this is a big win for libraries that choose to work with OCLC.  If you are interested in how this initial work was implemented, you can read about it here: http://blog.reeset.net/archives/1245

After releasing the first MarcEdit builds and making subsequent refinements to the integration work, I’ve had the opportunity to speak with more OCLC WMS users and individuals that make use of the Local Bibliographic Data files and decided that it was time to start working on adding support for this type of data.

The challenge with working with Local Bibliographic Data records is that there is no way (at least through the API) to know if a record has a local bibliographic record attached without making a set of special calls to the API about a specific OCLC number.  This means that, for all intents and purposes, the process of working with local bibliographic data in MarcEdit assumes one of two things:

  1. That the user knows what data has these records
  2. That the user is likely creating new ones.

Secondly, I’ve tried to integrate this new functionality into the existing tools developed for use with the other OCLC Metadata APIs.  So for example, users looking to query a set of records and retrieve the attached local bibliographic records now will see a new option on the OCLC Search box that denotes if the master or local bibliographic record should be extracted.

image

The problem is that, at this point, MarcEdit doesn’t know if a local bibliographic record actually exists.  Certainly, the tool could pre-query every OCLC number returned as part of a search, but that approach multiplies the back-and-forth communication between MarcEdit and OCLC via the API, and remember, this isn’t an API that feels designed for batch operations.  So, rather than pre-query, MarcEdit provides an option to download the local bibliographic record if it is present.  If this checkbox isn’t selected, the program will download the master bibliographic record for edit.  For users looking to create a local bibliographic record, the process is the same.  Local bibliographic records must be attached to a master OCLC number, so a user would query a record, check the option to download the local bibliographic record, and then attempt the download.

When downloading a local bibliographic file, MarcEdit will prompt users, asking if the tool should automatically generate a local bibliographic file for edit in MarcEdit if one doesn’t exist on a master record.

image
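The retrieval behavior described above can be sketched roughly as follows. This is only an illustration of the decision flow, not MarcEdit's actual code: the function name and the three callables (`fetch_master`, `fetch_local`, `generate_local`) are hypothetical stand-ins for whatever performs the API calls and the template generation.

```python
from typing import Callable, Optional


def retrieve_for_edit(
    oclc_number: str,
    want_local: bool,
    generate_if_missing: bool,
    fetch_master: Callable[[str], str],
    fetch_local: Callable[[str], Optional[str]],
    generate_local: Callable[[str], str],
) -> Optional[str]:
    """Return the record to edit for one OCLC number.

    All callables are hypothetical placeholders; fetch_local is assumed to
    return None when no local bibliographic record is attached.
    """
    if not want_local:
        # Checkbox not selected: download the master bibliographic record.
        return fetch_master(oclc_number)

    # No pre-query: simply attempt the local download once.
    record = fetch_local(oclc_number)
    if record is not None:
        return record

    # Nothing attached: optionally build a new local record for editing.
    if generate_if_missing:
        return generate_local(oclc_number)
    return None
```

The point of the sketch is that each OCLC number costs one request in the common case, rather than a lookup followed by a download.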

To generate new records, MarcEdit utilizes a new template within the application – local_bib_record_template.mot.  The template contains the following data:

=LDR  00000n   a2200000   4500
=004  [OCLC_NUM]
=935  \\$a[SYSTEM_GENERATED]
=940  \\$a[OCLC_CODE]

This template looks slightly different from a normal MarcEdit template, in that it includes a number of data points that MarcEdit will automatically generate as part of the record generation process.  This is necessary because specific data must be present in order for a local bibliographic record to be valid.  For example, a local bibliographic record must have the OCLC number that it’s attached to, which is found in the 004.  The 935 holds a local system-generated number, which MarcEdit generates as a timestamp indicating record creation time.  Finally, the 940 includes the organizational code, the code that normally would appear in the 040 of a bibliographic record.  This information is stored and used as part of the API profile, so MarcEdit includes that data in the record generation process.
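To make the substitution concrete, here is a minimal sketch of how the three placeholders in the template might be filled in. It is not MarcEdit's implementation: the function name is hypothetical, and the timestamp format (`YYYYMMDDHHMMSS`) is an assumption, since the post only says the 935 is a timestamp.

```python
from datetime import datetime, timezone

# The template text from local_bib_record_template.mot, as shown above.
TEMPLATE = r"""=LDR  00000n   a2200000   4500
=004  [OCLC_NUM]
=935  \\$a[SYSTEM_GENERATED]
=940  \\$a[OCLC_CODE]"""


def build_local_bib_record(oclc_number, oclc_code, now=None):
    """Fill the template placeholders for one master OCLC number.

    The timestamp format is an assumption made for illustration.
    """
    now = now or datetime.now(timezone.utc)
    system_number = now.strftime("%Y%m%d%H%M%S")
    return (
        TEMPLATE
        .replace("[OCLC_NUM]", oclc_number)
        .replace("[SYSTEM_GENERATED]", system_number)
        .replace("[OCLC_CODE]", oclc_code)
    )


# Example with placeholder values (not a real record or institution code).
print(build_local_bib_record("12345678", "XXX"))
```

The OCLC number comes from the record the user queried, and the institution code would come from the stored API profile.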

A local bibliographic record is in many ways like the master bibliographic record, just with data only applicable to an institution.  OCLC has a set of documentation around what kinds of data can be stored in these records.  Using a test record in WorldCat, I extracted the following example:

image

In this record, the data breaks down in the following way:

  • 001 – this is the unique LBD control number
  • 004 – this is the OCLC number for the master bibliographic record that this local record is attached to
  • 005 – this is the transaction date; this data must match when updating records, as OCLC uses this as a point of validation.
  • 500, 790 – in this record, these are the local data fields…data that is separate from the master bibliographic record and visible only to my users who see our local bibliographic data.
  • 935 – this is a locally defined system number – when MarcEdit generates this number, it is a system timestamp.
  • 940 – this is the institution code.

As one can see, a local bibliographic record can be fairly brief (or verbose depending on the notes added to the record).
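As a rough illustration of the field breakdown above, the sketch below splits a record in MarcEdit's mnemonic format into tags and checks that expected fields are present before a record is sent anywhere. The function names are hypothetical, the sample record uses placeholder values rather than real data, and the default set of required tags is an assumption based on the template described in this post, not an OCLC rule.

```python
def parse_mnemonic(record_text):
    """Split mnemonic lines ('=TAG  data') into (tag, data) pairs."""
    fields = []
    for line in record_text.splitlines():
        if not line.startswith("="):
            continue
        # '=' + three-character tag + two spaces, then the field data.
        fields.append((line[1:4], line[6:]))
    return fields


def missing_fields(record_text, required=("004", "935", "940")):
    """Return the required tags that do not appear in the record."""
    present = {tag for tag, _ in parse_mnemonic(record_text)}
    return [tag for tag in required if tag not in present]


# Placeholder record for illustration only.
sample = r"""=LDR  00000n   a2200000   4500
=001  000000001
=004  12345678
=005  20140301120000.0
=500  \\$aLocal note (placeholder).
=935  \\$a20140301120000
=940  \\$aXXX"""

# An existing record being updated would also carry the 001 and 005.
print(missing_fields(sample, ("001", "004", "005", "935", "940")))
```

A check like this only confirms that the tags exist; it says nothing about whether OCLC will accept the values.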

To update, create, or delete local bibliographic data (single record or batch), a user starts with a local bibliographic record.  Even delete operations require the record; you cannot just pass the LBD control number or a list of OCLC numbers, at least at this point.  The API requires the record to be sent through the service as a type of validation.

Updating this data requires using the Add/Delete/Update Local Bibliographic Data option:

image

Selecting this option will open the following window:

image

When dealing with local bibliographic data, the Process as Local Bibliographic Records option will be selected, and the option to delete those records is also presented.  Users not working with local bibliographic data will see the same dialog, but the Process as Local Bibliographic Records option will be left unchecked and the Delete Records option will not be visible.  At this point, users can process their records, and MarcEdit will return the response codes provided by the API so that users can determine whether the updates were successful.

My guess is that, like the first pass through the API, these methods and the tools that make use of them will be refined with time and use, but I think that they provide a good start.  Of course, at this point, I’ve reached the limit of the functionality that the Metadata API provides.  Looking at this toolset, it’s pretty clear that these APIs were primarily envisioned for individual, real-time editing of the WorldCat database.  I have a feeling that the batch holdings tools, and now the ability to upload bibliographic and local bibliographic data in batches, probably fall outside of the use cases identified when OCLC first released these to the public.  But they work, though a little slowly, and provide some capacity to work directly with the WorldCat database.  At the same time, the API is limited and missing key features that folks are currently asking for, most notably the ability to work with local holdings data, the ability to validate records, and a better process for search and discovery (as the present Search APIs are woefully inadequate for all but the most vanilla uses).  Hopefully, by doing some simple integration work in MarcEdit and providing some useful tools around the Metadata API, this work can be a catalyst for additional work, additional functionality, and additional innovation on the side of OCLC, and can continue to push the cooperative to provide more transparent and deeper access to the WorldCat resources and holdings.

Finally, the functions discussed here will be made available for download on March 2.

–tr