MarcEdit and the OCLC Metadata API: Introduction

*******************************************************************************************************************************************************

I’ve updated this post to correct and clarify two of its statements:

  1. The requirement of 2 wskeys
  2. Terms of use

OCLC has two wskey structures.  For developers who have been working with OCLC for a long time and have a wskey for their search services, OCLC can either decommission your former key and create a new one that supports all functionality, or give you a second key for the Metadata API.  New users simply need to request a key that includes the specific functionality.  In MarcEdit, I will continue to keep the keys separate for configuration purposes, but the two values can be the same.

Secondly, regarding terms of use: if you are a developer and have been using the WorldCat Platform Services outside of the Search API, the terms to use this service as a developer are no different from the other licensing terms you or your institution may have agreed to.  However, if you are a cataloger, the terms to retrieve/create a developer key are very likely different from the terms of service associated with your license for the cataloging service.  Users need to be aware of these differences because your organization may/will care.

*******************************************************************************************************************************************************

I’ve been starting to work on a couple of different write-ups around working with the OCLC Metadata API (http://oclc.org/developer/services/worldcat-metadata-api) — my general impressions and some specific improvements that really need to be in place to make this resource more usable.  But I also have realized that I have neglected to give any kind of write-up about using the API within MarcEdit, save for an early YouTube video.  So, I’m taking a little bit of time here to jot down some information related to how the API can be utilized within MarcEdit.

Background:

So a little bit of background.  Over the past couple of years, I’ve been looking for opportunities to make use of the great work that OCLC Research does in exposing some interesting new data streams to the library community.  When OCLC first released their classify service a number of years ago, I believe I was probably one of the first people to grab it and integrate it into a product that could be used within existing cataloging workflows.  This year, I expanded the service to include integration with OCLC’s FAST headings service.  While OCLC’s services do have some baggage attached (in terms of who is allowed to use them), the baggage is pretty small and they provide real value to the library community.  Providing them through MarcEdit made a lot of sense.

The same is true of OCLC’s new Metadata API.  For a number of years, individuals in the user community have been asking for the ability to interact directly with WorldCat, specifically to set and delete holdings.  The API provides this functionality, in addition to the ability to add and edit bibliographic records, as well as functionality around local data records.  For the first round of integration, I’ve limited my work primarily to dealing with holdings and bibliographic data.  I’ll be looking for people interested in the local data functionality to see how MarcEdit might be augmented to support that as well.

Early Use cases

Part of the reason I chose to support the specific API actions that I did was due to existing use cases.  Processing around E-books and E-Resources has led users within the MarcEdit community to make a couple of common requests related to OCLC and WorldCat.  Specifically, users are asking to:

  1. Automate the process of setting holdings
  2. Automate the upload of new records without using Connexion
  3. Automate Headings validation

The API provides the ability to address the first two requests, and I hope that with time, OCLC will make some of their heading validation services available to support the third.  But for now, MarcEdit’s development has really been focused around supporting these two specific use cases.

First Impressions

I’ll take some time to provide some additional feedback in a few days, but my first impressions were mixed.  On the one hand, interacting with the service was pretty straightforward once I understood what the API expected in terms of requests and responses.  On the other hand, very little documentation exists.  Oftentimes, I would trigger errors on purpose because simple things, like acceptable schemas, are left undocumented but are provided in an error message.  Unfortunately, the missing documentation definitely complicates working with the service: things related to validation, reasons for authentication errors, etc. simply are not documented, and the descriptions that come back in API responses are fairly cryptic.

However, I think anyone who does development is probably used to working with scarce documentation, and to OCLC’s credit, they have been very responsive in providing answers to the holes that currently exist in the documentation.  From my perspective, the most concerning aspect of the API right now is the authentication.  From a developer’s standpoint, the authentication process is easy enough.  For the user, however, the process for getting keys to utilize tools built upon the services is fairly problematic because the process assumes that users are developers and forces key requests and management through that portal.  Likewise, these keys come with new terms of use outside of your traditional cataloging licensing and may (likely will) need to be vetted through an organization’s legal counsel.  Again, this is primarily a problem for non-developers, who have relationships with OCLC in ways outside of the WorldShare Platform Services, but these are primarily the users that this integration in MarcEdit targets (i.e., catalogers, not developers).  This honestly is my biggest concern with the service right now (that, and that keys are tied to individuals, not institutions); it is bigger than the issues related to documentation, validation, and the somewhat cryptic nature of the feedback received through the API.

Using the API in MarcEdit

So, to use the OCLC Metadata API in MarcEdit, there are a couple of things that you need to do.  First, you need to request your OCLC keys.  MarcEdit’s integration requires users to have the appropriate wskeys:

  1. OCLC’s Search API Key
  2. OCLC’s Metadata API Key

For long-term OCLC Platform Users (think Search API) — this means requesting two keys due to the fact that the previous key format isn’t compatible with their new authentication system.  For new users, a single key should suffice.  Regardless, a key that can support OCLC’s Search functionality is required to support MarcEdit’s ability to query the WorldCat database.

Likewise, MarcEdit needs a key that can utilize the Metadata API.  Again, for legacy API users, this will likely be a separate key; for new users, the search and metadata keys should be the same.  This key has two parts: the key itself and a secret key.  You need both to make requests.  In addition, there are five other values that users need to know to utilize the Metadata API services: a principalID, a principalIDNS, a schema, a holdings symbol and a 4 character holdings code.  All of these values need to be available in order to work with the various functions provided by the API.
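To make the list of required values concrete, here is a minimal sketch in Python of how a tool might hold them and check that nothing has been left blank.  The class and field names are hypothetical and chosen only for illustration; they are not MarcEdit’s internal names or anything defined by OCLC.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class OclcSettings:
    """Hypothetical container for the values described above."""
    search_wskey: str      # key used for WorldCat searching
    metadata_wskey: str    # key used for the Metadata API (may equal search_wskey)
    metadata_secret: str   # secret paired with the Metadata API key
    principal_id: str
    principal_idns: str
    schema: str            # classification schema
    holdings_symbol: str
    holdings_code: str     # described above as a 4 character code


def missing_values(settings: OclcSettings) -> List[str]:
    """Return the names of any settings that are empty or whitespace."""
    return [f.name for f in fields(settings)
            if not getattr(settings, f.name).strip()]


def holdings_code_looks_valid(settings: OclcSettings) -> bool:
    """Check only the length of the holdings code, nothing more."""
    return len(settings.holdings_code.strip()) == 4
```

A check like this only catches blank or obviously malformed entries.  Whether the values actually work can only be determined by making a request against the service.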

Once you have these keys and supplemental information, you need to enter them into MarcEdit.  Users will be able to see the menu entries for many of the OCLC functions, but until a user has entered their OCLC key data into the MarcEdit preferences and validated the data — these functions will remain disabled.

In the Preferences area, there is a new tab that has been set up to store data related to OCLC’s Metadata services.

image

From this screen, users can enter all the information that they will need in order to utilize the various Metadata and Search API functions.  MarcEdit will store this information in its configuration file, but does encrypt the data using a methodology that would make it impractical to reconstruct the key data.

After entering their data, the user should click Apply, and then Validate.  The Validate process is what enables the OCLC API integration functions.  MarcEdit performs a couple of API operations on an existing test record to determine if the information that the user provided will support the required functionality.  As MarcEdit validates a key, it will turn the textbox either green (for validated) or red (to indicate a problem).

image
Example of a problem with a key

image
Example of all keys validated

Once the data has been validated — the OCLC Menu Items found in the Main Window and on the MarcEditor will be enabled.

image
Main Window — OCLC Functions

image
MarcEditor OCLC Functions

Using the OCLC API Functions

Hopefully, the functions are fairly straightforward to use and look identical regardless of where you interact with them within the application.  At present, MarcEdit provides a function for setting holdings and working with bibliographic data.

Setting Holdings

image
MarcEdit’s Holdings Tool

MarcEdit provides a straightforward tool for updating a user’s holdings for a batch of records.  MarcEdit’s holdings tool can process MARC data, MarcEdit’s mnemonic data, or a plain text file of OCLC numbers.  The tool does exactly what it says.  Using the information set in the Preferences, MarcEdit will pre-populate your Institution Code and Classification Schema.  Users then need to select an action.  MarcEdit’s batch tool can update or delete holdings, but will perform just one of those actions on all records within a designated file.  That means you cannot upload a file that mixes records that need holdings added with records that need holdings deleted.  The tool requires that these actions be isolated into discrete operations.
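If you are preparing a plain text file of OCLC numbers outside of MarcEdit, the one-action-per-file rule means a mixed list has to be split first.  The following Python sketch shows one way that preparation might look.  The function names and the ("number", "action") input shape are my own assumptions for illustration; the sketch does not call MarcEdit or the OCLC API.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Accepts bare numbers as well as the common "(OCoLC)", "ocm", "ocn" and "on"
# prefixes, and drops leading zeros.
_OCLC_LINE = re.compile(
    r"^\s*(?:\(OCoLC\))?\s*(?:ocm|ocn|on)?\s*0*(\d+)\s*$", re.IGNORECASE
)


def read_oclc_numbers(text: str) -> List[str]:
    """Pull one OCLC number per line, skipping blank or unparseable lines."""
    numbers = []
    for line in text.splitlines():
        match = _OCLC_LINE.match(line)
        if match:
            numbers.append(match.group(1))
    return numbers


def split_by_action(requests: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Separate (number, action) pairs into one batch per action."""
    batches: Dict[str, List[str]] = {"update": [], "delete": []}
    for number, action in requests:
        if action not in batches:
            raise ValueError(f"unknown action: {action!r}")
        batches[action].append(number)
    return batches
```

Each list in the result can then be written to its own file and processed as a separate operation.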

Update/Creating Bibliographic Data

image
MarcEdit’s Bibliographic Tool

MarcEdit’s bibliographic updating/creation tool for WorldCat does allow users to work with files that contain a mix of new and updated records.  OCLC’s system and MarcEdit evaluate the records within a file and, based on the presence or absence of an OCLC number in a record, determine whether a bibliographic record is to be created or updated.  Of course, it’s not that simple: all bibliographic records passed through the API must be validated on the OCLC side, and at this point, that validation only occurs when the data is submitted for update or creation.  This is a weakness that I hope will be rectified at some point, since it causes a number of issues when working in a batch record environment.
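As a rough illustration of the create-versus-update decision, here is a Python sketch that inspects a record in MarcEdit’s mnemonic format.  It assumes, purely for the example, that the OCLC number is carried in an 035 $a with the "(OCoLC)" prefix; the actual fields that MarcEdit and OCLC consult may differ, and the function name is hypothetical.

```python
import re

# Assumption: an OCLC number appears as an 035 $a beginning with "(OCoLC)".
# The two characters after the tag and spaces are the field indicators.
_OCLC_035 = re.compile(r"^=035  ..\$a\(OCoLC\)\s*\d+", re.MULTILINE)


def planned_action(mnemonic_record: str) -> str:
    """Return "update" if the record carries an OCLC number, else "create"."""
    return "update" if _OCLC_035.search(mnemonic_record) else "create"


existing = (
    "=LDR  00000nam a2200000 a 4500\n"
    "=035  \\\\$a(OCoLC)12345\n"
    "=245  10$aExample title"
)
print(planned_action(existing))
```

Note that a check like this says nothing about whether the record will pass OCLC’s validation, which, as described above, only happens on submission.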

Working within the MarcEditor

When working within the MarcEditor, the tools for working with Holdings and Bibliographic data are identical to those found in the Main Window.  The main difference is that the data being acted upon is the data in the MarcEditor.  This means that the MarcEditor needs to include one additional function, and that is the ability to retrieve data from WorldCat.  MarcEdit has the ability to search and download a record set from WorldCat for editing.

image
Searching WorldCat from Within the MarcEditor

The search tool provides a handy way for users to quickly retrieve data from WorldCat for edit.  However, this too underlines one of the glaring issues with the API — especially around record editing.  Traditionally, when catalogers work on a record, they can lock the record for editing.  This prevents other users from changing the record while they are working on it, and protects the transaction information in the 005 which OCLC requires for updating purposes.  When working with the API, there appears to be no way to lock a record.  This means that records can be downloaded, edited and then on upload, be invalid due to someone else making an edit that updates the 005 information.  And since OCLC doesn’t provide a method for validating information prior to upload, users won’t know that this problem has occurred until an upload is attempted.  Again, in order to make the API functionally equivalent to OCLC’s other editing services, there needs to be a way for catalogers to lock records for editing.
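Without record locking, about the best a client can do is notice that a record has changed before attempting the upload.  The Python sketch below compares the 005 value saved at download time with the 005 of a freshly retrieved copy of the same record.  How that fresh copy is obtained is left out, and the function names are hypothetical.  The 005 field uses the form yyyymmddhhmmss.f.

```python
from datetime import datetime


def parse_005(value: str) -> datetime:
    """Parse a MARC 005 value of the form yyyymmddhhmmss.f."""
    return datetime.strptime(value.strip(), "%Y%m%d%H%M%S.%f")


def changed_since_download(downloaded_005: str, current_005: str) -> bool:
    """True if the record's 005 no longer matches the copy being edited."""
    return parse_005(current_005) != parse_005(downloaded_005)
```

This only narrows the window for a conflict.  Someone can still edit the record between the check and the upload, which is why a real locking mechanism is needed.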

Where we go from here

This is really hard to say.  I believe that OCLC sees the release of the Metadata API as the first of a much larger portfolio of API services to provide programmatic support for their WorldShare endeavors.  So, it will be interesting to see how these develop.  At the same time, I’m really interested in seeing whether the organization puts the kind of resources behind its API services portfolio needed to provide really meaningful services that let librarians, developers, and third-party library developers work with, consume, or augment the wide range of data streams that OCLC has available.  On the one hand, I’m hopeful that the release of the Metadata API signals a wider commitment to providing transparent access to a wide variety of services.  On the other hand, the lack of development around the OCLC Search API, which has seen only incremental changes since it was first released some 5 or so years ago, makes me wonder how much support exists within the larger organization for this type of work.  So I guess that is a question that remains to be answered.

 

–TR

