MarcEdit and the OCLC Metadata API: Introduction
I wanted to note that I’ve updated this post to correct/clarify two statements in it.
- The requirement of 2 wskeys
OCLC has two wskey structures. For developers who have been working with OCLC for a long time and have a wskey for their search services, OCLC can decommission the former key and create a new one that supports all functionality, or they can issue a second key for the Metadata API. New users simply need to request a key that includes the specific functionality. In MarcEdit, I will continue to keep the keys separate for configuration purposes, but the two values can be the same.
I’ve been starting to work on a couple of different write-ups around working with the OCLC Metadata API (http://oclc.org/developer/services/worldcat-metadata-api) – my general impressions and some specific improvements that really need to be in place to make this resource more usable. But I also have realized that I have neglected to give any kind of write-up about using the API within MarcEdit, save for an early YouTube video. So, I’m taking a little bit of time here to jot down some information related to how the API can be utilized within MarcEdit.
So a little bit of background. Over the past couple of years, I’ve been looking for opportunities to make use of the great work that OCLC Research does in exposing some interesting new data streams to the library community. When OCLC first released their classify service a number of years ago, I believe I was probably one of the first people to grab it and integrate it into a product that could be used within existing cataloging workflows. This year, I expanded the service to include integration with OCLC’s FAST headings service. While OCLC’s services do have some baggage attached (in terms of who is allowed to use them), the baggage is pretty small and they provide real value to the library community. Providing them through MarcEdit made a lot of sense.
The same is true of OCLC’s new Metadata API. For a number of years, individuals in the user community have been asking for the ability to interact directly with WorldCat, specifically to set and delete holdings. The API provides this functionality, in addition to the ability to add and edit bibliographic records, as well as functionality around local data records. For the first round of integration, I’ve limited my work primarily to dealing with holdings and bibliographic data. I’ll be looking for people interested in the local data functionality to see how MarcEdit might be augmented to support that as well.
Early Use cases
Part of the reason I chose to support the specific API actions that I did was existing use cases. Processing around E-books and E-Resources has led users within the MarcEdit community to make a couple of common requests related to OCLC and WorldCat. Specifically, users are asking to:
- Automate the process of setting holdings
- Automate the upload of new records without using Connexion
- Automate Headings validation
The API provides the ability to address the first two requests, and I hope that, with time, OCLC will make some of their heading validation services available to support the third. But for now, MarcEdit’s development has really been focused around supporting these two specific use cases.
I’ll take some time to provide some additional feedback in a few days, but my first impressions were mixed. On the one hand, interacting with the service was pretty straightforward once I understood what the API expected in terms of requests and responses. On the other hand, very little documentation exists. Oftentimes, I would deliberately trigger errors because simple things, like acceptable schemas, are left undocumented but are listed in an error message. Unfortunately, the missing documentation definitely complicates working with the service: things related to validation, reasons for authentication errors, etc. simply are not documented, and the descriptions that come back in API responses are fairly cryptic.
Using the API in MarcEdit
So, to use the OCLC Metadata API in MarcEdit, there are a couple of things that you need to do. First, you need to request your OCLC keys. MarcEdit’s integration requires users to have the appropriate wskeys:
- OCLC’s Search API Key
- OCLC’s Metadata API Key
For long-term OCLC Platform Users (think Search API) – this means requesting two keys due to the fact that the previous key format isn’t compatible with their new authentication system. For new users, a single key should suffice. Regardless, a key that can support OCLC’s Search functionality is required to support MarcEdit’s ability to query the WorldCat database.
Likewise, MarcEdit needs a key that can utilize the Metadata API. Again, for legacy API users, this will likely be a separate key; for new users, the search and metadata keys should be the same. This key has two parts: the Key and a Secret key. You need both to make requests. There are also five other values that users need to know to utilize the Metadata API services: a principalID, a principalIDNS, a schema, a holdings symbol and a 4-character holdings code. All of these values need to be available in order to work with the various functions provided by the API.
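To make the list of required values concrete, here is a minimal sketch of how they could be grouped and sanity-checked before use. This is not MarcEdit’s actual configuration format: the class name, field names and the checks are all assumptions made for illustration, and the only rules taken from this post are that every value must be present and that the holdings code is 4 characters.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class OclcSettings:
    """Hypothetical container for the values described above.

    Field names are illustrative only; they are not MarcEdit's own.
    """
    search_wskey: str      # key with Search API access
    metadata_wskey: str    # key with Metadata API access (may equal search_wskey)
    metadata_secret: str   # secret half of the Metadata API key
    principal_id: str
    principal_idns: str
    schema: str
    holdings_symbol: str
    holdings_code: str     # described in the post as a 4-character code

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means nothing obvious is wrong."""
        issues = [
            f"{f.name} is required"
            for f in fields(self)
            if not getattr(self, f.name).strip()
        ]
        code = self.holdings_code.strip()
        if code and len(code) != 4:
            issues.append("holdings_code should be 4 characters")
        return issues
```

A check like this only catches missing or malformed values locally. Whether the keys actually work can only be confirmed by the service itself.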
Once you have these keys and supplemental information, you need to enter them into MarcEdit. Users will be able to see the menu entries for many of the OCLC functions, but until a user has entered their OCLC key data into the MarcEdit preferences and validated the data – these functions will remain disabled.
In the Preferences area, there is a new tab that has been set up to store data related to OCLC’s Metadata services.
From this screen, users can enter all the information that they will need in order to utilize the various Metadata and Search API functions. MarcEdit will store this information in its configuration file, but does encrypt the data using a methodology that would make it impractical to reconstruct the key data.
After entering this data, the user should click Apply, and then Validate. The Validate process is what enables the OCLC API integration functions. MarcEdit performs a couple of API operations on an existing test record to determine if the information that the user provided will support the required functionality. As MarcEdit validates a key, it will turn the textbox either green (for validated) or red (to indicate a problem).
Once the data has been validated – the OCLC Menu Items found in the Main Window and on the MarcEditor will be enabled.
Using the OCLC API Functions
Hopefully, the functions are fairly straightforward to use and look identical regardless of where you interact with them within the application. At present, MarcEdit provides a function for setting holdings and working with bibliographic data.
MarcEdit provides a straightforward tool for updating a user’s holdings for a batch of records. MarcEdit’s holdings tool can process either MARC data, MarcEdit’s mnemonic data or a plain text file of OCLC numbers. The tool does exactly what it says. Using the information set in the Preferences, MarcEdit will pre-populate your Institution Code and Classification Schema. Users then need to select an action. MarcEdit’s batch tool can update or delete holdings, but will perform just one of those actions on all records within a designated file. That means, you cannot upload a file that contains records that need to have holdings added and deleted. The tool requires that these actions are isolated into discrete operations.
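As an illustration of the “one action per file” rule for the plain text input, here is a small sketch that reads a list of OCLC numbers and pairs every number with a single action. It is not MarcEdit’s code. The function names, the accepted prefixes (`(OCoLC)`, `ocm`, `ocn`, `on`) and the choice to collect unparseable lines rather than fail are all assumptions, and no API call is made.

```python
import re
from typing import Iterable, List, Tuple

# The two holdings actions named in the post; a batch uses exactly one of them.
ACTIONS = ("update", "delete")

# Assumed input shapes: bare digits, or digits with a common OCLC prefix.
_OCLC_LINE = re.compile(r"(?:\(OCoLC\))?\s*(?:ocm|ocn|on)?0*(\d+)", re.IGNORECASE)


def read_oclc_numbers(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (unique normalized numbers in input order, lines that did not parse)."""
    numbers: List[str] = []
    rejected: List[str] = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        match = _OCLC_LINE.fullmatch(text)
        if match is None:
            rejected.append(text)
        elif match.group(1) not in numbers:
            numbers.append(match.group(1))
    return numbers, rejected


def plan_batch(action: str, numbers: List[str]) -> List[Tuple[str, str]]:
    """Pair every number with the same action; mixed batches are not allowed."""
    if action not in ACTIONS:
        raise ValueError(f"action must be one of {ACTIONS}, got {action!r}")
    return [(action, number) for number in numbers]
```

A file that needs both additions and deletions would be split into two lists first, and each list would be run as its own batch.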
Updating/Creating Bibliographic Data
MarcEdit’s bibliographic updating/creation tool for WorldCat does allow users to work with files that contain both new and updated records. OCLC’s system and MarcEdit evaluate each record within a file and, based on the presence or absence of an OCLC number in the record, determine whether a bibliographic record is to be created or updated. Of course, it’s not that simple: all bibliographic records passed through the API must be validated on the OCLC side, and at this point, that validation only occurs when the data is submitted for update or creation. This is a weakness that I hope will be rectified at some point, since it causes a number of issues when working in a batch record environment.
Working within the MarcEditor
When working within the MarcEditor, the tools for working with Holdings and Bibliographic data are identical to those available outside it. The main difference is that the data being acted upon is the data in the MarcEditor. This means that the MarcEditor needs to include one additional function, and that is the ability to retrieve data from WorldCat. MarcEdit has the ability to search and download a record set from WorldCat for editing.
The search tool provides a handy way for users to quickly retrieve data from WorldCat for edit. However, this too underlines one of the glaring issues with the API – especially around record editing. Traditionally, when catalogers work on a record, they can lock the record for editing. This prevents other users from changing the record while they are working on it, and protects the transaction information in the 005 which OCLC requires for updating purposes. When working with the API, there appears to be no way to lock a record. This means that records can be downloaded, edited and then on upload, be invalid due to someone else making an edit that updates the 005 information. And since OCLC doesn’t provide a method for validating information prior to upload, users won’t know that this problem has occurred until an upload is attempted. Again, in order to make the API functionally equivalent to OCLC’s other editing services, there needs to be a way for catalogers to lock records for editing.
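Without a lock, about the best a client can do is re-retrieve a record just before upload and compare its 005 with the one it originally downloaded. The sketch below shows only that comparison, on mnemonic-format text. The function names are hypothetical, how the fresh copy is retrieved is left out entirely, and this is not something the post says MarcEdit does.

```python
import re
from typing import Optional

# 005 is a control field, so in mnemonic format the value directly follows the tag.
_FIELD_005 = re.compile(r"^=005 +(\S+)", re.MULTILINE)


def read_005(record: str) -> Optional[str]:
    """Return the 005 transaction value from a mnemonic-format record, if present."""
    match = _FIELD_005.search(record)
    return match.group(1) if match else None


def changed_since_download(downloaded: str, current: str) -> bool:
    """True when the 005 values differ, or when either record lacks a 005.

    A missing 005 is treated as 'changed' because the edit cannot be
    shown to be based on the current version of the record.
    """
    before, after = read_005(downloaded), read_005(current)
    if before is None or after is None:
        return True
    return before != after
```

A check like this narrows the window but does not close it, since another edit can still land between the comparison and the upload. That is why a real locking mechanism is still needed.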
Where we go from here
This is really hard to say. I believe that OCLC sees the release of the Metadata API as the first of a much larger portfolio of API services to provide programmatic support for their WorldShare endeavors. So, it will be interesting to see how these develop. At the same time, I’m really interested in seeing if the organization puts the type of resources behind the development of their API Services Portfolio that are needed to provide really meaningful services, so that librarians, developers, and third-party library developers can work with, consume, or augment the wide range of data streams that they have available. On the one hand, I’m hopeful that the release of the Metadata API signals a wider commitment to providing transparent access to a wide variety of services. On the other hand, the lack of development around the OCLC Search API, which has seen only incremental changes since it was first released some 5 or so years ago, makes me wonder how much support exists within the larger organization for this type of work. So I guess that is a question that remains to be answered.