MarcEdit 7/3 OCLC Integration Bug Fixes

MarcEdit 7.0.60+/3.0.30+’s OCLC Integration is currently broken, and it took me a little longer than I’d hoped to fix it. Unfortunately, the work needed to support the changes was pretty extensive on the Mac side, due to either a framework change or a change in how OCLC’s API was returning its data (I honestly can’t tell which, though it really doesn’t matter at this point). Anyway, this week I found out that MarcEdit’s OCLC integration framework wasn’t working in MarcEdit 7, specifically in 7.0.60 and later. This is when MarcEdit shifted to using Access Tokens (which OCLC would prefer) over the older HMAC authentication when working with APIs that require authentication (like the Metadata API).

In order to support Access Tokens and not break functionality for existing users, I coded MarcEdit to make use of the OCLC Registry API. Within MarcEdit, I used the Registry API to translate between the entered OCLC Symbol (like OREU) and the institution’s OCLC registry id, which the access token authentication method requires. Internally, the program would do an initial lookup using the registry to capture OCLC Symbols and Registry Ids so that the tool could transparently switch between the values as required. One of the benefits of using the OCLC Registry API was that it was an open service, and one that I could query via a HEAD request, allowing for a very light touch on the service. In July 2018, though, things changed.
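To make the symbol-to-registry-id translation concrete, here is a minimal Python sketch of a two-way cache of the kind described above. It is not MarcEdit’s actual code (that lives in the .NET oclc_api library). The names `InstitutionCache` and `fake_registry_lookup` are hypothetical, the registry id "00000" is a placeholder rather than a real id, and the remote lookup is passed in as a function so that no Registry API call or endpoint is assumed.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical lookup signature: given a symbol or a registry id, return
# (symbol, registry_id), or None when the institution is unknown.
Lookup = Callable[[str], Optional[Tuple[str, str]]]


class InstitutionCache:
    """Two-way cache so callers can pass either an OCLC symbol or a registry id."""

    def __init__(self, lookup: Lookup) -> None:
        self._lookup = lookup
        self._by_symbol: Dict[str, str] = {}
        self._by_registry_id: Dict[str, str] = {}
        self.remote_calls = 0  # counts how often the remote service was needed

    def _resolve(self, value: str) -> Tuple[str, str]:
        key = value.strip()
        symbol_key = key.upper()  # assumption: symbols are case-insensitive
        if symbol_key in self._by_symbol:
            return symbol_key, self._by_symbol[symbol_key]
        if key in self._by_registry_id:
            return self._by_registry_id[key], key
        # Cache miss: ask the remote service once and remember both directions.
        self.remote_calls += 1
        found = self._lookup(key)
        if found is None:
            raise KeyError(f"unknown institution: {value!r}")
        symbol, registry_id = found
        self._by_symbol[symbol.upper()] = registry_id
        self._by_registry_id[registry_id] = symbol.upper()
        return symbol.upper(), registry_id

    def registry_id_for(self, value: str) -> str:
        return self._resolve(value)[1]

    def symbol_for(self, value: str) -> str:
        return self._resolve(value)[0]


def fake_registry_lookup(value: str) -> Optional[Tuple[str, str]]:
    """Stand-in for the real service; "00000" is a placeholder id."""
    table = {"OREU": "00000"}
    if value.upper() in table:
        return value.upper(), table[value.upper()]
    for symbol, registry_id in table.items():
        if registry_id == value:
            return symbol, registry_id
    return None
```

With this shape, the first call pays for one remote lookup and later calls in either direction are answered from memory, which is the "transparent switching" idea in a nutshell.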

As mentioned above, I started hearing around mid-July (thanks to vacations) that the integration authentication was starting to fail. I figured this was likely a problem on the user side (there is a lot of data that needs to be entered, and if anything is wrong, authentication doesn’t work). But after working with a community member and looking at the logs, OCLC’s Karen Coombs let me know that the problem appeared to be related not to the authentication code, but to MarcEdit’s reliance on the Registry API. Karen pointed me to a blog post from March 2018 that I’d missed (https://www.oclc.org/developer/news/2018/changing-coming-to-worldcat-registry-api.en.html). OCLC had decided that it was in their best interest to no longer keep the Registry API open, so they put it behind the authentication service and changed the service endpoints. This raised two immediate problems for MarcEdit users.

  1. All authentication after 7.0.60 made use of the Registry API to support seamless authentication and use of the access token authentication method. Changing the service endpoint essentially broke MarcEdit Integration for all users working with MarcEdit 7.0.60+ and 3.0.30+. I believe earlier versions (which still rely on HMAC processing) still work.
  2. By placing the Registry API behind their authentication service and removing it from its open services, OCLC now requires users to request that their API keys be able to access an additional API endpoint. Since you cannot manage authentications or user names yourself (you must contact OCLC), fixing the code on my end wouldn’t necessarily correct the issue for users, as all API users would need to reach out to OCLC and fix their key permissions (adding access to the new Registry API endpoint).

 

Ok, so this was all a bit of a headache, and I share some blame on this one. On the one hand, I missed OCLC’s announcement. Had I been paying attention to their Developers Blog (and I’ll be honest, I don’t really track it), I would have seen that this change was coming and likely wouldn’t have implemented the Access Token support until after this change took place, as I could have worked with OCLC a bit on what this would mean and we could have worked on ways to mitigate the impact of the change. I obviously didn’t do that…that’s my bad. On the other hand, I really, really, really wish OCLC had proactively spoken to the Developer Network community to see how people were using the Registry API, because this change broke not only the MarcEdit Integration, but a number of other lookup services I have that used the registry to automatically configure connections related to OPACs, Link Resolvers, etc. In these cases, I was using OCLC’s Registry API precisely because it was open and didn’t require an API key (which I honestly wouldn’t be able to use for many of these purposes)…so, they’ve essentially rendered useless a rich set of information and a service that could theoretically be used for infrastructure purposes. While I understand that there were likely concerns related to some of the information that the API made publicly available (I’m guessing the personal contact information was problematic), I honestly think that a compromise could have been made. Their current API structure definitely has a pattern for this. The Search API uses “profiles”, where some authenticated users can see full data, while others can only see a subset of the data. It seems this kind of model could have been adopted or discussed. But at this point, that is neither here nor there.

I was told about the problems related to the API changes on Monday. At the time, I assumed that fixing the change would be straightforward, and on Windows, it was. I made a couple tweaks to the oclc_api library and the process resolved itself. So, then I ported the code to the macOS version and all Hell broke loose. Here’s the problem: either due to a bug in the Framework (possible) or a change in how the API service works (also possible), calls made to the Metadata API were failing. Essentially, when you request a MARC record via the API, the service uses a Transfer-Encoding set to chunked. Makes sense, but the problem was that in testing, these chunks weren’t returning complete, causing the connection to reset. The other issue (which is why I’m not sure if this is a framework or server change) is that authentication against the OCLC server certificates was periodically breaking. I could fix this by adding code that just ignored the certificate altogether, but that seemed like a terrible idea. Complicating matters, the code handling this on the Objective-C side is interacting with native Mac APIs that hide a lot of what is actually happening underneath. Reading the Swift coding boards, I found a number of folks having similar issues, and the solution appeared to be opening and managing raw socket connections (which I really, really didn’t want to end up doing). Anyway, to make a long story short, I did have to rewrite a good deal of this component. I shifted away from using the .NET System.Net.HttpWebRequest and HttpWebResponse components (and the equivalent macOS elements) and moved up to the more abstracted HttpClient code (System.Net.Http.HttpClient) to handle interactions. This hides even more of the interactions, but it also seems to fix a number of the issues that the primitive classes were having. Why? Who knows. All I know is that it works now.

If you are interested in the changes, the oclc_api integration code has always been viewable here: https://github.com/reeset/oclc_api
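For readers who haven’t looked at chunked transfer encoding up close, the sketch below shows what "chunks weren’t returning complete" means at the wire level. It is an illustrative Python decoder for an already-received chunked body, not the oclc_api code and not what HttpClient does internally. The names `decode_chunked` and `IncompleteChunkError` are made up for this example, and trailer headers after the final chunk are simply ignored.

```python
class IncompleteChunkError(Exception):
    """Raised when the body ends before a chunk (or the final marker) is complete."""


def decode_chunked(raw: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked body held entirely in memory.

    Each chunk is: <hex size>[;extensions] CRLF <size bytes of data> CRLF.
    A chunk of size 0 marks the end of the body.
    """
    body = bytearray()
    pos = 0
    while True:
        line_end = raw.find(b"\r\n", pos)
        if line_end == -1:
            raise IncompleteChunkError("missing chunk-size line")
        # The size is hexadecimal; anything after ';' is a chunk extension.
        size_field = raw[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field, 16)  # raises ValueError on a malformed size
        if size < 0:
            raise ValueError("negative chunk size")
        pos = line_end + 2
        if size == 0:
            return bytes(body)  # last chunk reached; trailers are ignored here
        chunk = raw[pos:pos + size]
        if len(chunk) < size:
            # This is the failure mode described above: the data stops short.
            raise IncompleteChunkError(
                f"expected {size} bytes, got {len(chunk)}"
            )
        if raw[pos + size:pos + size + 2] != b"\r\n":
            raise IncompleteChunkError("chunk data not terminated by CRLF")
        body += chunk
        pos += size + 2
```

A client that sees the connection close in the middle of a chunk has no way to tell how much of the record is missing, which is why a truncated chunk has to be treated as an error rather than as a short response.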

Ok, so the code works. That solves issue #1, but issue #2 would still be (and may be for a short while longer) outstanding. OCLC’s move to put this API under their authentication system means that users will need their API keys to support the following APIs: the Search API, the Metadata API, and the Registry API. Even with these code changes, the process will still fail as long as a user’s key hasn’t been set up to use the Registry API…and that would be a real drag. Fortunately, OCLC (particularly Karen) has been thinking about how they can proactively fix this problem. And while I don’t understand how they will make this happen, I’ve been told that a process is in place to update all applicable existing API keys so that the keys can use the Registry API (assuming that their organization should have access to it). I’m not exactly sure when this will happen (maybe it has), but this means that most users won’t need to do anything but update the program to fix the OCLC Integration.

For this update, I addressed the following OCLC Integration issues in MarcEdit 7/3:

  1. Bug Fix: OCLC Integration: A change to the OCLC API broke MarcEdit’s OCLC Integration. The code has been updated, though the change on the OCLC side requires that users have access to the Search API, the Metadata API, and the Registry API (used to find Registry Ids and OCLC Holdings codes).
  2. Bug Fix: OCLC Integration: Due to a change to the API, the Get Locations link in the integration was broken. This has been corrected.
  3. Enhancement: OCLC Integration: MarcEdit has always required that users enter their OCLC code, but OCLC is moving towards using registry ids for most operations.  You can now enter either and MarcEdit will use the Registry API to get the correct information based on data passed.

 

Now, one thing I will say — this was a significant update because I had to completely change the underlying code and components that interact with the Metadata API.  If you run into problems, let me know.  On the Mac-side, it was a complete redo.  There are some things I can’t test directly with my OCLC key due to permissions (I’m not allowed to set/unset holdings for example), so I’ll be keeping an eye out in case users continue to struggle with the Integration.

Updates can be found at: https://marcedit.reeset.net/downloads or via the automatic update functionality.

If you have questions, please let me know.

–tr

