OCLC WorldCat Metadata API Ruby Gem

 OCLC, ruby
May 14, 2014

Since last December, I’ve had the opportunity to spend a good deal of time working with the OCLC WorldCat Metadata API.  The focus was primarily around kicking the tires, and then eventually developing some integration components with MarcEdit, as well as a C# library (https://github.com/reeset/oclc_api) for those that may have use of such things.

However, most folks in libraries tend to not use C# – they tend to focus on lighter-weight languages like ruby, python, or PHP.  Well, I work in ruby as well, so I decided to take the C# library and port it to Ruby.  The resulting code can be found here: https://github.com/reeset/wc_metadata_api

It’s pretty straightforward – and provides wrappers for all the current functionality found in the API, with the caveat being that the defined transfer format between OCLC and the client is in MARCXML, and response formats are in application/atom+xml.  The API supports a handful of other formats, but I’ve standardized on MARCXML since it’s well understood and there are lots of tools that can work with it.

The code isn’t well commented at this point.  I’ll take some time over the next few days to add some commenting, improve exception handling, and possibly add a few helper functions to make message processing easier – but this should help anyone interested in knowing how the API works get a better understanding.


Code Example:

** Get Local Bib Record

require 'rubygems'
require 'wc_metadata_api'

key = '[your key]'
secret = '[your secret]'
principalid = '[your principal_id]'
principaldns = '[your principal_idns]'
schema = 'LibraryOfCongress'
holdingLibraryCode = '[your holding code]'
instSymbol = '[your oclc symbol]'

client = WC_METADATA_API::Client.new(:wskey => key, :secret => secret, :principalID => principalid, :principalDNS => principaldns, :debug => false)

response = client.WorldCatReadLocalBibRecord(:oclcNumber => '338544583', :schema => schema, :holdingLibraryCode => holdingLibraryCode, :instSymbol => instSymbol)

puts response
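
Since the response comes back as an Atom document with MARCXML inside it, pulling a single field out takes a little XML handling. Here is a minimal sketch using Ruby's REXML library. The sample response, its element layout, and the helper name marc_subfield are my own assumptions for illustration; none of them are part of the wc_metadata_api gem, and the real payload may be laid out differently.

```ruby
require 'rexml/document'

# Hypothetical, trimmed-down response: an Atom entry wrapping one MARCXML record.
# The real payload returned by the API may be laid out differently.
SAMPLE_RESPONSE = <<~XML
  <entry xmlns="http://www.w3.org/2005/Atom">
    <content type="application/xml">
      <record xmlns="http://www.loc.gov/MARC21/slim">
        <controlfield tag="001">338544583</controlfield>
        <datafield tag="245" ind1="0" ind2="0">
          <subfield code="a">Sample title</subfield>
        </datafield>
      </record>
    </content>
  </entry>
XML

# Prefix-to-URI mapping used by the XPath query below.
MARC_NS = { 'marc' => 'http://www.loc.gov/MARC21/slim' }

# Return the text of the first matching subfield, or nil when it is absent.
def marc_subfield(xml, tag, code)
  doc = REXML::Document.new(xml)
  path = "//marc:datafield[@tag='#{tag}']/marc:subfield[@code='#{code}']"
  node = REXML::XPath.first(doc, path, MARC_NS)
  node && node.text
end

puts marc_subfield(SAMPLE_RESPONSE, '245', 'a')
```

In practice the string passed in would be the response returned by the client call above rather than a hand-written sample.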

 Posted by at 10:31 pm

MarcEdit and the OCLC Metadata API: Introduction

 MarcEdit, OCLC, Programming
Nov 05, 2013


I wanted to note that I’ve updated this post to correct/clarify two statements within this post. 

  1. The requirement of 2 wskeys
  2. Terms of use

OCLC has two wskey structures.  For those developers that have been working with OCLC for a long time and have a wskey for their search services, OCLC can decommission your former key and create a new one that supports all functionality or they can give you a second key for the Metadata API.  For new users, you simply need to request a key that includes specific functionality.  In MarcEdit, I will continue to keep the keys separate for configuration purposes, but this value can be the same.

Secondly – related to terms of use.  I need to clarify – if you are a developer and have been using the WorldCat Platform Services outside of the Search API, the terms to use this service as a developer are no different from the other licensing terms you or your institution may have agreed to.  However, if you are a cataloger, the terms to retrieve/create a developer key are very likely different from the terms of service associated with your license related to the cataloging service.  Users need to be aware of these differences because your organization may/will care.


I’ve been starting to work on a couple of different write-ups around working with the OCLC Metadata API (http://oclc.org/developer/services/worldcat-metadata-api) – my general impressions and some specific improvements that really need to be in place to make this resource more usable.  But I also have realized that I have neglected to give any kind of write-up about using the API within MarcEdit, save for an early YouTube video.  So, I’m taking a little bit of time here to jot down some information related to how the API can be utilized within MarcEdit.


So a little bit of background.  Over the past couple of years, I’ve been looking for opportunities to make use of the great work that OCLC Research does in exposing some interesting new data streams to the library community.  When OCLC first released their classify service a number of years ago, I believe I was probably one of the first people to grab it and integrate it into a product that could be used within existing cataloging workflows.  This year, I expanded the service to include integration with OCLC’s FAST headings service.  While OCLC’s services do have some baggage attached (in terms of who is allowed to use them), the baggage is pretty small and they provide real value to the library community.  Providing them through MarcEdit made a lot of sense.

The same can be true of OCLC’s new Metadata API.  For a number of years, individuals in the user community have been looking for the ability to interact directly with WorldCat, specifically around the ability to set and delete holdings.  The API provides this functionality, in addition to the ability to add and edit bibliographic records, as well as functionality around local data records.  For the first round of integration, I’ve limited my work primarily to dealing with holdings and bibliographic data.  I’ll be looking for people interested in the local data functionality and see how MarcEdit might be augmented to support that as well.

Early Use cases

Part of the reason I chose to support the specific API actions that I did was due to existing use cases.  Processing around e-books and e-resources has led users within the MarcEdit community to make a couple of common requests related to OCLC and WorldCat.  Specifically, users are asking to:

  1. Automate the process of setting holdings
  2. Automate the upload of new records without using Connexion
  3. Automate Headings validation

The API provides the ability to address the first two requests, and I hope that with time, OCLC will make some of their heading validation services available to support the third.  But for now, MarcEdit’s development has really been focused around supporting these two specific use cases.

First Impressions

I’ll take some time to provide some additional feedback in a few days, but my first impressions were mixed.  On the one hand, interacting with the service once I understood what the API expected in terms of requests and responses was pretty straightforward.  On the other hand, very little documentation exists.  Oftentimes, I would trigger errors on purpose because simple things, like acceptable schemas, are left undocumented but are provided in an error message.  Unfortunately, the missing documentation definitely complicates working with the service, as things related to validation, reasons for authentication errors, etc. simply are not documented, and their descriptions are fairly cryptic when read in a response from the API.

However, I think anyone that does development probably is used to working with scarce documentation, and to OCLC’s credit, they have been very responsive in providing answers to the holes that currently exist in the documentation.  From my perspective, the most concerning aspect of the API right now is the authentication.  From a developer’s standpoint, the authentication process is easy enough.  For the user however, the process for getting keys to utilize tools built upon the services is fairly problematic because the process assumes that users are developers and forces key requests and management through that portal.  Likewise, these keys come with new terms of use outside of your traditional cataloging licensing and may (likely will) need to be vetted through an organization’s legal counsel.  Again, this is primarily a problem for non-developers – who have relationships with OCLC in ways outside of the WorldShare Platform Services…but these are primarily the users that this integration in MarcEdit targets (i.e., catalogers, not developers).  This honestly is my biggest concern with the service right now (that and that keys are tied to individuals, not institutions) – bigger than issues related to documentation, validation, and the somewhat cryptic nature of the feedback received through the API.

Using the API in MarcEdit

So, to use the OCLC Metadata API in MarcEdit, there are a couple of things that you need to do.  First, you need to request your OCLC keys.  MarcEdit’s integration requires that users have the appropriate wskeys:

  1. OCLC’s Search API Key
  2. OCLC’s Metadata API Key

For long-term OCLC Platform Users (think Search API) – this means requesting two keys due to the fact that the previous key format isn’t compatible with their new authentication system.  For new users, a single key should suffice.  Regardless, a key that can support OCLC’s Search functionality is required to support MarcEdit’s ability to query the WorldCat database.

Likewise, MarcEdit needs a key that can utilize the Metadata API.  Again, for legacy API users, this will likely be a separate key – for new users – the search and metadata keys should be the same.  This key has two parts – there is the Key and a Secret key.  You need both to make requests.  Likewise, there are five other values that users need to know to utilize the Metadata API services: a principalID, a principalIDNS, a schema, a holdings symbol and a 4 character holdings code.  All of these values need to be available in order to work with the various functions provided by the API.
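
To make that list of required values concrete, here is a rough Ruby sketch of a completeness check over them. The struct, its field names, and the config_problems helper are hypothetical and only illustrate the idea; this is not how MarcEdit stores or validates its settings, and a passing check says nothing about whether OCLC will accept the keys.

```ruby
# Hypothetical container for the values described above; the field names
# are illustrative and do not mirror MarcEdit's internal settings.
MetadataApiConfig = Struct.new(:key, :secret, :principal_id, :principal_idns,
                               :schema, :holdings_symbol, :holdings_code)

# Return a list of human-readable problems; an empty list means the
# configuration looks complete enough to attempt a request.
def config_problems(config)
  problems = config.each_pair
                   .select { |_, value| value.to_s.strip.empty? }
                   .map { |name, _| "#{name} is missing" }
  code = config.holdings_code.to_s.strip
  problems << 'holdings_code should be 4 characters' unless code.empty? || code.length == 4
  problems
end

config = MetadataApiConfig.new('my-key', 'my-secret', 'my-id', 'my-idns',
                               'LibraryOfCongress', 'ABC', 'MAIN')
puts config_problems(config).inspect
```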

Once you have these keys and supplemental information, you need to enter them into MarcEdit.  Users will be able to see the menu entries for many of the OCLC functions, but until a user has entered their OCLC key data into the MarcEdit preferences and validated the data – these functions will remain disabled.

In the Preferences area, there is a new tab that has been set up to store data related to OCLC’s Metadata services.


From this screen, users can enter all the information that they will need in order to utilize the various Metadata and Search API functions.  MarcEdit will store this information in its configuration file, but does encrypt the data using a methodology that would make it impractical to reconstruct the key data.

After a user enters their data – the user should click Apply, and then Validate.  The Validate process is what enables the OCLC API integration functions.  MarcEdit performs a couple of API operations on an existing test record to determine if the information that the user provided will support the required functionality.  As MarcEdit validates a key, it will turn the textbox either green (for validated) or red (to indicate a problem).

Example of a problem with a key

Example of all keys validated

Once the data has been validated – the OCLC Menu Items found in the Main Window and on the MarcEditor will be enabled.

Main Window – OCLC Functions

MarcEditor OCLC Functions

Using the OCLC API Functions

Hopefully, the functions are fairly straightforward to use and look identical regardless of where you interact with them within the application.  At present, MarcEdit provides a function for setting holdings and working with bibliographic data.

Setting Holdings

MarcEdit’s Holdings Tool

MarcEdit provides a straightforward tool for updating a user’s holdings for a batch of records.  MarcEdit’s holdings tool can process either MARC data, MarcEdit’s mnemonic data or a plain text file of OCLC numbers.  The tool does exactly what it says.  Using the information set in the Preferences, MarcEdit will pre-populate your Institution Code and Classification Schema.  Users then need to select an action.  MarcEdit’s batch tool can update or delete holdings, but will perform just one of those actions on all records within a designated file.  That means, you cannot upload a file that contains records that need to have holdings added and deleted.  The tool requires that these actions are isolated into discrete operations.
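
For the plain text input case, the preparation step might look something like the Ruby sketch below. It is only an illustration under my own assumptions: the helper names and the job hashes are hypothetical, the prefix handling covers just the common ocm/ocn/on and (OCoLC) forms, and nothing here reflects MarcEdit's actual implementation.

```ruby
# Strip the common prefixes (ocm, ocn, on, "(OCoLC)") and leading zeros so
# each line is reduced to a bare OCLC number; lines without one are dropped.
def normalize_oclc_numbers(text)
  text.each_line
      .map { |line| line.strip.sub(/\A(?:\(OCoLC\))?\s*(?:ocm|ocn|on)?/i, '') }
      .map { |value| value.sub(/\A0+/, '') }
      .select { |value| value.match?(/\A\d+\z/) }
      .uniq
end

# Hypothetical job list: one action applied to every number, mirroring the
# one-action-per-file rule of the batch tool.
def holdings_batch(text, action)
  raise ArgumentError, 'action must be :update or :delete' unless %i[update delete].include?(action)

  normalize_oclc_numbers(text).map { |number| { oclc_number: number, action: action } }
end

sample = "ocm01234567\n(OCoLC)338544583\n\n338544583\nnot a number\n"
holdings_batch(sample, :update).each { |job| puts job.inspect }
```

Rejecting anything other than a single action up front keeps the batch consistent with the rule that adds and deletes must be run as separate operations.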

Update/Creating Bibliographic Data

MarcEdit’s Bibliographic Tool

When working with MarcEdit’s bibliographic updating/creation tool for WorldCat, users can work with files that contain both new and updated records.  OCLC’s system and MarcEdit evaluate the records within a file and, based on the presence or absence of an OCLC number in a record, determine whether a bibliographic record is to be created or updated.  Of course, it’s not that simple – all bibliographic records passed through the API must be validated on the OCLC side, and at this point, that validation only occurs when the data is submitted for update or creation – a weakness that I hope at some point will be rectified since this causes a number of issues when working in a batch record environment.
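
The create-or-update decision can be sketched in a few lines of Ruby. This is a simplified illustration, not MarcEdit's code: records are modeled as plain hashes, and I am assuming the OCLC number is carried in an 035 value prefixed with (OCoLC), which is only one of the places it can live.

```ruby
# Records are modeled as simple hashes of tag => array of values; this is an
# illustrative stand-in, not MarcEdit's or the API's record representation.
# Assumption: the OCLC number sits in an 035 value prefixed with "(OCoLC)".
def oclc_number(record)
  value = Array(record['035']).find { |v| v.start_with?('(OCoLC)') }
  value && value.sub('(OCoLC)', '').strip
end

# Split a batch into records to create (no OCLC number) and update (has one).
def plan_batch(records)
  to_update, to_create = records.partition { |record| oclc_number(record) }
  { create: to_create, update: to_update }
end

batch = [
  { '035' => ['(OCoLC)338544583'], '245' => ['Existing record'] },
  { '245' => ['Brand new record'] }
]
plan = plan_batch(batch)
puts "create: #{plan[:create].length}, update: #{plan[:update].length}"
```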

Working within the MarcEditor

When working within the MarcEditor, the tools for working with Holdings and Bibliographic data are identical to outside.  The main difference is that the data being acted upon is the data in the MarcEditor.  This means that the MarcEditor needs to include one additional function, and that is the ability to retrieve data from WorldCat.  MarcEdit has the ability to search and download a record set from WorldCat for editing.

Searching WorldCat from Within the MarcEditor

The search tool provides a handy way for users to quickly retrieve data from WorldCat for edit.  However, this too underlines one of the glaring issues with the API – especially around record editing.  Traditionally, when catalogers work on a record, they can lock the record for editing.  This prevents other users from changing the record while they are working on it, and protects the transaction information in the 005 which OCLC requires for updating purposes.  When working with the API, there appears to be no way to lock a record.  This means that records can be downloaded, edited and then on upload, be invalid due to someone else making an edit that updates the 005 information.  And since OCLC doesn’t provide a method for validating information prior to upload, users won’t know that this problem has occurred until an upload is attempted.  Again, in order to make the API functionally equivalent to OCLC’s other editing services, there needs to be a way for catalogers to lock records for editing.
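
Short of real locking, one partial workaround a client could try is to re-read the record just before upload and compare its 005 with the one captured at download time. The Ruby sketch below shows only that comparison; the function name and status symbols are hypothetical, and the check narrows the window for a conflicting edit without closing it.

```ruby
# Compare the 005 captured at download time with the 005 on a freshly
# re-read copy of the record. The 005 is a fixed-width yyyymmddhhmmss.f
# string, so plain string comparison also orders the values chronologically.
def edit_status(downloaded_005, current_005)
  return :in_sync if downloaded_005 == current_005

  current_005 > downloaded_005 ? :changed_since_download : :local_copy_is_newer
end

puts edit_status('20131105133200.0', '20131105133200.0')
puts edit_status('20131105133200.0', '20131106090000.0')
```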

Where we go from here

This is really hard to say.  I believe that OCLC sees the release of the Metadata APIs as the first of a much larger portfolio of API services to provide programmatic support for their WorldShare endeavors.  So, it will be interesting to see how these develop.  At the same time, I’m really interested in seeing if the organization puts the type of resources behind the development of their API Services Portfolio needed to provide really meaningful services that librarians, developers, and third-party library developers can work with, consume, or use to augment the wide range of data streams that they have available.  On the one hand, I’m hopeful that the release of the Metadata API signals a wider commitment to providing transparent access to a wide variety of services.  At the same time, the lack of development around the OCLC search API – which has seen only incremental changes since it was first released some 5 or so years ago – makes me wonder how much support within the larger organization exists around this type of work.  So I guess that is a question that remains to be answered.



 Posted by at 1:32 pm

Can libraries really be effective advocates for open data?

 OCLC, Open Data
Dec 23, 2011

This is one of those questions that I ponder every now and again, because I wonder how effective libraries really can be as open data advocates when our current practice demonstrates that we don’t fully believe in the concept.  Well, I should qualify that – we have no problem believing that other people have a moral obligation to make their research and data open to the world using the most permissive (CC0) licenses available, but we have an extremely difficult time doing the same.  That’s right, my name is Terry Reese and I’m a hypocrite, I mean librarian.

There are a lot of places where libraries could be doing much better in terms of how we manage our own “research assets”.  This includes how we manage the release and reuse of metadata and digital objects within our special and archival collections to how we manage really mundane information like bibliographic data found in library catalogs.  In a sense, this is our research data, and as a group, libraries continue to tell other communities that for the good of libraries, this data cannot be openly shared. 

This question of how committed the library community is to open data came up again this week.  The National Library of Sweden recently announced (http://www.kb.se/english/about/news/No-deal-with-OCLC/) that they were ending negotiations with OCLC regarding the use and reuse of WorldCat derived data within their national catalog.  The two organizations simply couldn’t come to an agreement due to the restrictions placed on the sharing and redistribution of WorldCat derived MARC data.  Essentially, the National Library of Sweden, its participant members and Europeana (the European Library) feel that library bibliographic data wants to be free.  OCLC (and to a large degree, many within its membership) disagree.  Hence the impasse, and the source of my dilemma.

Within the U.S., it is nearly impossible to run a research library and not be a member of the OCLC cooperative.  And that’s not necessarily a bad thing.  OCLC is doing some great things.  Just this month, they released FAST (http://www.oclc.org/research/news/2011-12-14.htm) as linked data and they announced the WorldShare Platform (http://oclc.org/developer/platform).  Additionally, they continue to advocate for libraries and use their position within the library community to focus RLG on research pertinent to the library community (http://www.oclc.org/research/publications/reports.htm).  Heck, I doubt most ILS vendors would be putting so many resources into network-based ILS systems had OCLC not lit a fire under them with their WMS platform development.  And yet, for all OCLC is and has done for libraries, the WorldCat Rights and Responsibilities Statement (http://www.oclc.org/worldcat/recorduse/policy/default.htm) continues to put OCLC at odds with libraries and their missions.  For libraries that want to advocate for open data, OCLC’s position on data reuse is unfortunately making OCLC much more of a hindrance than a willing partner.

Which brings me back to my original question.  Can libraries really be effective advocates for open data, when we consider our own inability to make our most basic “research” data openly available?  I think that we can certainly try…we’ve always been a community that advocates pretty passionately for knowing what’s best for other people’s data.  And who knows, maybe if we advocate for open data long enough, we might begin to believe in it ourselves and become participants (rather than cheerleaders) in the open data community.


 Posted by at 10:17 pm
Feb 09, 2011

During Code4Lib, I spent a little time playing with OCLC’s Classify service (http://oclc.org/developer/services/classify).  I’ve been working on adding a couple of functions into MarcEdit that will allow folks to leverage some of the OCLC web services.   Using OCLC’s Classify, I’ve been working on an experimental tool that would allow you to do batch classification of records based on OCLC’s classification.  Below, you will find the class and below that, you can find how this might look in MarcEdit.

OCLC Classify Class

class oclc_api_classify
{
    public enum Class_Type { lcc = 0, dd = 1 }
    private string pLastError = "";

    public string LastError
    {
        get { return pLastError; }
        set { pLastError = value; }
    }

    // Fetch the raw response body for the given Classify request URL.
    private string MakeHTTPRequest(string url)
    {
        System.Net.HttpWebRequest wr;
        System.IO.Stream objStream = null;
        try
        {
            wr = (System.Net.HttpWebRequest)System.Net.HttpWebRequest.Create(url);
            wr.UserAgent = "MarcEdit 5.2";

            System.Net.WebResponse response = wr.GetResponse();
            objStream = response.GetResponseStream();
            System.IO.StreamReader reader = new System.IO.StreamReader(objStream);
            string sresults = reader.ReadToEnd();
            reader.Close();
            return sresults;
        }
        catch (System.Exception e)
        {
            // On failure the exception text is returned; it will not parse as
            // XML, so ProcessClassify ends up returning an empty string.
            return e.ToString();
        }
    }

    public string DoSearch(string url, oclc_api_classify.Class_Type ctype)
    {
        string sreturn = MakeHTTPRequest(url);
        //process results
        return ProcessClassify(sreturn, ctype);
    }

    // Return the "nsfa" attribute of the most popular Dewey or LC class,
    // or an empty string when nothing is found.
    private string ProcessClassify(string s, oclc_api_classify.Class_Type ctype)
    {
        System.Xml.XmlDocument objXML = new System.Xml.XmlDocument();
        System.Xml.XmlNamespaceManager oMan = new System.Xml.XmlNamespaceManager(objXML.NameTable);
        oMan.AddNamespace("oclc", "http://classify.oclc.org");

        System.Xml.XmlNodeList objList;

        try
        {
            // The response has to be loaded before it can be queried; this
            // call does not appear in the listing as originally posted.
            objXML.LoadXml(s);

            if (ctype == Class_Type.dd)
            {
                objList = objXML.SelectNodes("//oclc:ddc/oclc:mostPopular", oMan);
            }
            else
            {
                objList = objXML.SelectNodes("//oclc:lcc/oclc:mostPopular", oMan);
            }

            if (objList != null)
            {
                foreach (System.Xml.XmlNode objN in objList)
                {
                    if (objN.Attributes["nsfa"] != null)
                    {
                        return objN.Attributes["nsfa"].InnerText;
                    }
                }
            }
            return "";
        }
        catch (System.Exception e)
        {
            LastError = e.ToString();
            return "";
        }
    }
}
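
To make the XPath lookup a little more concrete, here is a rough Ruby sketch of the same extraction run against a hand-written sample. The sample document is an assumption shaped after the XPath used in the class (real Classify responses carry more elements and attributes), the class numbers in it are made up, and most_popular_class is a hypothetical helper name.

```ruby
require 'rexml/document'

# Hand-written sample shaped after the XPath used in the class above; the
# values are made up and real responses contain more than is shown here.
CLASSIFY_SAMPLE = <<~XML
  <classify xmlns="http://classify.oclc.org">
    <recommendations>
      <ddc><mostPopular holdings="10" nsfa="025.3" sfa="025.3"/></ddc>
      <lcc><mostPopular holdings="8" nsfa="Z699" sfa="Z699"/></lcc>
    </recommendations>
  </classify>
XML

# Prefix-to-URI mapping used by the XPath query below.
CLASSIFY_NS = { 'oclc' => 'http://classify.oclc.org' }

# Return the nsfa attribute of the most popular class for :ddc or :lcc,
# or an empty string when nothing matches or the XML cannot be parsed.
def most_popular_class(xml, scheme)
  doc = REXML::Document.new(xml)
  node = REXML::XPath.first(doc, "//oclc:#{scheme}/oclc:mostPopular", CLASSIFY_NS)
  node ? node.attributes['nsfa'].to_s : ''
rescue REXML::ParseException
  ''
end

puts most_popular_class(CLASSIFY_SAMPLE, :ddc)
puts most_popular_class(CLASSIFY_SAMPLE, :lcc)
```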


In MarcEdit:

In MarcEdit, I’ve added an experimental dialog that does the batch classification search.  Here’s an example:


As you can see, I’m envisioning the following options:

  1. Ability to process Individual Records
  2. Ability to process All Records in a file
  3. Ability to select suggested Dewey or Library of Congress work
  4. Ability to insert data only if call numbers are not already present
  5. Ability to determine what field the call number should be inserted into.
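
The last two options could work roughly as in this Ruby sketch. It is an illustration under my own assumptions rather than the MarcEdit implementation: records are plain hashes, the helper name is hypothetical, and I am treating 050/090 as the Library of Congress call number fields and 082/092 as the Dewey ones.

```ruby
# Fields that commonly carry call numbers: 050/090 for Library of Congress,
# 082/092 for Dewey. Records are plain hashes of tag => array of values.
CALL_NUMBER_TAGS = { lcc: %w[050 090], ddc: %w[082 092] }

# Add the suggested call number to target_tag. With only_if_missing set, the
# record is left untouched when it already has a call number for that scheme.
def apply_call_number(record, scheme, suggestion, target_tag, only_if_missing: true)
  has_one = CALL_NUMBER_TAGS.fetch(scheme).any? { |tag| !Array(record[tag]).empty? }
  return record if suggestion.to_s.empty? || (only_if_missing && has_one)

  record.merge(target_tag => Array(record[target_tag]) + [suggestion])
end

record = { '245' => ['Sample title'] }
puts apply_call_number(record, :lcc, 'Z699', '090').inspect
```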

I’m thinking this will show up in the next MarcEdit update, so long as OCLC’s terms and conditions end up being something that I (and they) can live with.


 Posted by at 1:03 pm
Aug 16, 2010

So this is what it has come to…two iconic (III and OCLC that is) library software vendors wrestling in court.  Not that I think this is all that surprising…too bad…but not particularly surprising.  In fact, I think that many folks in the library community probably saw this coming.  For me, the canary in the mine, so to speak, has been the OCLC record use policy revision process.  Starting with the first attempt in Nov. 2008 which plain sucked and finishing with the present revision, which pretty much guaranteed a lawsuit, the records use policy has been an interesting illustration of what is both working and broken with OCLC.  Why – because the record use policy linked the use of the records to OCLC’s WorldCat service.  The policy, as it stands, doesn’t include just a list of rights and responsibilities, but spells out why it is needed…to protect the WorldCat database.  My personal opinion, as written here, is that any record use policy should have been written separate from WorldCat.  Joining the two together is problematic, and this lawsuit demonstrates how.

As more and more time passes, I’m convinced that OCLC, as it exists today, is of two minds.  There is the membership mind and the vendor mind.  The problem that libraries and library vendors face, is that in many circumstances, the vendor side of OCLC is unduly influencing the membership side of the organization.  I think that the final record use policy is a good example of this, as OCLC placed a number of artificial walls around the WorldCat database – walls that really do nothing but protect OCLC’s web scale initiatives by putting up artificial barriers.

This is one of the reasons why I had suggested the need for OCLC to consider breaking itself up (http://blog.reeset.net/archives/579) back in Nov. 2008.  The problem, as I saw it then, is that as long as OCLC continues to move and compete within the vendor space, this tension between itself as a membership-governed coop and as a vendor organization will be present.  It will become increasingly difficult for OCLC to make decisions solely in respect to the membership’s long-term needs, if those decisions must also be measured against the products and services that OCLC’s vendor operations handle.  Regardless of how this lawsuit with III is resolved, I don’t think that it will be the end of the litigation.  Over the past year, I’ve encountered too many vendors that are becoming more open with their feelings that OCLC is unfairly hiding behind their status as a tax-free organization to monopolize the library space.  If OCLC can demonstrate that its web scale services will work for larger ACRL libraries, I think that more and more vendors will continue to push back.  Vendors will continue to make their displeasure known and the membership is going to have to ask itself how many lawsuits it’s willing to endure.

I still believe that the simplest solution to this, however, would be breaking up the OCLC organization.  I think it would ultimately be good for OCLC and libraries.  It would allow OCLC’s vendor units to compete without having to worry about potential lawsuits and it would give OCLC’s research arms the ability to do research without the ever constant need to productize their work.  And there are actually a number of precedents for this type of thinking.  Universities, for one, do this all the time.  Publicly funded research or development is commercialized under umbrella organizations.  Why couldn’t OCLC do the same?

In some ways, I see this lawsuit mirroring the current discussions related to net neutrality.  Should all information be treated as equal on the web?  In a sense, that’s what III is asking OCLC to provide, a kind of net neutrality for libraries.  The idea is that WorldCat represents a core information pipe within the library community, and should be made open to the entire library community (both vendor and non-vendor) for a reasonable fee.  Libraries currently pay a fee for access (our memberships) – likewise, vendors could pay a reasonable fee to access and develop against the WorldCat bibliographic and holdings database.

To take this further, when considering net neutrality, I believe that the library community would be nearly unanimous in supporting this as a necessary requirement for the future.  It’s a problem that we feel we have a stake in since librarians ultimately trade in information.  Likewise, I think that libraries will need to decide what type of organization they want OCLC to be.  As a membership organization, libraries still have some power to shape the long-term vision of the OCLC cooperative.  So I wonder when librarians will start to push OCLC to embrace the same tenets of open and fair access to data that we currently demand from other vendors/information providers.  OCLC is ours, and how we handle this resource will ultimately determine how others view the library community as a whole.  In the end, will we push OCLC to reflect the community’s larger vision and reflect our professional ethics in respect to open data and cooperation or will the larger information community simply look at the library community as a bunch of hypocrites – hypocrites that demand open data from other communities but aren’t willing to reciprocate when the data is our own?


 Posted by at 7:00 pm
Apr 08, 2010

So yesterday, a number of messages were sent out letting folks know that the OCLC Record Use Policy Council had completed their work and provided a draft document to the OCLC member community for comment.  Sadly, I’m in the middle of doing other work (travelling mostly) – but I have had an opportunity to look over this draft and have passed on a handful of comments to OCLC and our OCLC representatives.  I thought I’d also provide a general gist of some of my comments to folks here since I’ve been asked privately for my impressions.

So let’s start with the good.  I’ll admit that I was very much not a fan of the first draft released over the fall.  For my tastes, the first draft took something that was very much an informal policy statement that had served OCLC and its membership for many years, and turned it into a legal license – and did so in a way that I personally felt poisoned long-term innovation for libraries looking to work independent of OCLC.  Knowing a number of folks involved in writing the first draft of this policy, I know that wasn’t the intent but the first draft had problems…problems with the language and problems with the way in which it was created.  And to OCLC’s credit, they may not have ultimately agreed with the membership, but they did ultimately decide to withdraw that draft and make another attempt at updating the current guidelines.  And I think that they do deserve some kudos for admitting that they may not have gotten it right…*AND* providing a mechanism for comments to help guide the future effort.  I can respect that.

Looking over the new draft of the record use policy, there are a number of things that I like.  First, the document is written to a very narrow audience, OCLC members.  While the implications of the policy reach beyond the membership community, the Record Use Policy Council did do a good job defining the audience and speaking to that audience.  Secondly, the document is much easier on the eyes.  Not perfect (more below), but easier to read and more importantly, is being presented as a set of best practices and guidelines.  While I know that the spirit of the document is to provide a policy that will govern record use – my take is that the document is fuzzy enough that appropriate use is left, in large measure, up to the individual organizations.  Does this allow libraries to give records to 3rd party groups like Open Library or simply post their metadata, en-mass, to an ftp server for download?  If you believe that it does, I think so.  FAQ’s 3 and 6 try to address these questions and in answering, essentially boil the answer down to judgment.  I’m fairly certain that given the above example, OCLC would prefer Open Library to enter into an agreement with OCLC and I’m also fairly certain that just posting a continually updated version of your catalog’s bibliographic data for anonymous download via FTP may not be completely in the spirit of this revised Policy – but I think that it’s fuzzy enough that so long as an organization feels that they are within their rights to do so, they could certainly argue that they are.  And of course, I bring both of these examples up because, indeed, there are grass roots efforts at many organizations doing both.

What else do I like?  I like that OCLC is looking for comment.  They’ve provided a public forum for comment, an email address to send comments directly to the Records Use Council, encouraged members to contact their representatives on the user council.  There seems to be a real effort here to collect feedback and take the pulse of the community.  Now, one thing that seems less clear, is how this comment period will impact this draft.  The first attempt at a policy in fall ‘09 was so bad and caused so much consternation within the member community that withdrawing the draft really was the only option.  However, what’s unclear to me is how comments and feedback from community members will be utilized in updating this draft and really, what impact that they will have.  My guess is that most will read this document and figure that its good enough (not perfect), but given its much softer language be easier to agree to.  So given that, how effective will the comment period be and what will be the process used to ensure that this document can be fixed should situations arise down the road that call for it.

Finally, I like that OCLC has defined how they view WorldCat because it lets me see where they are coming from.  In a previous post, I’d described OCLC’s record stewardship and WorldCat as a part of the public commons.  And that’s my sincere belief.  Libraries (both members and non-members) have contributed bibliographic data to the cooperative and I think that how libraries utilize data extracted from the cooperative should be restriction free.  OCLC is free to license access to the collective database as a whole, but individually, once libraries have extracted that data, it ceases to belong to the cooperative.  However, this policy clarifies how OCLC views the cooperative and the records within by spelling out their understanding of public goods and club goods, and asserting that OCLC’s role, WorldCat and the bibliographic data entrusted to it are the latter.  I may not agree with them, but it certainly helps when framing the discussion.

So that’s what I see as the good, and if I’ve understated it, I do think that this draft has moved in a much more positive direction.  So, how do I think that it could be improved?  Well, I’m going to give you two things because I think that they are both important.

  1. While the document has become much easier to read, it’s still way too long.  We are talking about sharing bibliographic and holdings data, not brain surgery.  Having recently spent time working with OCLC’s Office of Research, I know that OCLC can write very concise, easy to understand terms of use documents.  That’s what I think we needed here.  The OCLC Record Use Policy Council was asked to provide a record use policy – what they gave us was an opus on member responsibility and rights.  I can tell that a lot of thought and work went into the document, but in the end, I really wish that some of the focus and brevity that we find from the Office of Research had made its way into this document.
  2. In some ways, I’m still very uncomfortable with this policy statement, and I think it comes down to the tone and focus of the policy itself.  And I think my uneasiness can be summed up in the first objective of the draft: “Maximize the utility and viability of WorldCat, enable and facilitate innovation based on WorldCat, and support the substantial network of relationships between OCLC members and their partners in today’s information landscape, while sustaining the integrity of the WorldCat database and its economic underpinnings”

    I may be the only one, but I think that this Records Use Policy should be separated from WorldCat.  I think a more appropriate objective (and class of objectives) would have been:  to promote and facilitate research, discovery, innovation and growth of the OCLC membership community.  Really, WorldCat is a brand; it’s the current product OCLC uses to curate the bibliographic and holdings data entrusted to it by the member community.  While I agree that OCLC needs to be mindful of the economic viability of the WorldCat database, I don’t believe the policy document created to govern data use and reuse should join the two together.  As a colleague said to me recently, the policy should focus on the core issues of bibliographic and holdings data use and reuse – not the viability of a specific product.  I think shifting the policy from a WorldCat focus to a community focused model would completely change the tone of the document and go a long way to solidify, in my mind, that OCLC is indeed working on a policy document that places the needs and welfare of the community above a specific product or product line.  I realize that it may not be fair, but that is the position that OCLC has placed itself in, living in both worlds: that of a trusted library cooperative and friend to the library community, and that of a pseudo-vendor competing directly against other library vendors that serve both OCLC members and non-members alike.

Anyway, those are my initial thoughts.  I’ll be giving this document a closer read and will continue to provide my thoughts and comments to OCLC and our representatives – and I encourage everyone to do so as well.



 Posted by at 6:19 pm

WorldCat API gem moved to rubyforge

 OCLC, rails, ruby
Mar 132009

I had mentioned that I’d quickly developed a helper gem for simplifying using the WorldCat API in ruby (at least, it greatly simplifies using the API for my needs).  This was created in part for C4L and partly because I’m moving access from Z39.50 to the API when working with LF, and I basically wanted to be able to do the search and interact with the results as a set of objects (rather than XML).

Anyway, the first version of the gem (which is slightly different from the code first posted prior to C4L) can be found at the project page, here: http://rubyforge.org/projects/wcapi/


Feb 222009

Since the WorldCat Grid API became available, I’ve spent odd periods of time playing with the services to see what kinds of things I could do and not do with the information being made available.  This was partly due to a desire to create an open source version of WorldCat Local.  It wouldn’t have been functionally complete (lacking facets, some data, etc), but it would be good enough, most likely, for folks wanting to switch to a catalog interface that utilized WorldCat as their source database, rather than their local ILS dbs.  And while the project was pretty much completed, the present service terms that are attached to the API make this type of work out of bounds at present (though, I knew that when I started – it was more experimental with the hope that by the time I was finished, the API would be more open).

Anyway…as part of the process, I started working on some rudimentary components for ruby to make processing the OCLC provided data easier.  Essentially, I created a handful of convenience functions that would make it much easier to call the grid services and utilize the data they provide – my initial component was called wcapi.
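To give a sense of what a convenience function of this kind does, here is a minimal sketch. It is not the gem’s actual code: the helper name records_from_atom, the hash keys, and the sample feed (including its example.org URL) are all made up for illustration, and it uses REXML only.

```ruby
require 'rexml/document'

# Hypothetical helper (not part of wcapi): turn the <entry> elements of an
# Atom feed into plain hashes so callers never have to touch the XML.
def records_from_atom(xml)
  ns  = { 'atom' => 'http://www.w3.org/2005/Atom' }
  doc = REXML::Document.new(xml)

  REXML::XPath.match(doc, '//atom:entry', ns).map do |entry|
    link = REXML::XPath.first(entry, 'atom:link', ns)
    {
      :title   => REXML::XPath.first(entry, 'atom:title', ns)&.text,
      :id      => REXML::XPath.first(entry, 'atom:id', ns)&.text,
      :summary => REXML::XPath.first(entry, 'atom:summary', ns)&.text,
      :link    => link && link.attributes['href']
    }
  end
end

# Invented sample data, standing in for a real API response.
sample = <<~XML
  <feed xmlns="http://www.w3.org/2005/Atom">
    <entry>
      <title>The Civil War</title>
      <link href="http://example.org/record/1"/>
      <id>http://example.org/record/1</id>
      <summary>A made-up summary.</summary>
    </entry>
  </feed>
XML

records = records_from_atom(sample)
records.each { |rec| puts "Title: #{rec[:title]} (#{rec[:link]})" }
```

Returning an array of hashes keeps the calling code free of XPath and namespaces, which is the whole point of a wrapper like this.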

Tomorrow (or this morning if you are in Providence), a few folks from OCLC will be putting on a preconference related to the OCLC API at Code4Lib.  There, folks from OCLC will be giving some background and insight into the development of the API, as well as looking at how the API works and some sample code snippets.  I’d hoped to be there (I’m signed up), but an airline snafu has me stranded in transit until tomorrow – so by the time I get to Providence, all or most of the pre-conference will be over.  Too bad, I know.  Fortunately, there will be someone else from Oregon State in the audience, so I should be able to get some notes.

However, there’s no reason why I can’t participate, even while I’m jetting towards Providence – and so I submit my wcapi gem/code to the preconference folks to use however they see fit.  It doesn’t have a rakefile or docs, but once you install the gem, there is a test.rb file in the wcapi folder that demonstrates how to use all the functions.  The wrapper functions accept the same values as the API – and like the API, you need to have your OCLC developer’s key to make it work.  The wcapi gem will utilize libxml for most xml processing (if it’s on your machine, otherwise it will downgrade to rexml), save for a handful of functions that don’t use named namespaces within the resultset (this confuses libxml for some reason – though there is a way around it, I just didn’t have the time to add it).
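The libxml-or-rexml fallback can be sketched roughly like this. Again, this is an illustration under my own assumptions rather than the code in the gem: titles_from is a hypothetical helper, the sample XML is invented, and it deliberately has no namespaces so that both parsers handle it the same way.

```ruby
# Prefer libxml-ruby when it is installed; otherwise fall back to REXML.
begin
  require 'libxml'
rescue LoadError
  require 'rexml/document'
end

# Hypothetical helper: return the text of every <title> element,
# using whichever parser was loaded above.
def titles_from(xml)
  if defined?(LibXML)
    doc = LibXML::XML::Parser.string(xml).parse
    doc.find('//title').map { |node| node.content }
  else
    doc = REXML::Document.new(xml)
    REXML::XPath.match(doc, '//title').map { |node| node.text }
  end
end

# Invented sample data.
sample = '<feed><entry><title>Civil War</title></entry>' \
         '<entry><title>Battle Cry of Freedom</title></entry></feed>'

titles = titles_from(sample)
puts "Parser: #{defined?(LibXML) ? 'libxml' : 'rexml'}"
puts titles
```

Checking for the parser once at load time and hiding the difference behind a single helper means the rest of the code never needs to know which library is present.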

So, the gem/code can be found at: wcapi.tar.gz.  Since I’ll likely make use of this in the future, I’ll probably move it to source forge before long.

*******Example OpenSearch Call*******

require 'rubygems'
require 'wcapi'

# Create a client with your own OCLC developer key
client = WCAPI::Client.new :wskey => '[insert_your_own_key_here]'

# Search for "civil war": Atom results, 25 records starting at 1, MLA citations
response = client.OpenSearch(:q => 'civil war', :format => 'atom', :start => '1', :count => '25', :cformat => 'mla')

puts "Total Results: " + response.header["totalResults"]
# Each record is a hash keyed by symbol
response.records.each {|rec|
  puts "Title: " + rec[:title] + "\n"
  puts "URL: " + rec[:link] + "\n"
  puts "OCLC #: " + rec[:id] + "\n"
  puts "Description: " + rec[:summary] + "\n"
  puts "citation: " + rec[:citation] + "\n"
}

********End Sample Call****************

And that’s it.  Pretty straightforward.  So, I hope that everyone has a great pre-conference.  I wish that I was there (and with some luck, I may catch the 2nd half) – but if not, I’ll see everyone in Providence Monday evening.


Jan 132009

From the OCLC news release: http://www.oclc.org/us/en/news/releases/20092.htm

First off, I think that this is a good thing for a couple of reasons.

  1. I take this development as a sign that OCLC has heard the membership, heard some of their concerns and has agreed to take some additional time to have further discussion with the membership.  For that, thank you.
  2. It gives the OCLC membership time to look at the presently proposed policy and consider if this policy reflects the direction that the membership wishes to take the library profession.

So, where do we go from here?  Looking over the news release, it appears that the proposed policy will be the base-line for discussion – the starting point.  So it will be up to the member council and identified library experts to look at this policy and consider how it can best be modified to meet the stated goals of modernizing the record use and transfer policy.  While I very much doubt that I will be one of the folks asked to the table (though, personally, I would like to be), I still have hopes that this process could yield a much more open and friendly record transfer policy.  I’ve discussed this with some folks from the members council, but I honestly believe that OCLC and WorldCat would see the greatest benefit if it simply let WorldCat go – releasing the data through an Open Data license.  I realize that for some, there is a fear that this might devalue OCLC in some way – but I personally doubt it.  As a membership organization, OCLC is able to leverage its large repository of holdings information – which will continue to allow them to build new and interesting services for libraries – whether WorldCat is available for download or not.  I.e., for OCLC members, WorldCat likely isn’t the primary reason why they are members – so I have a difficult time seeing how loosening the reins on WorldCat could be problematic.

Anyway, the announcement is good news in that it will buy the OCLC membership (and library community) time to more fully consider what the current proposed transfer policy would mean to libraries and hopefully help shape a document that is much more in line with the library community’s core values related to the access and sharing of information.  And hopefully, as this policy is considered, the membership council does keep the library community (not just membership) in mind when considering this decision so that any provisions placed on the transfer of records from WorldCat don’t poison the transfer of records downstream for other members (both commercial and non-commercial) of the library community.


 Posted by at 4:43 pm