Dec 23 2011
 

This is one of those questions that I ponder every now and again, because I wonder how effective libraries really can be as open data advocates when our current practice demonstrates that we don’t fully believe in the concept.  Well, I should qualify that – we have no problem believing that other people have a moral obligation to make their research and data open to the world using the most permissive (CC0) licenses available, but we have an extremely difficult time doing the same ourselves.  That’s right, my name is Terry Reese and I’m a hypocrite, I mean librarian.

There are a lot of places where libraries could be doing much better in terms of how we manage our own “research assets”.  This ranges from how we manage the release and reuse of metadata and digital objects within our special and archival collections to how we manage really mundane information like the bibliographic data found in library catalogs.  In a sense, this is our research data, and as a group, libraries continue to tell other communities that for the good of libraries, this data cannot be openly shared. 

This question of how committed the library community is to open data came up again this week.  The National Library of Sweden recently announced (http://www.kb.se/english/about/news/No-deal-with-OCLC/) that they were ending negotiations with OCLC regarding the use and reuse of WorldCat derived data within their national catalog.  The two organizations simply couldn’t come to an agreement due to the restrictions placed on the sharing and redistribution of WorldCat derived MARC data.  Essentially, the National Library of Sweden, its participating members and Europeana (the European Library) feel that library bibliographic data wants to be free.  OCLC (and to a large degree, many within its membership) disagrees.  Hence the impasse, and the source of my dilemma. 

Within the U.S., it is nearly impossible to run a research library and not be a member of the OCLC cooperative.  And that’s not necessarily a bad thing.  OCLC is doing some great things.  Just this month, they released FAST (http://www.oclc.org/research/news/2011-12-14.htm) as linked data and they announced the WorldShare Platform (http://oclc.org/developer/platform).  Additionally, they continue to advocate for libraries and use their position within the library community to focus RLG on research pertinent to the library community (http://www.oclc.org/research/publications/reports.htm).  Heck, I doubt most ILS vendors would be putting so many resources into network-based ILS systems had OCLC not lit a fire under them with their WMS platform development.  And yet, for all that OCLC is and has done for libraries, the WorldCat Rights and Responsibilities Statement (http://www.oclc.org/worldcat/recorduse/policy/default.htm) continues to put OCLC at odds with libraries and their missions.  For libraries that want to advocate for open data, OCLC’s position on data reuse is unfortunately making OCLC much more of a hindrance than a willing partner.

Which brings me back to my original question.  Can libraries really be effective advocates for open data when we consider our own inability to make our most basic “research” data openly available?  I think that we can certainly try…we’ve always been a community that advocates pretty passionately for knowing what’s best for other people’s data.  And who knows, maybe if we advocate for open data long enough, we might begin to believe in it ourselves and become participants (rather than cheerleaders) in the open data community.

–TR

 Posted at 10:17 pm
Feb 09 2011
 

During Code4Lib, I spent a little time playing with OCLC’s Classify service (http://oclc.org/developer/services/classify).  I’ve been working on adding a couple of functions into MarcEdit that will allow folks to leverage some of the OCLC web services.   Using OCLC’s Classify, I’ve been working on an experimental tool that would allow you to do batch classification of records based on OCLC’s classification.  Below, you will find the class and below that, you can find how this might look in MarcEdit.

OCLC Classify Class

 using System;

 class oclc_api_classify
    {
        // Which classification scheme to pull from the Classify response.
        public enum Class_Type { lcc = 0, dd = 1 }
        private string pLastError = "";

        // Text of the last exception raised while fetching or parsing.
        public string LastError
        {
            get { return pLastError; }
            set { pLastError = value; }
        }

        // Fetch the raw response body; returns "" and sets LastError on failure.
        private string MakeHTTPRequest(string url)
        {
            try
            {
                System.Net.HttpWebRequest wr = (System.Net.HttpWebRequest)System.Net.WebRequest.Create(url);
                wr.UserAgent = "MarcEdit 5.2";

                using (System.Net.WebResponse response = wr.GetResponse())
                using (System.IO.StreamReader reader = new System.IO.StreamReader(response.GetResponseStream()))
                {
                    return reader.ReadToEnd();
                }
            }
            catch (Exception e)
            {
                LastError = e.ToString();
                return "";
            }
        }

        public string DoSearch(string url, oclc_api_classify.Class_Type ctype)
        {
            string sreturn = MakeHTTPRequest(url);
            // Nothing to parse if the request failed; LastError holds the reason.
            if (sreturn.Length == 0) return "";
            //process results
            return ProcessClassify(sreturn, ctype);
        }

        // Return the nsfa attribute of the first mostPopular node for the
        // requested scheme, or "" when none is found or parsing fails.
        private string ProcessClassify(string s, oclc_api_classify.Class_Type ctype)
        {
            try
            {
                System.Xml.XmlDocument objXML = new System.Xml.XmlDocument();
                objXML.LoadXml(s);
                System.Xml.XmlNamespaceManager oMan = new System.Xml.XmlNamespaceManager(objXML.NameTable);
                oMan.AddNamespace("oclc", "http://classify.oclc.org");

                string xpath = (ctype == Class_Type.dd)
                    ? "//oclc:ddc/oclc:mostPopular"
                    : "//oclc:lcc/oclc:mostPopular";

                foreach (System.Xml.XmlNode objN in objXML.SelectNodes(xpath, oMan))
                {
                    if (objN.Attributes != null && objN.Attributes["nsfa"] != null)
                    {
                        return objN.Attributes["nsfa"].InnerText;
                    }
                }
                return "";
            }
            catch (Exception e)
            {
                LastError = e.ToString();
                return "";
            }
        }
    }
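For anyone who would rather poke at this outside of C#, here is a minimal Ruby sketch of the same lookup using REXML.  The sample XML is hand-written to match the XPath used in the class above (ddc/lcc elements holding a mostPopular element with an nsfa attribute); it is an assumption about the response shape, not a captured Classify response, and the helper name most_popular_class is made up for illustration.

```ruby
require 'rexml/document'

# Hand-written sample shaped after the XPath used in the C# class above.
# It is NOT a captured Classify response; the element layout is an assumption.
SAMPLE_XML = <<~XML
  <classify xmlns="http://classify.oclc.org">
    <recommendations>
      <ddc><mostPopular holdings="120" nsfa="973.7"/></ddc>
      <lcc><mostPopular holdings="95" nsfa="E468"/></lcc>
    </recommendations>
  </classify>
XML

# Prefix-to-URI mapping so the XPath can address the default namespace.
CLASSIFY_NS = { 'oclc' => 'http://classify.oclc.org' }.freeze

# Hypothetical helper: returns the nsfa value of the first mostPopular node
# for the requested scheme (:dd or :lcc), or "" if nothing usable is found.
def most_popular_class(xml, scheme)
  doc = REXML::Document.new(xml)
  element = scheme == :dd ? 'ddc' : 'lcc'
  node = REXML::XPath.first(doc, "//oclc:#{element}/oclc:mostPopular", CLASSIFY_NS)
  node && node.attributes['nsfa'] ? node.attributes['nsfa'] : ''
rescue REXML::ParseException
  # Malformed XML is treated the same as "no classification found".
  ''
end

puts most_popular_class(SAMPLE_XML, :dd)
puts most_popular_class(SAMPLE_XML, :lcc)
```

As in the C# version, a miss and a parse failure both come back as an empty string, so a batch run can simply skip the record and move on.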

In MarcEdit:

In MarcEdit, I’ve added an experimental dialog that does the batch classification search.  Here’s an example:

[image: MarcEdit batch classification dialog]

As you can see, I’m envisioning a handful of options:

  1. Ability to process individual records
  2. Ability to process all records in a file
  3. Ability to select the suggested Dewey or Library of Congress classification
  4. Ability to insert data only if call numbers are not already present
  5. Ability to determine what field the call number should be inserted into.
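Options 4 and 5 boil down to a small piece of decision logic.  Here is a rough Ruby sketch of it; this is not MarcEdit’s code.  The record is modelled as a plain Hash of tag to field strings, the helper name insert_call_number is hypothetical, and the list of tags treated as “already has a call number” (050, 082, 090, 092) is my own assumption for the example.

```ruby
# A record is modelled here as a Hash of MARC tag => array of field strings.
# This is a stand-in for illustration, not MarcEdit's internal representation.
# Assumed set of tags that count as "a call number is already present".
CALL_NUMBER_TAGS = %w[050 082 090 092].freeze

# Hypothetical helper: add call_number to target_tag unless the record already
# carries a call number and only_if_missing is set. Returns true if inserted.
def insert_call_number(record, call_number, target_tag:, only_if_missing: true)
  return false if call_number.nil? || call_number.empty?

  has_call_number = CALL_NUMBER_TAGS.any? { |tag| record.key?(tag) && !record[tag].empty? }
  return false if only_if_missing && has_call_number

  (record[target_tag] ||= []) << "$a#{call_number}"
  true
end

records = [
  { '245' => ['$aThe Civil War'] },
  { '245' => ['$aBattle cry of freedom'], '050' => ['$aE470'] }
]

records.each do |rec|
  inserted = insert_call_number(rec, 'E468', target_tag: '090')
  puts "#{rec['245'].first}: #{inserted ? 'inserted' : 'skipped'}"
end
```

Passing only_if_missing: false would overwrite nothing but would add a second call number field, which is why the check is on by default in this sketch.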

I’m thinking this will show up in the next MarcEdit update, so long as OCLC’s terms and conditions end up being something that I (and they) can live with.

–TR

 Posted at 1:03 pm
Aug 16 2010
 

So this is what it has come to…two iconic library software vendors (III and OCLC, that is) wrestling in court.  Not that I think this is all that surprising…too bad…but not particularly surprising.  In fact, I think that many folks in the library community probably saw this coming.  For me, the canary in the mine, so to speak, has been the OCLC record use policy revision process.  Starting with the first attempt in Nov. 2008, which plain sucked, and finishing with the present revision, which pretty much guaranteed a lawsuit, the record use policy has been an interesting illustration of what is both working and broken with OCLC.  Why?  Because the record use policy linked the use of the records to OCLC’s WorldCat service.  The policy, as it stands, doesn’t include just a list of rights and responsibilities, but spells out why it is needed…to protect the WorldCat database.  My personal opinion, as written here, is that any record use policy should have been written separate from WorldCat.  Joining the two together is problematic, and this lawsuit demonstrates how. 

As more and more time passes, I’m convinced that OCLC, as it exists today, is of two minds.  There is the membership mind and the vendor mind.  The problem that libraries and library vendors face is that in many circumstances, the vendor side of OCLC is unduly influencing the membership side of the organization.  I think that the final record use policy is a good example of this, as OCLC placed a number of artificial walls around the WorldCat database – walls that really do nothing but protect OCLC’s web scale initiatives.

This is one of the reasons why I had suggested the need for OCLC to consider breaking itself up (http://people.oregonstate.edu/~reeset/blog/archives/579) back in Nov. 2008.  The problem, as I saw it then, is that as long as OCLC continues to move and compete within the vendor space, this tension between itself as a membership governed coop and as a vendor organization will be present.  It will become increasingly difficult for OCLC to make decisions solely in respect to the membership’s long-term needs, if those decisions must also be measured against the products and services that OCLC’s vendor operations handle.  Regardless of how this lawsuit with III is resolved, I don’t think that it will be the end of the litigation.  Over the past year, I’ve encountered too many vendors that are becoming more open with their feelings that OCLC is unfairly hiding behind its status as a tax-free organization to monopolize the library space.  If OCLC can demonstrate that its web scale services will work for larger ACRL libraries, I think that more and more vendors will continue to push back.  Vendors will continue to make their displeasure known and the membership is going to have to ask itself how many lawsuits it’s willing to endure.

I still believe that the simplest solution to this, however, would be breaking up the OCLC organization.  I think it would ultimately be good for OCLC and libraries.  It would allow OCLC’s vendor units to compete without having to worry about potential lawsuits and it would give OCLC’s research arms the ability to do research without the ever constant need to productize their work.  And there are actually a number of precedents for this type of thinking.  Universities, for one, do this all the time.  Publicly funded research or development is commercialized under umbrella organizations.  Why couldn’t OCLC do the same?  

In some ways, I see this lawsuit mirroring the current discussions related to net neutrality.  Should all information be treated as equal on the web?  In a sense, that’s what III is asking OCLC to provide, a kind of net neutrality for libraries.  The idea is that WorldCat represents a core information pipe within the library community, and should be made open to the entire library community (both vendor and non-vendor) for a reasonable fee.  Libraries currently pay a fee for access (our memberships) – likewise, vendors could pay a reasonable fee to access and develop against the WorldCat bibliographic and holdings database. 

To take this further, when considering net neutrality, I believe that the library community would be nearly unanimous in supporting this as a necessary requirement for the future.  It’s a problem that we feel we have a stake in since librarians ultimately trade in information.  Likewise, I think that libraries will need to decide what type of organization they want OCLC to be.  As a membership organization, libraries still have some power to shape the long-term vision of the OCLC cooperative.  So I wonder when librarians will start to push OCLC to embrace the same tenets of open and fair access to data that we currently demand from other vendors/information providers.  OCLC is ours, and how we handle this resource will ultimately determine how others view the library community as a whole.  In the end, will we push OCLC to reflect the community’s larger vision and our professional ethics in respect to open data and cooperation, or will the larger information community simply look at the library community as a bunch of hypocrites – hypocrites that demand open data from other communities but aren’t willing to reciprocate when the data is our own? 

–TR

 Posted at 7:00 pm
Apr 08 2010
 

So yesterday, a number of messages were sent out letting folks know that the OCLC Record Use Policy Council had completed its work and provided a draft document to the OCLC member community for comment.  Sadly, I’m in the middle of doing other work (travelling mostly) – but I have had an opportunity to look over this draft and have passed on a handful of comments to OCLC and our OCLC representatives.  I thought I’d also provide a general gist of some of my comments to folks here since I’ve been asked privately for my impressions. 

So let’s start with the good.  I’ll admit that I was very much not a fan of the first draft released over the fall.  For my tastes, the first draft took something that was very much an informal policy statement that had served OCLC and its membership for many years, and turned it into a legal license – and did so in a way that I personally felt poisoned long-term innovation for libraries looking to work independent of OCLC.  Knowing a number of folks involved in writing the first draft of this policy, I know that wasn’t the intent but the first draft had problems…problems with the language and problems with the way in which it was created.  And to OCLC’s credit, they may not have ultimately agreed with the membership, but they did ultimately decide to withdraw that draft and make another attempt at updating the current guidelines.  And I think that they do deserve some kudos for admitting that they may not have gotten it right…*AND* providing a mechanism for comments to help guide the future effort.  I can respect that.

Looking over the new draft of the record use policy, there are a number of things that I like.  First, the document is written to a very narrow audience, OCLC members.  While the implications of the policy reach beyond the membership community, the Record Use Policy Council did do a good job defining the audience and speaking to that audience.  Secondly, the document is much easier on the eyes.  Not perfect (more below), but easier to read and, more importantly, presented as a set of best practices and guidelines.  While I know that the spirit of the document is to provide a policy that will govern record use – my take is that the document is fuzzy enough that appropriate use is left, in large measure, up to the individual organizations.  Does this allow libraries to give records to 3rd party groups like Open Library or simply post their metadata, en masse, to an ftp server for download?  If you believe that it does, I think so.  FAQs 3 and 6 try to address these questions and, in answering, essentially boil the answer down to judgment.  I’m fairly certain that given the above example, OCLC would prefer Open Library to enter into an agreement with OCLC, and I’m also fairly certain that just posting a continually updated version of your catalog’s bibliographic data for anonymous download via FTP may not be completely in the spirit of this revised Policy – but I think that it’s fuzzy enough that so long as an organization feels that they are within their rights to do so, they could certainly argue that they are.  And of course, I bring both of these examples up because, indeed, there are grassroots efforts at many organizations doing both.

What else do I like?  I like that OCLC is looking for comment.  They’ve provided a public forum for comment, an email address to send comments directly to the Records Use Council, and encouraged members to contact their representatives on the user council.  There seems to be a real effort here to collect feedback and take the pulse of the community.  Now, one thing that seems less clear is how this comment period will impact this draft.  The first attempt at a policy in fall ‘08 was so bad and caused so much consternation within the member community that withdrawing the draft really was the only option.  However, what’s unclear to me is how comments and feedback from community members will be utilized in updating this draft and, really, what impact they will have.  My guess is that most will read this document and figure that it’s good enough (not perfect) and, given its much softer language, easier to agree to.  So given that, how effective will the comment period be, and what will be the process used to ensure that this document can be fixed should situations arise down the road that call for it?

Finally, I like that OCLC has defined how they view WorldCat because it lets me see where they are coming from.  In a previous post, I’d described OCLC’s record stewardship and WorldCat as a part of the public commons.  And that’s my sincere belief.  Libraries (both members and non-members) have contributed bibliographic data to the cooperative and I think that how libraries utilize data extracted from the cooperative should be restriction free.  OCLC is free to license access to the collective database as a whole, but once individual libraries have extracted records, those records cease to belong to the cooperative.  However, this policy clarifies how OCLC views the cooperative and the records within it by spelling out their understanding of public goods and club goods, asserting that OCLC’s role, WorldCat and the bibliographic data entrusted to it, are the latter.  I may not agree with them, but it certainly helps when framing the discussion.

So that’s what I see as the good, and if I’ve understated it, I do think that this draft has moved in a much more positive direction.  So, how do I think that it could be improved?  Well, I’m going to give you two things because I think that they are both important.

  1. While the document has become much easier to read, it’s still way too long.  We are talking about sharing bibliographic and holdings data, not brain surgery.  Having recently spent time working with OCLC’s Office of Research, I know that OCLC can write very concise, easy to understand, terms of use documents.  That’s what I think we needed here.  The OCLC Record Use Policy Council was asked to provide a record use policy – what they gave us was an opus on member responsibility and rights.  I can tell that a lot of thought and work went into the document, but in the end, I really wish that some of the focus and brevity that we find from the Office of Research had made its way into this document.
  2. In some ways, I’m still very uncomfortable with this policy statement, and I think it comes down to the tone and focus of the policy itself.  And I think my uneasiness can be summed up in the first objective of the draft: “Maximize the utility and viability of WorldCat, enable and facilitate innovation based on WorldCat, and support the substantial network of relationships between OCLC members and their partners in today’s information landscape, while sustaining the integrity of the WorldCat database and its economic underpinnings.”

    I may be the only one, but I think that this Records Use Policy should be separated from WorldCat.  I think a more appropriate objective (and class of objectives) would have been:  to promote and facilitate research, discovery, innovation and growth of the OCLC membership community.  Really, WorldCat is a brand; it’s the current product OCLC uses to curate the bibliographic and holdings data entrusted to it by the member community.  While I agree that OCLC needs to be mindful of the economic viability of the WorldCat database, I don’t believe the policy document created to govern data use and reuse should join the two together.  As a colleague said to me recently, the policy should focus on the core issues of bibliographic and holdings data use and reuse – not the viability of a specific product.  I think shifting the policy from a WorldCat focus to a community focused model would completely change the tone of the document and go a long way to solidify, in my mind, that OCLC is indeed working on a policy document that places the needs and welfare of the community above a specific product or product line.  I realize that it may not be fair, but that is the position that OCLC has placed itself in, living in both worlds: as trusted library cooperative and friend to the library community, and as a pseudo-vendor competing directly against other library vendors that serve both OCLC members and non-members alike. 

Anyway, those are my initial thoughts.  I’ll be giving this document a closer read and will continue to provide my thoughts and comments to OCLC and our representatives – and I encourage everyone to do so as well. 

Thanks,

–TR

 Posted at 6:19 pm
Mar 13 2009
 

I had mentioned that I’d quickly developed a helper gem for simplifying using the WorldCat API in ruby (at least, it greatly simplifies using the API for my needs).  This was created in part for C4L and partly because I’m moving access from Z39.50 to the API when working with LF, and I basically wanted to be able to do the search and interact with the results as a set of objects (rather than XML). 

Anyway, the first version of the gem (which is slightly different from the code first posted prior to C4L) can be found at the project page, here: http://rubyforge.org/projects/wcapi/

–TR

Feb 22 2009
 

Since the WorldCat Grid API became available, I’ve spent odd periods of time playing with the services to see what kinds of things I could do and not do with the information being made available.  This was partly due to a desire to create an open source version of WorldCat Local.  It wouldn’t have been functionally complete (lacking facets, some data, etc), but it would be good enough, most likely, for folks wanting to switch to a catalog interface that utilized WorldCat as their source database, rather than their local ILS dbs.  And while the project was pretty much completed, the present service terms that are attached to the API make this type of work out of bounds at present (though, I knew that when I started – it was more experimental with the hope that by the time I was finished, the API would be more open). 

Anyway…as part of the process, I started working on some rudimentary components for ruby to make processing the OCLC provided data easier.  Essentially, I created a handful of convenience functions that make it much easier to call and utilize the data that the grid services provide – my initial component was called wcapi. 

Tomorrow (or this morning if you are in Providence), a few folks from OCLC will be putting on a preconference related to the OCLC API at Code4Lib.  There, folks from OCLC will be giving some background and insight into the development of the API, as well as looking at how the API works and some sample code snippets.  I’d hoped to be there (I’m signed up), but an airline snafu has me stranded in transit until tomorrow – so by the time I get to Providence, all, or most, of the preconference will be over.  Too bad, I know.  Fortunately, there will be someone else from Oregon State in the audience, so I should be able to get some notes.

However, there’s no reason why I can’t participate, even while I’m jetting towards Providence – and so I submit my wcapi gem/code to the preconference folks to use however they see fit.  It doesn’t have a rakefile or docs, but once you install the gem, there is a test.rb file in the wcapi folder that demonstrates how to use all the functions.  The wrapper functions accept the same values as the API – and like the API, you need to have your OCLC developer’s key to make it work.  The wcapi gem will utilize libxml for most xml processing (if it’s on your machine, otherwise it will downgrade to rexml), save for a handful of functions that don’t use named namespaces within the resultset (this confuses libxml for some reason – though there is a way around it, I just didn’t have the time to add it). 
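The libxml-or-rexml fallback mentioned above is a common Ruby pattern: try to require the faster library and rescue the LoadError.  A minimal sketch of the idea follows; the constant XML_BACKEND and the helper root_name are names I made up for this example and are not necessarily how wcapi does it internally.

```ruby
# Prefer libxml when the gem is installed; otherwise fall back to REXML.
# The constant and helper names are illustrative, not wcapi's own identifiers.
begin
  require 'libxml'
  XML_BACKEND = :libxml
rescue LoadError
  require 'rexml/document'
  XML_BACKEND = :rexml
end

# Hypothetical helper: return the root element name using whichever backend loaded.
def root_name(xml)
  if XML_BACKEND == :libxml
    LibXML::XML::Document.string(xml).root.name
  else
    REXML::Document.new(xml).root.name
  end
end

puts "backend: #{XML_BACKEND}"
puts root_name('<feed><entry/></feed>')
```

Keeping the backend choice in one place means the rest of the code only has to branch where the two libraries actually differ.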

So, the gem/code can be found at: wcapi.tar.gz.  Since I’ll likely make use of this in the future, I’ll probably move it to SourceForge before long.

*******Example OpenSearch Call*******

require 'rubygems'
require 'wcapi'

# Replace the placeholder with your own OCLC developer key.
client = WCAPI::Client.new :wskey => '[insert_your_own_key_here]'

response = client.OpenSearch(:q=>'civil war', :format=>'atom', :start => '1', :count => '25', :cformat => 'mla')

puts "Total Results: " + response.header["totalResults"]
# Each record is exposed as a hash keyed by symbol.
response.records.each {|rec|
  puts "Title: " + rec[:title] + "\n"
  puts "URL: " + rec[:link] + "\n"
  puts "OCLC #: " + rec[:id] + "\n"
  puts "Description: " + rec[:summary] + "\n"
  puts "citation: " + rec[:citation] + "\n"
}

********End Sample Call****************

And that’s it.  Pretty straightforward.  So, I hope that everyone has a great pre-conference.  I wish that I was there (and with some luck, I may catch the 2nd half) – but if not, I’ll see everyone in Providence Monday evening.

–TR

Jan 13 2009
 

From the OCLC news release: http://www.oclc.org/us/en/news/releases/20092.htm

First off, I think that this is a good thing for a couple of reasons.

  1. I take this development as a sign that OCLC has heard the membership, heard some of their concerns and has agreed to take some additional time to have further discussion with the membership.  For that, thank you.
  2. It gives the OCLC membership time to look at the presently proposed policy and consider if this policy reflects the direction that the membership wishes to take the library profession.

So, where do we go from here?  Looking over the news release, it appears that the proposed policy will be the base-line for discussion, the starting point.  So it will be up to the member council and identified library experts to look at this policy and consider how it can best be modified to meet the stated goals of modernizing the record use and transfer policy.  While I very much doubt that I will be one of the folks asked to the table (though, personally, I would like to be), I still have hopes that this process could yield a much more open and friendly record transfer policy.  I’ve discussed this with some folks from the members council, but I honestly believe that OCLC and WorldCat would see the greatest benefit if OCLC simply let WorldCat go, releasing the data through an Open Data license.  I realize that for some, there is a fear that this might devalue OCLC in some way, but I personally doubt it.  As a membership organization, OCLC is able to leverage its large repository of holdings information, which will continue to allow them to build new and interesting services for libraries whether WorldCat is available for download or not.  That is, for OCLC members, WorldCat likely isn’t the primary reason why they are members, so I have a difficult time seeing how loosening the reins on WorldCat could be problematic.

Anyway, the announcement is good news in that it will buy the OCLC membership (and library community) time to more fully consider what the current proposed transfer policy would mean to libraries and hopefully help shape a document that is much more in line with the library community’s core values related to the access and sharing of information.  And hopefully, as this policy is considered, the membership council does keep the library community (not just membership) in mind when considering this decision so that any provisions placed on the transfer of records from WorldCat don’t poison the transfer of records downstream for other members (both commercial and non-commercial) of the library community.

–TR

 Posted at 4:43 pm
Nov 05 2008
 

OCLC has posted a revised version of its new policy for use and transfer of WorldCat Records.  This is a revision of the short-lived version of this document posted Nov. 2nd and taken down early Nov. 3rd.  Additionally, OCLC has significantly revised the FAQ for this document (which is important, since this is what most people will use to interpret this agreement). 


Background:

When OCLC released their initial revision of this policy (Nov. 2), a number of concerns were raised by myself and the library community.  However, of all the concerns, there were two really, really big concerns that I consider to be deal breakers with the policy as proposed on Nov. 2.  These were:

  1. The embedding of this policy into the OCLC record in the form of a 996 field essentially placing a license within the library’s bibliographic records.  While I know that OCLC didn’t feel that this was a big deal — it was.  It’s a tremendously big deal because it takes ownership of metadata away from the library, and gives it to OCLC.  And it’s dangerous — as a link to a policy, terms and rights can be changed (as we have seen between the Nov. 2 and the Nov. 5th revisions).  This is one of the reasons why licenses have versions…so rights granted to users at one period of time won’t change if a license undergoes revision.
  2. The non-compete clause that essentially barred use of the WorldCat data for the creation of services that competed against OCLC.  And while this was meant primarily to discourage commercial development, it could very easily have applied to non-commercial research projects if such a project began to encroach upon the size or function of the WorldCat database.

So those were my big concerns.  With that in mind, I want to look at the revised policy specifically to see how these two issues have been addressed.

The 996 field:

While OCLC’s policy and FAQ’s have been significantly revised, there are few places where this is as evident as in the areas of attribution.  In the Nov. 2nd version of the policy, OCLC’s policy stated that transferred records “must be clearly identified” and that the OCLC record number, if available, “may not be removed from any WorldCat Record”.   The Nov. 5th revision is significantly different in this area — changing “must” and “will” to “OCLC encourages”.  In essence, this part of the agreement has been made optional.  The revised FAQ’s further spell this out by saying: “It is optional for OCLC members and nonmember libraries, archives and museums to implement or retain the 996 field for attributing WorldCat as the source of WorldCat-derived records.” 

For this change, I’m definitely pleased.  Though, now that this has been fixed, I’m going to be honest, I want to see OCLC push this even further.  In the FAQ, OCLC makes a case for attribution.  That’s fair.  However, there is still a fundamental problem with the approach being used:  this license is unversioned.  This policy will change again at some point and it’s likely that some previously legal uses of the metadata records may fall outside of the boundaries of the new conditions.  By having a versioned license, previous uses of the metadata would be grandfathered.  So, if OCLC continues to want to move down this path of attribution and linking to its policy, then I would like to see the 996 field revised to something like the following:

Field: 996  WorldCat Record Use Policy Link
1st indicator: undefined
2nd indicator: undefined

Subfields:
$a: Function identifier
$i: Display Text
$v: Version Number
$u: URI

By versioning the license, it would at least close this future ambiguity.  Were the versioning component added to this field, I’d feel a little bit better about it.  Good enough to let this licensing info into my library’s catalog — probably not, but at least I’d feel better about it.
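To make the proposal a little more concrete, here is a minimal sketch in Python of what a versioned 996 field might look like when written out. Everything about it is an assumption on my part: the `PolicyLinkField` class is hypothetical, the `$v` subfield is only the addition I am suggesting (it is not part of OCLC's field definition), and the version value `1.0` is made up for illustration. The identifier, display text and URI are the ones OCLC's FAQ gives for the field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyLinkField:
    """Hypothetical model of the proposed 996 field (not an OCLC definition)."""

    function_id: str   # $a: function identifier
    display_text: str  # $i: display text
    version: str       # $v: version number (the proposed addition)
    uri: str           # $u: URI of the policy

    def to_marc_text(self) -> str:
        # Both indicators are undefined, so only the tag and subfields are written.
        subfields = [
            ("a", self.function_id),
            ("i", self.display_text),
            ("v", self.version),
            ("u", self.uri),
        ]
        return "996 " + " ".join(f"${code}{value}" for code, value in subfields)


field = PolicyLinkField(
    function_id="OCLCWCRUP",
    display_text=(
        "Use and transfer of this record is governed by the OCLC "
        "Policy for Use and Transfer of WorldCat Records."
    ),
    version="1.0",  # assumed value; the policy carries no version number today
    uri="http://purl.org/oclc/wcrup",
)
print(field.to_marc_text())
```

With the version recorded in the record itself, a library could at least tell which revision of the policy a given record was downloaded under.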

The non-compete clause:

Well, this is still pretty much in the policy, though the policy has been modified to reflect that this is primarily geared towards commercial (or third party) use.  Of course, there still remains the clause that defines “Reasonable Use” as excluding any use (commercial or non-commercial) that:

1) discourages the contribution of bibliographic and holdings data to WorldCat, thus damaging OCLC Members’ investment in WorldCat, and/or

2) substantially replicates the function, purpose, and/or size of WorldCat. Please see the FAQ for a discussion of Z39.50 for cataloging using WorldCat-derived bibliographic records.

While OCLC may argue that this would allow libraries to participate in projects like the Open Library, I have a feeling that that’s not the case, for two reasons.  First, Open Library does substantially replicate function, purpose and size.  Second, Open Library couldn’t accept metadata records that include the OCLC policy of transfer (the licensing terms are currently incompatible with those that Open Library uses, last I checked).  But more importantly, I believe that this non-compete clause likely blocks projects like the one being undertaken by some ILS vendors to create a type of cooperative catalog for users of their ILS systems (here, we are not talking about Z39.50, but the actual creation of a network-based service that would provide Napster-like functionality, allowing users to select records from their ILS network).  It would likely also stop the creation of a like-minded service by a group of open source developers who may wish to build a cataloging service from records from other libraries (say, from the records on Open Library).  Of course, if that’s not the case, it would be nice if OCLC would provide examples of this type of use within its FAQ documentation.

Of course, the problem that I have with the OCLC non-compete clause is in many ways philosophical.  Libraries have always been about unfettered access, including access to our metadata, so I have a very difficult time believing that we need to somehow quickly move to protect our metadata from unscrupulous commercial uses.  I’ve said it a number of times: if the library community (and OCLC) want to view WorldCat as the Library Commons, then it should be open, to anyone, for any reason…period.  These types of non-compete clauses are difficult for me because they are contrary to many of our core library values.  But I won’t rehash this since I did write an earlier thought piece on the subject (http://oregonstate.edu/~reeset/blog/archives/579) a few days ago.

Anyway, as far as the policy goes, I think that OCLC has done a good job working to answer some of the initial criticisms that I personally had with the new policy.  Would I like to see them make additional changes…yes.  If OCLC is determined to add a license to the Commons (I guess I’ll have to stop calling it the Commons then), then they need to version the license to reduce potential future ambiguity.  Do I like that OCLC wants to make themselves a not so silent partner in my library’s 3rd party dealings…no.  Would I recommend that my library (or any other library) remove the 996 licensing when downloading data from OCLC…at this point, you bet.  OCLC can certainly make this available in the WorldCat database, but I still don’t want it in mine…but I guess there’s those darn philosophical differences again.

 

–TR

 Posted by at 1:34 pm
Nov 03 2008
 

The other day, I posted what I saw as some very big concerns with OCLC’s revised policy (currently being reconsidered) on the transfer of records (two of which I would consider deal breakers).  In that post, I made the argument that maybe it was time to consider breaking OCLC up to reflect what it has become: an organization with two distinct facets, a membership component and a vendor component.  This comment led to a conversation with someone at OCLC who questioned whether I honestly believed that the library community would be better off if OCLC were broken up, and it was obvious from our conversation that on this point, we would simply need to agree to disagree.  As a side note, I think that these types of disagreements and conversations are actually really important to have.  I’m always nervous about communities or groups in which everyone agrees, since it usually means that people either are not thinking critically or no one really cares.  Secondly, I think that we all (OCLC and myself for that matter) want what’s best for the library community; we just have different visions of what that might be.

Anyway, back to my topic.  Now, I’m going to preface this discussion by saying that this is obviously my own opinion and one that may not be shared by many people within the library community (I really have no idea).  Even within the library open source community, where I’m sure this opinion would be more prevalent (or at least entertained), I’m pretty sure I’m still in the minority.  But as I say, I think that these conversations are important to consider — specifically as we move down a path where OCLC is very quickly positioning themselves to become the library community’s default service provider for all things library (in terms of ILL, ILS interface, cataloging, etc.).

So when I talk about breaking up OCLC, exactly what am I talking about?  Well, in order to follow me down the path that I am going to take you, we have to talk about OCLC as I currently see them.  Watching OCLC during the 10 years (I can’t believe it’s actually been 10 years) that I have been in libraries, I have seen a quickening evolution of OCLC from strictly a member-driven organization to more of a hybrid organization.  On the one hand, there is what many would consider the membership side of OCLC, that being WorldCat, ILL and their research and development office.  On the other hand, there is OCLC’s vendor arm…a good example of this would be WorldCat Local and WorldCat Navigator.  So how do I make these distinctions?  Membership services are those that I would consider core services.  These are services that OCLC has developed to add value to what OCLC likes to refer to as the Library Commons (WorldCat).  OCLC’s vendor services are those tools or programs that OCLC sells on top of the Library Commons, of which, I think, WorldCat Local/Navigator is a good example.  Now, at this point, I know that folks at OCLC (and likely in the membership) would argue that both WorldCat Local and Navigator do provide services that the OCLC membership is currently requesting.  I won’t deny that.  However, I would answer that the fact that OCLC treats the Library Commons (WorldCat) as its own closed personal community has the unintended effect of limiting the library community’s (and I include both commercial and non-commercial entities in my definition of community) ability to develop new service models.  In effect, we become much more dependent on how OCLC envisions the future of libraries.  Let me try and tease this out a little bit more…

Philosophically, the biggest problem that I have with the current situation is the commingling of OCLC’s treatment of the Commons (WorldCat) and their current strategy of being the sole commercial entity with the ability to interact with the Commons.  I’m a firm believer that the more diverse the landscape or ecology, the more likely that innovation will take place.  We’ve seen this time and time again both inside (Evergreen and Koha certainly have shaken up the traditional ILS market) and outside (web browsers are a good example of how competition breeds innovation) the library community.  However, by isolating the Commons, OCLC is threatening this diversity of thought.  Now, I have a whole set of different issues with the current library ILS community, but in this case, I think that OCLC’s treatment of the Commons, and their ability to leverage that service, unfairly skews the ability for both commercial and non-commercial entities to provide innovative services on top of that Commons (and before anyone jumps on me for non-commercial use, let me finish my thoughts here).  Commercially, I’m fairly certain that the current crop of ILS vendors would very much like to provide their own WorldCat Local/Navigator interfaces to their customers, and I’m sure, would be able to tie these interfaces closely with services already provided by the user’s ILS.  I could envision things like ERM (electronic resource management), simplified requesting, etc. all being possible if the likes of ExLibris or Innovative Interfaces were allowed to build tools upon the Library Commons (WorldCat).  Maybe I would like to develop my own version of WorldCat Local/Navigator that interacts with the Commons and sell it as a product (kind of the same way ezproxy was sold prior to being acquired by OCLC), or a group of researchers would like to do the same.
As a commercial entity, I’m fairly certain that this type of development model wouldn’t be kosher with OCLC unless I licensed access to WorldCat (and I’m not certain that they would agree to that, given that this would compete against one of their services).  Likewise, open source folks like LibLime or Equinox may like to create an open source version of the WorldCat Local interface.  Under the current guidelines, I understand that an open source implementation of WorldCat Local can exist, but as I understand that agreement, I’m not certain that groups like LibLime or Equinox (or another entity) could take that project and then sell support-based services around it (I’m unclear on that one though).  However, it’s very unlikely that the library world will see any of these types of developments (well, maybe the open source WorldCat Local, since I have a group that could use this and a number of people interested in developing it) because OCLC has come to treat what it calls the Commons (WorldCat) as its own personal data store.  There’s that commingling again.

So if it was up to me, how would I resolve this situation?  Well, I see two possible scenarios. 

  1. Open up WorldCat.  OCLC likes to refer to WorldCat as the Library Commons, so let’s treat it as such.  Remove the barriers for access and allow anyone and everyone the ability to essentially have their own copy of the Library Commons and its data.  Now, rather than specifying terms of transfer and telling libraries under what conditions they can and cannot make their metadata available to other groups, the membership could consider what type of Open Data license the Commons could be made available under.  Something like the Creative Commons share-alike license, which allows for both commercial and non-commercial usage but requires all parties to contribute all changes to the data back to the community (in essence, this is kind of what Open Library is doing with their metadata), may be appropriate.  OCLC would be free to develop their own products, but the rest of the library community (both library and vendor community) would have equal opportunity to develop new services and ways of visualizing the data found in the Commons.  Does this devalue the Commons (WorldCat)?  I don’t think so; look at Wikipedia.  It uses this model of distribution, yet I’ve never heard anyone say that this devalues its content.  Would there be challenges?  For sure.  Probably one of the biggest would be the way that it would change what it means to be a member of OCLC.  If each person could download their own personal copy of the Commons, would libraries stay members?  I’m certain that they would, but I’m sure that what it means to be a member would certainly change.
  2. Split OCLC’s membership services from OCLC’s vendor services.  Under this example, WorldCat Local/Navigator development would be spun away from OCLC as a separate business (this happens in academia all the time).  Were this to happen, OCLC would be able to develop license terms that could then be leveraged by all members of the commercial library community, removing the artificial advantage OCLC is currently able to leverage (both in terms of data and deciding who is allowed to work with the Commons).  In all likelihood, I think that this model represents the smallest change for the membership and would continue to allow OCLC to make the Commons more available to non-commercial development without artificially limiting other groups interested in building new services.

One last thought.  In talking to people today, I heard a number of times that OCLC restricting access to the Commons was in fact a good thing, in part because it finally allowed the library community to leverage resources not available to the vendor communities.  In some way, we could finally stick it to them.  That’s fine, I’m all for developing tools and services, but this particular type of thinking I find worrisome.  If we, as a community, feel that we are unable to develop compelling tools and services that are able to compete with other vendor offerings without an artificial advantage, well, that’s just sad and says a little something about how we see ourselves as a community.  And this too is something that I’d like to see change, because if you look around, you will see that there are a myriad of projects (Koha, Evergreen, VuFind, Fedora, DSpace, LibraryFind, XC Catalog, Zotero, etc.) where developers (some library developers, some not) are re-envisioning how they see many of the services within the library and putting their time and effort into realizing those visions.

 

–TR

 Posted by at 10:49 pm
Nov 02 2008
 

OCLC today made available their proposed new policy for the transfer of bibliographic records.  You can read the documents for yourself:

 

I took some time this morning to read through the proposed changes as well as the FAQs, and essentially OCLC is looking for a way to tell libraries that they don’t own the data that’s in their own catalogs.  In essence, this is what this policy comes down to.  The policy wraps some very nice changes for non-members into the statement in order to hide some really sucky changes that I don’t believe that they have the ability to ask for or enforce.  And OCLC has some real balls here, because starting in Feb., records downloaded from OCLC will potentially include a license statement.  Per the FAQ:

  • Prospectively. As of the effective date of the Policy, every record downloaded from WorldCat will automatically contain field 996 populated with the following:
    MARC:

    996 $aOCLCWCRUP $iUse and transfer of this record is governed by
    the OCLC® Policy for Use and Transfer of WorldCat® Records.
    $uhttp://purl.org/oclc/wcrup

    There is no need to add the 996 field to records by hand. OCLC systems will do this for you.

  • Retrospectively. For records that already exist in your local system, we encourage you to use the 996 field, which should have an explicit note like the examples below:
    MARC:
    996 $aOCLCWCRUP $iUse and transfer of this record is governed by
    the OCLC® Policy for Use and Transfer of WorldCat® Records.
    $uhttp://purl.org/oclc/wcrup

    You have got to be kidding me.  By adding this statement to the records in your catalog, you are essentially giving away your institution’s ownership rights to your records (at least, this is how I read it).  This has wide implications, as how you use the data within your catalog would no longer be at your discretion.  And how does this affect the Library of Congress?  This license statement applies to all records with an OCLC number.  If LC puts an OCLC number in their records, well, by this agreement, they too will have to ensure that records made available from their catalog will only be used for non-commercial purposes.  And I wonder what happens to member libraries that remove this field after February 2009.  I ask because I’d recommend any library that downloads records from OCLC to remove this statement from their records.
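For anyone wondering what removing the statement would involve, here is a rough sketch in Python. It is only an illustration under some loud assumptions: records are modeled as simple lists of (tag, subfield text) pairs rather than real MARC records, and the function names are hypothetical. In practice you would do this with whatever MARC tooling your library already uses. Since 9xx fields are local fields, the sketch only drops a 996 whose `$a` carries OCLC's `OCLCWCRUP` identifier and leaves any other local 996 alone.

```python
from typing import List, Tuple

# Simplified stand-in for a MARC field: (tag, subfield text).
# This is an assumption for illustration, not a real MARC data model.
Field = Tuple[str, str]

POLICY_TAG = "996"
POLICY_MARKER = "$aOCLCWCRUP"  # identifier OCLC's FAQ shows in subfield $a


def is_policy_field(field: Field) -> bool:
    """Return True only for a 996 field carrying the OCLC policy identifier."""
    tag, value = field
    return tag == POLICY_TAG and value.lstrip().startswith(POLICY_MARKER)


def strip_policy_fields(record: List[Field]) -> List[Field]:
    """Return a copy of the record without the OCLC policy statement."""
    return [field for field in record if not is_policy_field(field)]


record = [
    ("245", "$aAn example title"),
    ("996", "$aOCLCWCRUP $iUse and transfer of this record is governed by "
            "the OCLC Policy for Use and Transfer of WorldCat Records. "
            "$uhttp://purl.org/oclc/wcrup"),
    ("996", "$aLOCAL $iA locally defined note"),  # unrelated local use of 996
]

cleaned = strip_policy_fields(record)
print([tag for tag, _ in cleaned])
```

Matching on the identifier rather than the tag alone is a deliberate choice, so that a library's own use of the 996 tag survives the cleanup.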

    So what’s at stake long term here — the ability to develop network level discovery services (or in OCLC’s case, make sure that they are the only game in town).  Libraries need to remember that OCLC is both a company that created and supports a cooperative catalog as well as a burgeoning vendor trying to corner the library network discovery landscape.  And what OCLC has going for them is the WorldCat database that has been created by publicly funded institutions (like the Library of Congress, universities and public libraries).  In the past two years, OCLC has seen LibraryThing and the OpenLibrary begin to develop competing network-based services by having libraries contribute their own data back to these efforts.  By claiming exclusive rights to this community created metadata, OCLC is essentially monopolizing the network discovery landscape.

    In many ways, I see similarities between what OCLC is doing here and AT&T and Microsoft of the 90s.  AT&T was sued as a monopolist by the government because they essentially owned all the telephone lines and were able to squeeze out competition because of that ownership (even though much of the infrastructure was created with public money on public land).  Likewise, Microsoft was sued as a monopolist because they were illegally stifling competition by utilizing APIs within the Windows operating system to create a competitive advantage for their software.  MS Office worked better on Windows than competing products because Microsoft wrote the OS, so they knew how to get the greatest level of performance.  This is what’s happening here, but at the metadata level.  OCLC has a large database of information that it has been entrusted with by member organizations, and essentially OCLC is building a business unit around it.

    At this point, I see a lot of similarities between what Microsoft was doing in the ’90s and what OCLC is attempting to do now.  OCLC’s policy here is all about giving OCLC an unfair competitive advantage when it comes to creating the next generation cloud services, and anything that they say differently is a bunch of BS.  OCLC wants to claim that it owns the metadata created, in large part, with public funds by member institutions around the world.  And at this point, OCLC is very much becoming a predatory monopolistic organization, and I think as members, it might be time to tell OCLC that they need to be broken up.  This was done during the AT&T trial and was considered during the Microsoft monopoly trial.  In order to level the playing field here, membership should consider insisting that OCLC decouple its work with WorldCat from the network-based services that it’s looking to build.  When AT&T was broken up, a number of new companies were born.  This is what needs to happen here.  And it’s something that needs to happen soon, as the library community’s ability to develop the next generation of network-based services depends on it, in my own opinion.

    –TR

    Updated: OCLC did post a revision to this document on Nov. 5th. There were some significant changes to the document. I’ve added my comments regarding this revision here: http://oregonstate.edu/~reeset/blog/archives/582

     Posted by at 9:33 am