OCLC WorldCat Metadata API Ruby Gem

By reeset / On / In OCLC, ruby

Since last December, I’ve had the opportunity to spend a good deal of time working with the OCLC WorldCat Metadata API.  The focus was primarily around kicking the tires, and then eventually developing some integration components with MarcEdit, as well as a C# library (https://github.com/reeset/oclc_api) for those that may have use of such things.

However, most folks in libraries tend to not use C# – they tend to focus on lighter-weight languages like ruby, python, or PHP.  Well, I work in ruby as well, so I decided to take the C# library and port it to Ruby.  The resulting code can be found here: https://github.com/reeset/wc_metadata_api

It’s pretty straightforward – and provides wrappers for all the current functionality found in the API, with the caveat being that the defined transfer format between OCLC and the client is in MARCXML, and response formats are in application/atom+xml.  The API supports a handful of other formats, but I’ve standardized on MARCXML since it’s well understood and there are lots of tools that can work with it.

The code isn’t well commented at this point.  I’ll take some time over the next few days to add some commenting, improve exception handling, and possibly add a few helper functions to make message processing easier – but this should help anyone interested in knowing how the API works get a better understanding.


Code Example:

** Get Local Bib Record

require 'rubygems'
require 'wc_metadata_api'

key = '[your key]'
secret = '[your secret]'
principalid = '[your principal_id]'
principaldns = '[your principal_idns]'
schema = 'LibraryOfCongress'
holdingLibraryCode = '[your holding code]'
instSymbol = '[your oclc symbol]'

client = WC_METADATA_API::Client.new(:wskey => key, :secret => secret, :principalID => principalid, :principalDNS => principaldns, :debug => false)

response = client.WorldCatReadLocalBibRecord(:oclcNumber => '338544583', :schema => schema, :holdingLibraryCode => holdingLibraryCode, :instSymbol => instSymbol)

puts response
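
What comes back from a call like the one above is an Atom document with the MARCXML record inside it, so the next step is usually pulling a field out of that payload. The following is only a rough sketch: the sample document is hand-written (an assumption about the general shape, not captured API output), the marc_subfield helper is hypothetical, and it uses REXML rather than anything provided by the gem.

```ruby
require 'rexml/document'

# Hypothetical Atom entry wrapping a MARCXML record; a real response from the
# API will carry more elements than this hand-written sample.
SAMPLE_RESPONSE = <<~XML
  <entry xmlns="http://www.w3.org/2005/Atom">
    <content type="application/xml">
      <record xmlns="http://www.loc.gov/MARC21/slim">
        <controlfield tag="001">338544583</controlfield>
        <datafield tag="245" ind1="0" ind2="0">
          <subfield code="a">Sample title</subfield>
        </datafield>
      </record>
    </content>
  </entry>
XML

# Prefix-to-URI map used by the XPath queries below.
NAMESPACES = {
  'atom' => 'http://www.w3.org/2005/Atom',
  'marc' => 'http://www.loc.gov/MARC21/slim'
}

# Returns the text of the first matching subfield, or nil if it is absent.
# (Hypothetical helper, not part of wc_metadata_api.)
def marc_subfield(xml, tag, code)
  doc = REXML::Document.new(xml)
  path = "//marc:record/marc:datafield[@tag='#{tag}']/marc:subfield[@code='#{code}']"
  node = REXML::XPath.first(doc, path, NAMESPACES)
  node && node.text
end

puts marc_subfield(SAMPLE_RESPONSE, '245', 'a')
```

With a live response, the string returned by the client would take the place of SAMPLE_RESPONSE.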

WorldCat API gem moved to rubyforge

By reeset / On / In OCLC, rails, ruby

I had mentioned that I’d quickly developed a helper gem to simplify using the WorldCat API in ruby (at least, it greatly simplifies using the API for my needs).  This was created in part for C4L and partly because I’m moving access from Z39.50 to the API when working with LF and I basically wanted to be able to do the search and interact with the results as a set of objects (rather than XML).

Anyway, the first version of the gem (which is slightly different from the code first posted prior to C4L) can be found at the project page, here: http://rubyforge.org/projects/wcapi/


WorldCat API: draft ruby gem

By reeset / On / In OCLC, ruby

Since the WorldCat Grid API became available, I’ve spent odd periods of time playing with the services to see what kinds of things I could do and not do with the information being made available.  This was partly due to a desire to create an open source version of WorldCat Local.  It wouldn’t have been functionally complete (lacking facets, some data, etc), but it would be good enough, most likely, for folks wanting to switch to a catalog interface that utilized WorldCat as their source database, rather than their local ILS dbs.  And while the project was pretty much completed, the present service terms that are attached to the API make this type of work out of bounds at present (though, I knew that when I started – it was more experimental with the hope that by the time I was finished, the API would be more open).

Anyway…as part of the process, I started working on some rudimentary components for ruby to make processing the OCLC provided data easier.  Essentially, creating a handful of convenience functions that would make it much easier to call and utilize that data the grid services provided – my initial component was called wcapi. 

Tomorrow (or this morning if you are in Providence), a few folks from OCLC will be putting on a preconference related to the OCLC API at Code4Lib.  There, folks from OCLC will be giving some background and insight into the development of the API, as well as looking at how the API works and some sample code snippets.  I’d hoped to be there (I’m signed up), but an airline snafu has me stranded in transit until tomorrow – so by the time I get to Providence, all or most of the pre-conference will be over.  Too bad, I know.  Fortunately, there will be someone else from Oregon State in the audience, so I should be able to get some notes.

However, there’s no reason why I can’t participate, even while I’m jetting towards Providence – and so I submit my wcapi gem/code to the preconference folks to use however they see fit.  It doesn’t have a rakefile or docs, but once you install the gem, there is a test.rb file in the wcapi folder that demonstrates how to use all the functions.  The wrapper functions accept the same values as the API – and like the API, you need to have your OCLC developer’s key to make it work.  The wcapi gem will utilize libxml for most xml processing (if it’s on your machine, otherwise it will downgrade to rexml), save for a handful of functions that don’t use named namespaces within the resultset (this confuses libxml for some reason – though there is a way around it, I just didn’t have the time to add it).
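
The "use libxml if it is installed, otherwise fall back to rexml" behaviour can be sketched roughly as below. This is not the gem's actual code: the XML_BACKEND constant and the root_element_name helper are made up for illustration, and the require line assumes a current libxml-ruby release, which loads via 'libxml'.

```ruby
# Prefer libxml when the gem is installed; otherwise fall back to REXML.
begin
  require 'libxml'
  XML_BACKEND = :libxml
rescue LoadError
  require 'rexml/document'
  XML_BACKEND = :rexml
end

# Returns the root element name using whichever parser was loaded.
# (Hypothetical helper used only to show both code paths.)
def root_element_name(xml)
  if XML_BACKEND == :libxml
    LibXML::XML::Document.string(xml).root.name
  else
    REXML::Document.new(xml).root.name
  end
end

puts "#{XML_BACKEND}: #{root_element_name('<feed><entry/></feed>')}"
```

Keeping the choice in one constant means the rest of the code only has to branch where the two parser APIs actually differ.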

So, the gem/code can be found at: wcapi.tar.gz.  Since I’ll likely make use of this in the future, I’ll likely move this to source forge in the near future.

*******Example OpenSearch Call*******

require 'rubygems'
require 'wcapi'

client = WCAPI::Client.new :wskey => '[insert_your_own_key_here]'

response = client.OpenSearch(:q => 'civil war', :format => 'atom', :start => '1', :count => '25', :cformat => 'mla')

puts "Total Results: " + response.header["totalResults"]
response.records.each {|rec|
  puts "Title: " + rec[:title] + "\n"
  puts "URL: " + rec[:link] + "\n"
  puts "OCLC #: " + rec[:id] + "\n"
  puts "Description: " + rec[:summary] + "\n"
  puts "citation: " + rec[:citation] + "\n"
}


********End Sample Call****************

And that’s it.  Pretty straightforward.  So, I hope that everyone has a great pre-conference.  I wish that I was there (and with some luck, I may catch the 2nd half) – but if not, I’ll see everyone in Providence Monday evening.


LibraryFind next week

By reeset / On / In LibraryFind, ruby

As I’ve been working on 0.9, I’ve been trying to migrate a few odds and ends into the current 0.8 branch so that I can move them into production faster on our end.  To that end, I’ll be posting an update to LF by the beginning of next week.  These updates will include:

  1. Update to the Harvester (this will make it a bit more fault tolerant) as well as allowing harvesting of sets (currently) and root level oai providers (not provided currently).  This change required significant changes to the search component as well that deals with the harvested materials.
  2. Auto-detection of namespaces (for oai and sru — needed for libxml)
  3. Removed rexml dependencies for opensearch component
  4. Frozen gems for oai, sru and opensearch into vendor directories (and have added that to the environment.rb file)
  5. Prep code for solr/ferret decision.  I’ll be adding support to use either ferret or solr as your backend indexer for harvesting for 0.9, but some changes are being made to make this easier.  Ferret provides an integrated rails solution, while solr would provide a hosted index option.
  6. In addition to this, some changes to the libxml module have deprecated a call being used in the oai gem (maybe the sru gem).  I’ll take a look at both this week and update appropriately [as well as keep backwards compatibility if possible]
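
As a rough illustration of the namespace auto-detection in item 2: the idea is to read the namespace off the response's root element instead of hard-coding it, then hand that to the XPath queries. The sketch below uses REXML to stay dependency-free (the update itself targets libxml), and the sample document and the identifiers helper are hypothetical.

```ruby
require 'rexml/document'

# Hand-written stand-in for an OAI-PMH response (not real harvester output).
OAI_SAMPLE = <<~XML
  <OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
    <ListIdentifiers>
      <header><identifier>oai:example.org:1</identifier></header>
      <header><identifier>oai:example.org:2</identifier></header>
    </ListIdentifiers>
  </OAI-PMH>
XML

# Detects the namespace from the root element and uses it for the query,
# so the same code works whatever URI the provider declares.
def identifiers(xml)
  doc = REXML::Document.new(xml)
  detected = doc.root.namespace
  ns = { 'oai' => detected }
  REXML::XPath.match(doc, '//oai:identifier', ns).map(&:text)
end

puts identifiers(OAI_SAMPLE)
```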

Something else, we are starting to work with mod_rails.  This allows apache to manage the rails environment — eliminating the need to run rails through packs of mongrels (or other specialized serving mechanism).  I’ll write up something on our experiences for others that might be interested in this approach.


oai and sru gems updated

By reeset / On / In rails, ruby

In wrapping up the Libraryfind update, I’ve had to make some modifications to a few gem packages.  These are the ones updated:

  1. oai: http://rubyforge.org/projects/oai/
    Prior versions of the oai component only supported the ruby libxml gem 0.3.8-.  In order to support the newer versions of ruby-libxml (which do better memory management), I’ve updated the oai gem so that it supports both the older and newer ruby-libxml apis.  The primary changes are related to how it handles properties/attributes and the way that the oai provider class handles dates (to support OAI 2.0 validation).
  2. sru-ruby: http://rubyforge.org/projects/sru/
    Added libxml support.  For queries that return one or two records, the difference between using REXML or Libxml as your parser will be hardly noticeable.  However, when queries result in larger datasets, the difference will be measured in terms of seconds.  For example, in some tests, recordsets returning 50 records in MARCXML format could run close to 10 seconds using REXML.  On the other hand, the same search using libxml processes in 0.8 seconds.  For tools, like LibraryFind, which deal with larger datasets, this is a welcome change.
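
For anyone wanting to take similar measurements on their own data, a minimal timing harness might look like the following. It only times REXML, on a synthetic and much smaller record set than real MARCXML, so it will not reproduce the figures quoted above; sample_recordset is a made-up helper.

```ruby
require 'benchmark'
require 'rexml/document'

# Builds a synthetic result set; a tiny stand-in for real MARCXML records.
def sample_recordset(count)
  records = Array.new(count) do |i|
    "<record><controlfield tag=\"001\">#{i}</controlfield></record>"
  end
  "<collection>#{records.join}</collection>"
end

xml = sample_recordset(50)
record_count = nil

# Benchmark.realtime returns the elapsed wall-clock time in seconds.
elapsed = Benchmark.realtime do
  doc = REXML::Document.new(xml)
  record_count = doc.root.elements.size
end

puts format('REXML parsed %d records in %.4f seconds', record_count, elapsed)
```

Wrapping the libxml equivalent in a second Benchmark.realtime block would give a side-by-side comparison.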



XSLT and Ruby/Rails

By reeset / On / In ruby

While adding REST support for Libraryfind, I found that I wanted to provide an output in XML, but that could also provide HTML if an XSLT was attached.  In Rails, generating XML files is actually pretty easy.  In Rails, output is specified in views.  HTML views are created using a .rhtml extension, while xml views are created using a .rxml extension (at least, until Rails 2.0 when they are to change). 

Anyway, we use libxml for XML processing of large XML documents (since I just have never found REXML up to the task) and I was happy to find a gem based on libxml that provides XSLT support as well.  The libxslt-ruby gem provides a very simplified method for doing XSLT processing in ruby.  In the .rxml file, I add the following code:

if params[:xslt] != nil
  @headers["Content-Type"] = "text/html"

  require 'xml/libxml'
  require 'xml/libxslt'

  xslt = XML::XSLT.file(params[:xslt])
  xslt.doc = XML::Parser.string(_lxml).parse

  # Parse to create a stylesheet, then apply.
  s = xslt.parse
  return s.to_s
end
# No stylesheet requested: return the XML unchanged.
return _lxml

Simple and no fuss.  The XML::XSLT.file function can take a physical or remote file location.  To parse the file, you must pass into XML::XSLT.doc a reference to an XML::Document object.  Since libxml doesn’t provide a function to deal with xml data directly (it assumes that you are loading XML via a file), you simply need to use the included XML Parser object to create an XML::Document object dynamically.  Anyway, I’m finding that this works really well.


LibraryFind installation: dealing with problems relating to openssl and rubygems

By reeset / On / In ruby

I was installing LibraryFind on a server at Willamette University the other day for testing purposes, and ran into something that I had never seen before.  While setting up the dependencies on the test server, I found that the current version of ruby found in the distro’s YUM repository was old (1.8.5), so I decided to download and compile ruby from source.  So, here’s the steps that I followed:

  1. Downloaded Ruby 1.8.6 (current patchset)
  2. Compiled Ruby (no errors)
  3. Downloaded Rubygems
  4. Ran setup…which was successful
  5. Using RubyGems, I tried to install Rails and this is where I ran into problems.  The download starts and then throws the following error:
    ERROR:  While executing gem … (Gem::Exception)
        SSL is not installed on this system

So, I made sure openssl and the openssl-devel packages were installed on the machine.  Well, they were.  After digging around on the web, I couldn’t find anything that helped — however, I did find an email message from a long time back when we were compiling ruby to work with mysql and had to compile from source.  To make it work, we had to go into the ruby ext folder in the source and compile some files directly.  So, I figured I’d give it a try for openssl and it worked.  So, here’s the steps I followed:

  1. Navigate to ext/openssl in the ruby source folder.
  2. Once there, you run the following:
    ruby extconf.rb
    make install

  3. After running the install, you can ensure that ruby can “see” the openssl information by running the following from an irb prompt:
    >>require 'openssl'
    If everything is set up right, you will see the following: =>true.
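
The same check can be scripted so it reports a message rather than a stack trace when the extension is missing. A small sketch (the openssl_status helper is made up for illustration):

```ruby
# Returns a short status string instead of raising when OpenSSL is missing.
def openssl_status
  require 'openssl'
  "openssl loaded: #{OpenSSL::OPENSSL_VERSION}"
rescue LoadError => e
  "openssl missing: #{e.message}"
end

puts openssl_status
```

If this reports the extension as missing, that points to the same problem RubyGems was complaining about.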



LibraryFind 0.8.5: threading and XSLT and REST, Oh my

By reeset / On / In LibraryFind, ruby

At present, I’m wrapping up the back-end changes to what will be LibraryFind 0.8.5.  Yup, we’ll be skipping 0.8.4 in part because I’d like the release point to represent the broadness of the changes being made.  In fact, had the UI portions of the code been modified to completely support the new back-end searching, we’d likely jump the release point to 0.9. 

So what changes are coming in the back-end side of LibraryFind:

Minor Changes:

  1. SRU support: I’ve added a class to support SRU searching
  2. Connector class refactoring — I’ve made these a little more abstract and created them as a model.  Should continue to simplify the creation of new connector types when necessary.
  3. OpenURL resolution becoming optional:  While LibraryFind has been designed to integrate directly with some OpenURL services to provide inline resolution of items prior to showing them to the users — this does add a barrier for folks wanting to test.  So, I’ve made this optional.  If you don’t enable OpenURL resolution, it will just generate links to the resource if possible.
  4. Finally gotten rid of a warning message that was being thrown each time I processed a collection of Record objects.  When the value of the record objects property was nil, it would throw a warning.  It wasn’t causing any problems with rendering or searching, but it was filling the error logs with unnecessary information.
  5. Other optimizations (see trac for a full list)

Major Changes:

  1. Expanding the current API to include an asynchronous set of searching tools.  Currently, LibraryFind utilizes Ruby’s Thread classes to initialize the various searches that it needs to run to retrieve a set of results.  Unfortunately, while this process does thread the various searches, it still requires a primary thread that holds an HTTP connection open until all the threads have finished running.  So, in an effort to make searching more user friendly (and faster), I’ve made use of the spawn plug-in.  This was nice because it supports both ruby’s threads as well as forking of processes.  Add a job queue, some additional models and there you have it.  The 0.8.5 UI has been refactored to make use of the new API — but you will not see any changes to how the results are presented.  In the future, this will change.
  2. REST API.  LibraryFind has supported the full SOAP stack since its inception.  This is in part because I prefer working with SOAP and it worked.  Well, in order to support more light-weight search clients against LibraryFind, I’ve started adding a REST-based protocol.  At present, the only API method currently emulated is the search function.  However, I’m planning on including support for our Asynchronous search and Collection related API.  Hopefully, this will provide one more point for integration.
  3. XSLT support:  In addition to providing a REST based API, these queries will also accept an XSLT file which can be used to transform data on the fly.  This will allow for the development of UIs through the simple creation of an XSLT file. 
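
To make the limitation described in item 1 concrete, here is a minimal sketch of the thread-per-search pattern. The targets and their latencies are invented, and this is not LibraryFind's code. The searches do run in parallel, but the caller still cannot return until the slowest one finishes, and that wait is what the spawn and job queue approach removes.

```ruby
# Hypothetical search targets with simulated latencies (in seconds).
TARGETS = { 'catalog' => 0.05, 'oai' => 0.02, 'sru' => 0.01 }

# Runs one thread per target and collects the results into a hash.
def threaded_search(query)
  threads = TARGETS.map do |name, delay|
    Thread.new do
      sleep delay # stands in for the remote search
      [name, "#{name} results for #{query}"]
    end
  end
  # Thread#value joins each thread, so this line blocks until all are done.
  threads.map(&:value).to_h
end

started = Time.now
results = threaded_search('civil war')
elapsed = Time.now - started

puts results
puts format('waited %.2f seconds for the slowest target', elapsed)
```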

Anyway, that’s what’s on the menu for 0.8.5 at the moment.  My guess is that we’ll be doing a feature freeze shortly and start the quality control process.  So things are starting to move quickly.


LibraryFind and Mobile Services

By reeset / On / In Digital Libraries, LibraryFind, ruby

One of the things I was really impressed with while attending DLF was the presentation on the lightweight web platform being built at NCSU.  Leveraging their endeca catalog, the folks at NCSU have been able to produce a set of REST-based api for querying the catalog.  With those services, they’ve designed a mobile interface and a google widgets interface to their catalog.  It’s a good idea — one that I’m working on moving into LF.  LibraryFind already supports a SOAP-based API (which is actually my preference), but I’m running into more and more cases where a light-weight rest-based api would be nice.  Fortunately, ruby/rails makes this possible. 

Anyway, in the spirit of looking at building client services on top of libraryfind, I’ve created a quick facebook app using libraryfind.  Whether we’ll actually use it (or be able to contribute it to the facebook app list), who knows, but it was actually pretty simple to put together.  Here’s a couple of quick screenshots (you’ll notice I’m using the iframe option, mostly because I was getting annoyed while testing by the app getting dropped out of facebook).  This way, the user stays in facebook, but can query the service.

Facebook LibraryFind app Front:


Facebook LibraryFind Search/Results:



At this point, the app is just rendering LF in the iframe, though I imagine that an interface developed for mobile users would also work best within this environment.  I guess time (and usability testing) will tell (if there is even any interest in having something like this at all).



LibraryFind 0.8.4 upcoming changes

By reeset / On / In LibraryFind, ruby

At some point, I’ll likely move this to the LibraryFind blog.  I just realized that I couldn’t remember my login information to post to the blog — so, I’ll post here.

I’m not exactly sure if the UI changes will be made in 0.8.4 to incorporate the new spawning/pinging (I think that they will), but the next version of LibraryFind will include an extended set of API that will allow users to develop UI interfaces that allow searches to be done independent of the UI.  This was done using a spawn plugin (you can find it on rubyforge or linked in lf via svn:externals), the creation of a jobs queue, and changes in a lot of the backend server components to make this all work.  I’ve just checked into the 0.8.4-ping branch (and soon into trunk once I’ve finished testing and create some unit tests) all the code necessary to make this work.  This requires the following changes to the LF codebase:

  1. meta_search.rb refactor:  In order for this to work, I’ve had to move the cache checking code out of this file and move it into a more abstracted “connector” class.  The connector classes needed to be modified to support the new pinging functionality

  2. New models: job_query.rb, job_item.rb, cache_search_session.rb, cached_search.rb, oai_search_class.rb, z3950_search_class.rb, opensearch_search_class.rb

  3. API file changes: query_api.rb

  4. Controller changes: meta_search.rb, query_controller.rb, dispatch.rb, record_set.rb

  5. Migration changes: New migration has been added — 025_create_job_queues.rb

  6. Changes to the environment.rb file (these will be in the environment.rb.example file)

New SOAP API elements:

  • GetJobRecord (Queries the result set for a single job, returns the array of records)
  • GetJobsRecords (Queries the record sets for an array of jobs, returns the array of records)
  • SearchAsync (Returns an array of job ids, works like search)
  • SimpleSearchAsync (Returns an array of job ids, works like simple search)
  • CheckJobStatus (Returns a JobItem object for a single job)
  • CheckJobsStatus (Returns an array of JobItem objects for an array of jobs)

The presumed workflow with the api is as follows:

  1. Client will utilize the SearchAsync (or SimpleSearchAsync) API to initiate the search.  This will return an array of job ids.
  2. Client will periodically ping the server, using CheckJobStatus (or CheckJobsStatus) for a list of jobs and their status.  Status is an enumeration.  -1 == error, 0 == finished/idle, 1 == searching
  3. Client will retrieve records once items have been processed by calling GetJobRecord (or GetJobsRecords) for the array of records.
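
The three steps above can be sketched as a polling loop. Everything here is a stand-in: FakeLibraryFind is an in-memory class invented for the example, its method names only mirror the SOAP calls listed earlier, and a real client would go through the SOAP API and sleep between pings.

```ruby
# In-memory stand-in for the LibraryFind server (hypothetical).
class FakeLibraryFind
  STATUS_ERROR = -1
  STATUS_FINISHED = 0
  STATUS_SEARCHING = 1

  def initialize
    @jobs = {}
  end

  # Mirrors SearchAsync: starts one job per target and returns the job ids.
  def search_async(query, targets)
    targets.each_with_index.map do |target, i|
      id = @jobs.size + 1
      # Each fake job reports "searching" until it has been pinged i + 1 times.
      @jobs[id] = { polls_left: i + 1, records: ["#{target}: #{query}"] }
      id
    end
  end

  # Mirrors CheckJobsStatus: returns a hash of job id => status code.
  def check_jobs_status(ids)
    ids.map do |id|
      job = @jobs.fetch(id)
      job[:polls_left] -= 1 if job[:polls_left] > 0
      [id, job[:polls_left].zero? ? STATUS_FINISHED : STATUS_SEARCHING]
    end.to_h
  end

  # Mirrors GetJobsRecords: returns the records for the given jobs.
  def get_jobs_records(ids)
    ids.flat_map { |id| @jobs.fetch(id)[:records] }
  end
end

# Starts the search, pings for status, and collects records as jobs finish.
def run_search(server, query, targets, max_polls: 10)
  pending = server.search_async(query, targets)
  records = []
  max_polls.times do
    break if pending.empty?
    statuses = server.check_jobs_status(pending)
    finished = statuses.select { |_, s| s == FakeLibraryFind::STATUS_FINISHED }.keys
    failed = statuses.select { |_, s| s == FakeLibraryFind::STATUS_ERROR }.keys
    records.concat(server.get_jobs_records(finished))
    pending -= finished + failed # drop jobs that are done or errored
  end
  records
end

records = run_search(FakeLibraryFind.new, 'civil war', %w[catalog oai])
puts records
```

Because records are collected as each job finishes, a UI built this way could show the fast targets first instead of waiting on the slowest one.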

With this finished, I’ve completed pretty much all the current tags waiting for 0.8.4 and will be focusing on one additional enhancement that I’d like to see in 0.8.4 (if we can finish and test) or 0.8.5 — that being the addition of a REST-based XML api as well.  While I actually prefer the SOAP stack for some development, I would like to provide a REST-based api for more lightweight clients.

Anyway — that’s the news for now.