Nov 05, 2013
 

*******************************************************************************************************************************************************

I wanted to note that I’ve updated this post to correct/clarify two statements within it.

  1. The requirement of 2 wskeys
  2. Terms of use

OCLC has two wskey structures.  For developers that have been working with OCLC for a long time and have a wskey for the search services, OCLC can decommission the former key and create a new one that supports all functionality, or they can issue a second key for the Metadata API.  For new users, you simply need to request a key that includes the specific functionality you need.  In MarcEdit, I will continue to keep the keys separate for configuration purposes, but the values can be the same.

Secondly – related to terms of use.  I need to clarify: if you are a developer and have been using the WorldCat Platform Services outside of the Search API, the terms to use this service as a developer are no different from the other licensing terms you or your institution may have agreed to.  However, if you are a cataloger, the terms to retrieve/create a developer key are very likely different from the terms of service associated with your license for the cataloging service.  Users need to be aware of these differences, because your organization may (and likely will) care.

*******************************************************************************************************************************************************

I’ve been starting to work on a couple of different write-ups around working with the OCLC Metadata API (http://oclc.org/developer/services/worldcat-metadata-api) – my general impressions and some specific improvements that really need to be in place to make this resource more usable.  But I’ve also realized that I have neglected to give any kind of write-up about using the API within MarcEdit, save for an early YouTube video.  So, I’m taking a little bit of time here to jot down some information related to how the API can be utilized within MarcEdit.

Background:

So a little bit of background.  Over the past couple of years, I’ve been looking for opportunities to make use of the great work that OCLC Research does in exposing some interesting new data streams to the library community.  When OCLC first released their classify service a number of years ago, I believe I was probably one of the first people to grab it and integrate it into a product that could be used within existing cataloging workflows.  This year, I expanded the service to include integration with OCLC’s FAST headings service.  While OCLC’s services do have some baggage attached (in terms of who is allowed to use them), the baggage is pretty small and they provide real value to the library community.  Providing them through MarcEdit made a lot of sense.

The same can be said of OCLC’s new Metadata API.  For a number of years, individuals in the user community have been asking for the ability to interact directly with WorldCat, specifically around the ability to set and delete holdings.  The API provides this functionality, in addition to the ability to add and edit bibliographic records, as well as functionality around local data records.  For the first round of integration, I’ve limited my work primarily to dealing with holdings and bibliographic data.  I’ll be looking for people interested in the local data functionality to see how MarcEdit might be augmented to support that as well.

Early Use cases

Part of the reason I chose to work with and support the specific API actions that I did was existing use cases.  Processing around e-books and e-resources has led users within the MarcEdit community to make a couple of common requests related to OCLC and WorldCat.  Specifically, users are asking to:

  1. Automate the process of setting holdings
  2. Automate the upload of new records without using Connexion
  3. Automate Headings validation

The API provides the ability to address the first two requests, and I hope that, with time, OCLC will make some of their heading validation services available to support the third.  But for now, MarcEdit’s development has really been focused on supporting these two specific use cases.

First Impressions

I’ll take some time to provide some additional feedback in a few days, but my first impressions were mixed.  On the one hand, interacting with the service was pretty straightforward once I understood what the API expected in terms of requests and responses.  On the other hand, very little documentation exists.  Often, I would trigger errors on purpose because simple things, like acceptable schemas, are left undocumented but show up in an error message.  Unfortunately, the missing documentation definitely complicates working with the service, as things related to validation, reasons for authentication errors, etc. simply are not documented, and their descriptions are fairly cryptic when read as a response from the API.

However, I think anyone that does development is probably used to working with scarce documentation, and to OCLC’s credit, they have been very responsive in providing answers to the holes that currently exist in the documentation.  From my perspective, the most concerning aspect of the API right now is the authentication.  From a developer’s standpoint, the authentication process is easy enough.  For the user, however, the process for getting keys to utilize tools built upon the services is fairly problematic, because the process assumes that users are developers and forces key requests and management through that portal.  Likewise, these keys come with new terms of use outside of your traditional cataloging licensing and may (likely will) need to be vetted through an organization’s legal counsel.  Again, this is primarily a problem for non-developers – who have relationships with OCLC in ways outside of the WorldShare Platform Services…but these are primarily the users that this integration in MarcEdit targets (i.e., catalogers, not developers).  This honestly is my biggest concern with the service right now (that, and the fact that keys are tied to individuals, not institutions) – bigger than the issues related to documentation, validation, and the somewhat cryptic nature of the feedback received through the API.

Using the API in MarcEdit

So, to use the OCLC Metadata API in MarcEdit, there are a couple of things that you need to do.  First, you need to request your OCLC keys.  MarcEdit’s integration requires that users have the appropriate WSKeys:

  1. OCLC’s Search API Key
  2. OCLC’s Metadata API Key

For long-term OCLC Platform users (think Search API), this means requesting two keys, because the previous key format isn’t compatible with their new authentication system.  For new users, a single key should suffice.  Regardless, a key that can support OCLC’s Search functionality is required to support MarcEdit’s ability to query the WorldCat database.

Likewise, MarcEdit needs a key that can utilize the Metadata API.  Again, for legacy API users, this will likely be a separate key; for new users, the search and metadata keys should be the same.  This key has two parts – the key and a secret.  You need both to make requests.  Likewise, there are several other values that users need to know in order to utilize the Metadata API services: a principalID, a principalIDNS, a schema, a holdings symbol, and a 4-character holdings code.  All of these values need to be available in order to work with the various functions provided by the API.
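For anyone curious how these values fit together, here is a rough sketch of HMAC request signing with a WSKey/secret pair.  This is illustrative only: the exact normalized-request format and Authorization header layout are defined in OCLC’s WSKey documentation, and the normalization string and parameter names below are simplified assumptions rather than the documented format.

using System;
using System.Security.Cryptography;
using System.Text;

// Rough sketch of HMAC request signing with a WSKey/secret pair.
// The normalized string built here is a simplified placeholder – consult
// OCLC's WSKey documentation for the real normalization rules.
class WskeySigner
{
    public static string Sign(string wskey, string secret, string method, string url)
    {
        string timestamp = DateTimeOffset.UtcNow.ToUnixTimeSeconds().ToString();
        string nonce = Guid.NewGuid().ToString("N");

        // Placeholder normalization: key, timestamp, nonce, HTTP method, URL.
        string normalized = string.Join("\n", wskey, timestamp, nonce, method, url) + "\n";

        using (var hmac = new HMACSHA256(Encoding.UTF8.GetBytes(secret)))
        {
            string signature = Convert.ToBase64String(
                hmac.ComputeHash(Encoding.UTF8.GetBytes(normalized)));

            // The principalID/principalIDNS values entered in the preferences
            // travel in the Authorization header alongside the signature.
            return string.Format(
                "clientId=\"{0}\", timestamp=\"{1}\", nonce=\"{2}\", signature=\"{3}\"",
                wskey, timestamp, nonce, signature);
        }
    }
}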

Once you have these keys and supplemental information, you need to enter them into MarcEdit.  Users will be able to see the menu entries for many of the OCLC functions, but until a user has entered their OCLC key data into the MarcEdit preferences and validated the data – these functions will remain disabled.

In the Preferences area, there is a new tab that has been set up to store data related to OCLC’s Metadata services.

[Image: OCLC preferences tab in MarcEdit]

From this screen, users can enter all the information that they will need in order to utilize the various Metadata and Search API functions.  MarcEdit will store this information in its configuration file, but it encrypts the data using a methodology that would make it impractical to reconstruct the key data.
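As an aside, for anyone building something similar, one common way to protect stored key data on Windows is the Data Protection API.  The sketch below is illustrative only, and is not necessarily the methodology MarcEdit uses internally.

using System;
using System.Security.Cryptography;
using System.Text;

// Illustrative only: one way a Windows app could protect stored key data.
// This is not necessarily the scheme MarcEdit uses in its configuration file.
static class KeyStore
{
    // Extra entropy makes the protected blob harder to reuse out of context.
    private static readonly byte[] Entropy = Encoding.UTF8.GetBytes("oclc-key-store");

    public static string Protect(string secret)
    {
        byte[] cipher = ProtectedData.Protect(
            Encoding.UTF8.GetBytes(secret), Entropy, DataProtectionScope.CurrentUser);
        return Convert.ToBase64String(cipher);
    }

    public static string Unprotect(string stored)
    {
        byte[] plain = ProtectedData.Unprotect(
            Convert.FromBase64String(stored), Entropy, DataProtectionScope.CurrentUser);
        return Encoding.UTF8.GetString(plain);
    }
}

Using DataProtectionScope.CurrentUser ties the encrypted blob to the Windows account that stored it, which is one way to keep the key data from being lifted out of a copied configuration file.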

After a user enters their data, they should click Apply, and then Validate.  The Validate process is what enables the OCLC API integration functions.  MarcEdit performs a couple of API operations on an existing test record to determine if the information the user provided will support the required functionality.  As MarcEdit validates a key, it will turn the textbox either green (validated) or red (to indicate a problem).

[Image: Example of a problem with a key]

[Image: Example of all keys validated]

Once the data has been validated – the OCLC Menu Items found in the Main Window and on the MarcEditor will be enabled.

[Image: Main Window – OCLC Functions]

[Image: MarcEditor OCLC Functions]

Using the OCLC API Functions

Hopefully, the functions are fairly straightforward to use and look identical regardless of where you interact with them within the application.  At present, MarcEdit provides a function for setting holdings and working with bibliographic data.

Setting Holdings

[Image: MarcEdit’s Holdings Tool]

MarcEdit provides a straightforward tool for updating a user’s holdings for a batch of records.  MarcEdit’s holdings tool can process MARC data, MarcEdit’s mnemonic data, or a plain text file of OCLC numbers.  The tool does exactly what it says.  Using the information set in the Preferences, MarcEdit will pre-populate your institution code and classification schema.  Users then need to select an action.  MarcEdit’s batch tool can update or delete holdings, but will perform just one of those actions on all records within a designated file.  That means you cannot upload a file that contains records that need to have holdings both added and deleted; the tool requires that these actions be isolated into discrete operations.
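To give a sense of what a batch holdings run looks like under the hood, here is a minimal sketch of looping over OCLC numbers and issuing one set-or-delete holdings request per number.  The endpoint URL and parameter names are placeholders (note the example.org host), not the documented API, and the Authorization header stands in for the signed WSKey header sketched earlier.

using System;
using System.Collections.Generic;
using System.Net;

// Sketch only: loop over OCLC numbers and set (or delete) holdings one at a time.
// The endpoint and parameter names below are placeholders, not the documented API.
class HoldingsBatch
{
    // Placeholder base URL for an institution-holdings resource.
    const string HoldingsUrl = "https://worldcat.example.org/ih/data";

    public static void Run(IEnumerable<string> oclcNumbers, string symbol, string schema, bool delete)
    {
        foreach (string number in oclcNumbers)
        {
            string url = string.Format("{0}?oclcNumber={1}&inst={2}&classificationScheme={3}",
                HoldingsUrl, number, symbol, schema);

            var request = (HttpWebRequest)WebRequest.Create(url);
            request.Method = delete ? "DELETE" : "POST";   // one action per file, per the tool
            request.ContentLength = 0;
            request.Headers["Authorization"] = "placeholder for the signed WSKey header";

            try
            {
                using (var response = (HttpWebResponse)request.GetResponse())
                {
                    Console.WriteLine("{0}: {1}", number, response.StatusCode);
                }
            }
            catch (WebException ex)
            {
                Console.WriteLine("{0}: failed ({1})", number, ex.Message);
            }
        }
    }
}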

Update/Creating Bibliographic Data

[Image: MarcEdit’s Bibliographic Tool]

When working with MarcEdit’s bibliographic updating/creation tool for WorldCat, users can work with files that contain both new and updated records.  OCLC’s system and MarcEdit evaluate the records within a file, and the presence or absence of an OCLC number in a record determines whether a bibliographic record is created or updated.  Of course, it’s not that simple – all bibliographic records passed through the API must be validated on the OCLC side, and at this point, that validation only occurs when the data is submitted for update or creation – a weakness that I hope will be rectified at some point, since it causes a number of issues when working in a batch record environment.
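Conceptually, the routing logic looks something like the sketch below.  The MarcRecord type and the Send* helpers are hypothetical stand-ins for MarcEdit’s real MARC handling and API plumbing; the point is simply that the presence or absence of an OCLC number decides the operation.

using System;
using System.Collections.Generic;

// Sketch of the create-vs-update routing. MarcRecord and the Send* methods are
// hypothetical stand-ins for real MARC parsing and API calls.
class BibRouter
{
    public class MarcRecord { public string OclcNumber; }   // simplified stand-in

    public static void Route(IEnumerable<MarcRecord> batch)
    {
        foreach (var record in batch)
        {
            if (string.IsNullOrEmpty(record.OclcNumber))
                SendCreate(record);                     // no OCLC number: create a new record
            else
                SendUpdate(record.OclcNumber, record);  // OCLC number present: replace/update
            // Either call can still fail OCLC-side validation, which is only
            // reported at submission time.
        }
    }

    static void SendCreate(MarcRecord r) { Console.WriteLine("create"); }
    static void SendUpdate(string num, MarcRecord r) { Console.WriteLine("update " + num); }
}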

Working within the MarcEditor

When working within the MarcEditor, the tools for working with Holdings and Bibliographic data are identical to those outside of it.  The main difference is that the data being acted upon is the data in the MarcEditor.  This means that the MarcEditor needs to include one additional function: the ability to retrieve data from WorldCat.  MarcEdit can search and download a record set from WorldCat for editing.

[Image: Searching WorldCat from Within the MarcEditor]
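Under the hood, this kind of search is just a query against the WorldCat Search API.  A minimal sketch follows, assuming an SRU-style endpoint; the base URL and query syntax are placeholders (again, example.org), so consult OCLC’s Search API documentation for the real values.

using System;
using System.Net;

// Sketch only: a keyword query against a WorldCat Search API (SRU-style) endpoint.
// The base URL and query syntax are assumptions, not OCLC's documented values.
class WorldCatSearch
{
    const string SruBase = "https://worldcat.example.org/search/sru";   // placeholder endpoint

    public static string Search(string keyword, string wskey)
    {
        string url = string.Format("{0}?query={1}&maximumRecords=25&wskey={2}",
            SruBase, Uri.EscapeDataString("srw.kw all \"" + keyword + "\""), wskey);

        using (var client = new WebClient())
        {
            return client.DownloadString(url);   // XML results, ready to convert for the MarcEditor
        }
    }
}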

The search tool provides a handy way for users to quickly retrieve data from WorldCat for editing.  However, this tool also underlines one of the glaring issues with the API – especially around record editing.  Traditionally, when catalogers work on a record, they can lock the record for editing.  This prevents other users from changing the record while they are working on it, and protects the transaction information in the 005, which OCLC requires for updating purposes.  When working with the API, there appears to be no way to lock a record.  This means that records can be downloaded and edited, and then be invalid on upload because someone else made an edit that updated the 005 information.  And since OCLC doesn’t provide a method for validating information prior to upload, users won’t know that this problem has occurred until an upload is attempted.  Again, in order to make the API functionally equivalent to OCLC’s other editing services, there needs to be a way for catalogers to lock records for editing.

Where we go from here

This is really hard to say.  I believe that OCLC sees the release of the Metadata API as the first of a much larger portfolio of API services to provide programmatic support for their WorldShare endeavors.  So, it will be interesting to see how these develop.  At the same time, I’m really interested in seeing if the organization puts the type of resources behind the development of their API Services Portfolio needed to provide really meaningful services that librarians, developers, and 3rd-party library developers can use to work with, consume, or augment the wide range of data streams that they have available.  On the one hand, I’m hopeful that the release of the Metadata API signals a wider commitment to providing transparent access to a wide variety of services.  At the same time, the lack of development around the OCLC Search API – which has seen only incremental changes since it was first released some 5 or so years ago – makes me wonder how much support exists within the larger organization for this type of work.  So I guess that is a question that remains to be answered.

 

–TR

Oct 10, 2013
 

I thought I’d take a quick moment to highlight some work that was done by one of the programmers here at The OSU, Peter Dietz.  Peter is a bit of a DSpace wiz and a contributor to the project, and one of the things that he’s been interested in working on has been the development of a REST API for DSpace.  You can see the notes on his work on this GitHub pull request: https://github.com/DSpace/DSpace/pull/323.

<sidenote>

Thankfully, I’m at a point in my career where I no longer have to be the individual that wrestles with DSpace’s UI development, but I’ve never been a big fan of it.  From the days when the interface was primarily JSP to the XSLT interfaces that most people use today (it sounded like a good idea at the time), I’ve long pined for the ability to separate DSpace interface development from the actual application, and move that development into a framework environment (any framework environment).  However, the lack of a mature REST API has made this type of separation very difficult.

</sidenote>

The work that Peter has done introduces a simple READ API into the DSpace environment.  A good deal more work would need to be done around authentication to manage access to non-public materials as well as expansions to the API around search, etc., but I think that this work represents a good first step. 

However, what’s even more exciting are the demonstration applications that Peter has written to test the API.  The primary client that he’s used to test his implementation is a Google Play application, which was developed utilizing an MVC framework.  While a very, very simple client, it’s a great first step, I think, that shows some of the benefits of separating interface development from the core repository functionality, as changes related to the API or development around the API no longer require recompiling the entire repository stack.

Anyway – Peter’s work and his notes can be found as part of this GitHub pull request: https://github.com/DSpace/DSpace/pull/323.  Here’s hoping that either through Peter’s work, or new development, or a combination of the two, we will see the inclusion of a REST API in the next release of the DSpace application.

–tr

Apr 25, 2012
 

One of the hats I wear is as a member of the Independence Library Board.  I love it because I don’t work with public libraries as often as I’d like to in my real job, and honestly, the Independence Public Library is the center of the community.  The Library is a center for adults looking for education opportunities, kids looking for resources, and home to a number of talented librarians that are dedicated to encouraging a love of reading in our community.  It’s one of the few libraries I’ve ever known to have both children’s and adult reading programs, and it takes advantage of that in the summer by having the adults and kids compete against each other to see who logs the most pages (the kids always win).

Each board meeting is interesting, because as the economy became more difficult for people, more people turned to the library.  Every month, the library sees more circulations, more bodies in the building, more kids, more adults – just more.  And they do it on a budget that doesn’t accurately reflect the impact that they have on the community. 

Anyway, one of the things that the Library has going for it is a very active friends program – and through that group (and some grant funds), the library was able to purchase a number of laptop computers for circulation within the Library.  The Library currently has some 8-10 terminals that are always being used; the laptops would provide additional seats and allow people to work anywhere within the library using the wifi.

The Library set up the laptops using the usual software – DeepFreeze, etc. – to provide a fairly locked down environment.  However, what was missing was a customizable timer on the machines.  Essentially, the staff was looking for a way to make it easier for patrons checking out the laptops to avoid fines.  The laptops circulate for a finite period of time within the building.  Once that time is over, the clock starts ticking for fines.  To avoid confusion, and to help make it easier for patrons to know when the clock was running out, I offered to work on building a simplified timer/kiosk program.

The impetus for this work comes from Access 2007 I think.  I had attended the hackfest before the conference and one of the project ideas was an open source timing program.  I had worked on and developed a proof of concept that I passed on.  And while I never worked on the code since – I kept a copy myself.  When we were talking about things that would be helpful, I was reminded of this work. 

Now, unfortunately, I couldn’t use much of the old project at all.  The needs were slightly different – but it helped me have a place to start so that I wasn’t just looking at a blank screen.  So, with idea in hand, I decided to see how much time it would take to whip together an application that could meet the needs. 

I’ll admit, nights like tonight make me happy that I still do more than write code in scripting languages like python and ruby.  Taking about 3 hours, I put together a feature complete application that meets our specific needs.  I’ll be at the Oregon Library Association meeting this week, and if folks find this kind of work interesting, I’ll make it a bit more generic and post the source for anyone that wants to tinker with it.

So what does it do?  It’s pretty simple.  Basically, it’s an application that keeps time for the user and provides some built-in kiosk functionality to prevent the application from being disabled.

Here are a few of the screen shots:

[Image: When the program is running, you see the clock situated in the task tray]

[Image: Click on the icon, and see the program menu]

[Image: Preferences – password protected]

Because we have a large Hispanic population, all of the strings will need to be translatable.  This one is essentially just the locked message.  I’ll ensure the others are customizable as well – maybe with an option to just use Google Translate (even though it is far, far from perfect) if getting the gist across is what matters most.

[Image: Run an action (both functions require a password)]

[Image: Place your cursor over the icon to get the minutes]

[Image: Information box letting you know you are running out of time]

[Image: Sample lockout screen]

In order to run any of the functions, or to disable the lockout screen, you must authenticate yourself.  What’s more, while the program is running, it creates a low-level keyboard hook to capture and pre-process all keystrokes, disabling things like the escape key, the Windows key, and ctrl+alt+del, so that once this screen comes up a user cannot break out of it without shutting off the computer (which would result in needing to log in).  Coupled with DeepFreeze and some group policy settings, my guess is that this will suffice.
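For anyone curious about the keyboard hook piece, the sketch below shows the standard WH_KEYBOARD_LL pattern for swallowing keys such as Escape and the Windows keys while a lockout screen is active.  It is illustrative only, not the timer application’s actual code.

using System;
using System.Diagnostics;
using System.Runtime.InteropServices;
using System.Windows.Forms;

// Minimal sketch of a low-level keyboard hook that swallows selected keys
// (Escape and the Windows keys here) while the lockout screen is active.
static class KeyboardLock
{
    const int WH_KEYBOARD_LL = 13;

    delegate IntPtr LowLevelKeyboardProc(int nCode, IntPtr wParam, IntPtr lParam);

    [DllImport("user32.dll", SetLastError = true)]
    static extern IntPtr SetWindowsHookEx(int idHook, LowLevelKeyboardProc lpfn, IntPtr hMod, uint dwThreadId);

    [DllImport("user32.dll", SetLastError = true)]
    static extern bool UnhookWindowsHookEx(IntPtr hhk);

    [DllImport("user32.dll")]
    static extern IntPtr CallNextHookEx(IntPtr hhk, int nCode, IntPtr wParam, IntPtr lParam);

    [DllImport("kernel32.dll", CharSet = CharSet.Auto)]
    static extern IntPtr GetModuleHandle(string lpModuleName);

    static LowLevelKeyboardProc _proc = HookCallback;
    static IntPtr _hookId = IntPtr.Zero;

    public static void Enable()
    {
        using (Process cur = Process.GetCurrentProcess())
        using (ProcessModule mod = cur.MainModule)
        {
            _hookId = SetWindowsHookEx(WH_KEYBOARD_LL, _proc, GetModuleHandle(mod.ModuleName), 0);
        }
    }

    public static void Disable()
    {
        UnhookWindowsHookEx(_hookId);
    }

    static IntPtr HookCallback(int nCode, IntPtr wParam, IntPtr lParam)
    {
        if (nCode >= 0)
        {
            int vkCode = Marshal.ReadInt32(lParam);   // first field of KBDLLHOOKSTRUCT
            if (vkCode == (int)Keys.Escape || vkCode == (int)Keys.LWin || vkCode == (int)Keys.RWin)
                return (IntPtr)1;                     // swallow the keystroke
        }
        return CallNextHookEx(_hookId, nCode, wParam, lParam);
    }
}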

The source code itself is a few thousand lines of code, with maybe 1,000 or 1,500 lines of actual business logic and the remainder around the UI/threading components.  Short and simple.

Hopefully, I’ll get a chance to do a little testing and get some feedback later this week – but for now – I’m just happy that maybe I can give a little bit back to the community library that gives so much to my family.  And if I hear from anyone that this might be of interest outside my library – I’ll certainly post the code up to github.

–TR

Mar 08, 2011
 

When working with the 64-bit flavor of Windows, there are a couple of quirks that you just need to accept.  First, when Microsoft designed Windows for 64-bit processing, they weren’t going to break legacy applications, and second, how this gets done makes absolutely no sense unless you simply have faith that the operating system will take care of it all for you.

So why doesn’t it make sense?  Well, Windows 64-bit systems are essentially operating systems with two minds.  On the one hand, you have the system designed to run 64-bit processes, but it lives within an ecosystem where nearly all Windows applications are still designed for 32-bit systems.  So what is an OS designer to do?  Well, you have both versions of the OS running, obviously.  Really, it’s not that simple (or complex) – in reality, Microsoft has a 32-bit emulator that allows all 32-bit applications to run on 64-bit versions of Windows.  However, this is where it gets complicated.  Since 64-bit processes cannot access 32-bit processes and vice versa, you run into a scenario where Windows must mirror a number of system components for both 32- and 64-bit processes.  They do this two ways:

  1. In the registry – Microsoft has what I like to think of as a shadow registry that exists for WOW64 processes (that’s the 32-bit emulator, if you can’t tell) – to register COM objects and components accessible to 32-bit applications.
  2. The presence of a System32 folder (this is, ironically, where the 64-bit libraries live) and a SysWOW64 folder (this is where the 32-bit libraries live), which replicate large portions of the Windows API and provide system components for 32-bit and 64-bit processes.

So how does this all work together?  Well, Microsoft makes it work through redirection.  Quietly, transparently, the Windows operating system will access the applicable part of the system based on the process type (either 64 or 32 bit).  The problem arises when one is running a 32-bit process but wants access to a dedicated 64-bit folder or registry key.  Because redirection happens at the system level, programming tools, API access, etc. are all redirected to the appropriate folder for the process type.

Presently, I’m in the process of rebuilding MarcEdit, and one of the requirements is that it run natively in either 32- or 64-bit mode.  The problem is that there is a large cohort of users that are working with MarcEdit on 64-bit systems, but have it installed as a 32-bit app.  So, I’ve been tweaking an Installer class that will evaluate the user’s environment (a 32-bit process running on a 64-bit OS) and move the appropriate 64-bit files into their correct locations, so that when the user runs the program for the first time, the necessary components will be available to the C# code, which is compiled to run on Any CPU (so on a 64-bit OS, it will run natively as a 64-bit app).

So how do we do this?  Actually, it’s not all that hard.  Microsoft provides two important APIs for the task: Wow64DisableWow64FsRedirection and Wow64RevertWow64FsRedirection.  Using these two functions, you can temporarily disable redirection within your application.  However, you want to make sure you re-enable it when your required access to these components is over (or apparently, the world as we know it will come to an end).  So, here’s my simple example of how this might work:

#region API_To_Disable_File_Redirection_On_64bit_Machines
public static string gslash = System.IO.Path.DirectorySeparatorChar.ToString();

[DllImport("kernel32.dll", SetLastError = true)]
private static extern bool Wow64DisableWow64FsRedirection(ref IntPtr ptr);

[DllImport("kernel32.dll", SetLastError = true)]
private static extern bool Wow64RevertWow64FsRedirection(IntPtr ptr);
#endregion

#region Fix 64bit
if (System.Environment.Is64BitProcess == false && System.Environment.Is64BitOperatingSystem == true)
{
    //Some data elements need to be moved from the 64bit folder to the System32 directory
    try
    {
        string[] files = System.IO.Directory.GetFiles(AppPath() + "64bit" + gslash);
        string windir = Environment.ExpandEnvironmentVariables("%windir%");
        string system32dir = Path.Combine(windir, "System32");
        if (system32dir.EndsWith(gslash) == false)
        {
            system32dir += gslash;
        }

        //Turn off the file system redirection that happens on 64 bit systems.
        IntPtr ptr = new IntPtr();
        bool isWow64FsRedirectionDisabled = Wow64DisableWow64FsRedirection(ref ptr);
        foreach (string f in files)
        {
            try
            {
                string filename = System.IO.Path.GetFileName(f);
                if (System.IO.File.Exists(system32dir + filename) == false)
                {
                    System.IO.File.Copy(AppPath() + "64bit" + gslash + filename, system32dir + filename);
                }
            }
            catch (System.Exception yyy)
            {
                //System.Windows.Forms.MessageBox.Show(yyy.ToString());
            }
        }

        //Turn the redirection back on
        if (isWow64FsRedirectionDisabled)
        {
            bool isWow64FsRedirectionReverted = Wow64RevertWow64FsRedirection(ptr);
        }
    }
    catch
    { }
}
#endregion

 

And that’s it.  Pretty straightforward.  As noted above, the big thing to remember is to re-enable the redirection.  If you don’t, lord knows what will happen.

 

–TR

Mar 02, 2011
 

One of the questions that consistently comes up with the advent of Windows Vista’s and Windows 7’s use of UAC is how to run applications or processes without being prompted for a username/password.  There are a number of places online that talk about how to use the C# classes + the LogonUser API to impersonate a user.  In fact, there is a very nice class written here that makes it quite easy.   However, these options don’t work for everything – and one specific use case is during installation.  So, for example, you’ve written a program and you want to provide automatic updating through the application.  To do the update, you’d like to simply shell to an MSI.  Well, there is a simple way to do this.  Using the System.Diagnostics assembly, you can initiate a new process.  So, for example:

// P/Invoke declaration (placed at class level) for the Win32 API that converts
// a long path to its short (8.3) form – msiexec is passed the short path below.
[DllImport("kernel32.dll", CharSet = CharSet.Auto)]
static extern uint GetShortPathName(string lpszLongPath, StringBuilder lpszShortPath, int cchBuffer);

System.Security.SecureString password = new System.Security.SecureString();
string unsecured_pass = "your-password";
for (int x = 0; x < unsecured_pass.Length; x++)
{
    password.AppendChar(unsecured_pass[x]);
}
password.MakeReadOnly();

StringBuilder shortPath = new StringBuilder(255);
GetShortPathName(@"c:\your file name.msi", shortPath, shortPath.Capacity);

System.Diagnostics.ProcessStartInfo myProcess = new System.Diagnostics.ProcessStartInfo();
myProcess.Domain = "my-domain";
myProcess.UserName = "my-admin";
myProcess.Password = password;
myProcess.FileName = "msiexec";
myProcess.Arguments = "/i " + shortPath.ToString();
myProcess.UseShellExecute = false;
System.Diagnostics.Process.Start(myProcess);

So, using the above code, a program running as a standard user can initiate a call to an MSI file and run it as though you were an administrator.  So, why does this work?  First, in order for this type of elevation to work, you need to make sure that UseShellExecute is set to false.  However, if you set it to false, you cannot execute any process that isn’t an executable (and a .msi isn’t an executable).  So we solve this by calling the msiexec program instead.  When you run an MSI installer, this is the parent application that gets called.  So, we call it directly.  Now, since we are calling an executable, we can attach a username/password/domain to the process we are about to spawn.  This allows us to start the program as if we were another user.  And finally, when setting the path to the MSI file we want to run, we need to remember to call that file using the Windows short file naming convention.  There is an API (GetShortPathName, declared at the top of the example above) that will allow you to accomplish this.

If you have this set up correctly, when the user runs an MSI file, they will not be prompted for a username or password – however, they will still see a UAC prompt asking to continue with the installation.  It looks like this (sorry, had to take a picture):

[Image: UAC prompt acknowledging the elevated installation]

So, while the user no longer has to enter credentials to run the installer, they do have to still acknowledge that they wish to run the process.  But that seems like a livable trade off.

–TR

Mar 13, 2009
 

I had mentioned that I’d quickly developed a helper gem to simplify using the WorldCat API in ruby (at least, it greatly simplifies using the API for my needs).  This was created in part for C4L, and partly because I’m moving access from Z39.50 to the API when working with LF, and I basically wanted to be able to do the search and interact with the results as a set of objects (rather than XML).

Anyway, the first version of the gem (which is slightly different from the code first posted prior to C4L) can be found at the project page, here: http://rubyforge.org/projects/wcapi/

–TR

Feb 22, 2009
 

Since the WorldCat Grid API became available, I’ve spent odd periods of time playing with the services to see what kinds of things I could and could not do with the information being made available.  This was partly due to a desire to create an open source version of WorldCat Local.  It wouldn’t have been functionally complete (lacking facets, some data, etc.), but it would be good enough, most likely, for folks wanting to switch to a catalog interface that utilized WorldCat as their source database, rather than their local ILS dbs.  And while the project was pretty much completed, the present service terms that are attached to the API make this type of work out of bounds at present (though I knew that when I started – it was more experimental, with the hope that by the time I was finished, the API would be more open).

Anyway…as part of the process, I started working on some rudimentary components for ruby to make processing the OCLC-provided data easier.  Essentially, creating a handful of convenience functions that would make it much easier to call and utilize the data the grid services provided – my initial component was called wcapi.

Tomorrow (or this morning if you are in Providence), a few folks from OCLC will be putting on a preconference related to the OCLC API at Code4Lib.  There, folks from OCLC will give some background and insight into the development of the API, as well as look at how the API works and some sample code snippets.  I’d hoped to be there (I’m signed up), but an airline snafu has me stranded in transit until tomorrow – so by the time I get to Providence, all, or most, of the pre-conference will be over.  Too bad, I know.  Fortunately, there will be someone else from Oregon State in the audience, so I should be able to get some notes.

However, there’s no reason why I can’t participate, even while I’m jetting towards Providence – and so I submit my wcapi gem/code to the preconference folks to use however they see fit.  It doesn’t have a rakefile or docs, but once you install the gem, there is a test.rb file in the wcapi folder that demonstrates how to use all the functions.  The wrapper functions accept the same values as the API – and like the API, you need to have your OCLC developer’s key to make it work.  The wcapi gem will utilize libxml for most XML processing (if it’s on your machine; otherwise it will downgrade to rexml), save for a handful of functions that don’t use named namespaces within the resultset (this confuses libxml for some reason – though there is a way around it, I just didn’t have the time to add it).

So, the gem/code can be found at: wcapi.tar.gz.  Since I’ll likely make use of this in the future, I’ll move it to SourceForge in the near future.

*******Example OpenSearch Call*******

require 'rubygems'
require 'wcapi'

client = WCAPI::Client.new :wskey => '[insert_your_own_key_here]'

response = client.OpenSearch(:q=>'civil war', :format=>'atom', :start => '1', :count => '25', :cformat => 'mla')

puts "Total Results: " + response.header["totalResults"]
response.records.each {|rec|
  puts "Title: " + rec[:title] + "\n"
  puts "URL: " + rec[:link] + "\n"
  puts "OCLC #: " + rec[:id] + "\n"
  puts "Description: " + rec[:summary] + "\n"
  puts "citation: " + rec[:citation] + "\n"

}

********End Sample Call****************

And that’s it.  Pretty straightforward.  So, I hope that everyone has a great pre-conference.  I wish that I was there (and with some luck, I may catch the 2nd half) – but if not, I’ll see everyone in Providence Monday evening.

–TR

Dec 08, 2008
 

As I’ve been working on 0.9, I’ve been trying to migrate a few odds and ends into the current 0.8 branch so that I can move them into production faster on our end.  To that end, I’ll be posting an update to LF by the beginning of next week.  These updates will include:

  1. Update to the Harvester (this will make it a bit more fault tolerant), as well as allowing harvesting of sets (supported currently) and root-level oai providers (not currently supported).  This change required significant changes to the search component that deals with the harvested materials as well.
  2. Auto-detection of namespaces (for oai and sru — needed for libxml)
  3. Removed rexml dependencies for opensearch component
  4. Frozen gems for oai, sru and opensearch into vendor directories (and have added that to the environment.rb file)
  5. Prep code for solr/ferret decision.  I’ll be adding support to use either ferret or solr as your backend indexer for harvesting for 0.9, but some changes are being made to make this easier.  Ferret provides an integrated rails solution, while solr would provide a hosted index option.
  6. In addition to this, some changes to the libxml module have deprecated a call being used in the oai gem (maybe the sru gem).  I’ll take a look at both this week and update appropriately [as well as keep backwards compatibility if possible]

Something else: we are starting to work with mod_rails.  This allows apache to manage the rails environment, eliminating the need to run rails through packs of mongrels (or another specialized serving mechanism).  I’ll write up something on our experiences for others that might be interested in this approach.

–TR

Nov 03, 2008
 

The other day, I posted what I see as some very big concerns with OCLC’s revised policy (currently being reconsidered) on the transfer of records, two of which I would consider deal breakers.  In that post, I made the argument that maybe it was time to consider breaking OCLC up to reflect what it has become — an organization with two distinct facets: a membership component and a vendor component.  This comment led to a conversation with someone at OCLC who questioned whether I honestly believed that the library community would be better off if OCLC was broken up, and it was obvious from our conversation that on this point, we would simply need to agree to disagree.  As a side note, I think that these types of disagreements and conversations are actually really important to have.  I’m always nervous about communities or groups in which everyone agrees, since it usually means that people either are not thinking critically or no one really cares.  Secondly, I think that we all (OCLC and myself, for that matter) want what’s best for the library community — we just have different visions of what that might be.

Anyway, back to my topic.  Now, I’m going to preface this discussion by saying that this is obviously my own opinion and one that may not be shared by many people within the library community (I really have no idea).  Even within the library open source community, where I’m sure this opinion would be more prevalent (or at least entertained), I’m pretty sure I’m still in the minority.  But as I say, I think that these conversations are important to consider — specifically as we move down a path where OCLC is very quickly positioning themselves to become the library community’s default service provider for all things library (in terms of ILL, ILS interface, cataloging, etc.).

So when I talk about breaking up OCLC, exactly what am I talking about?  Well, in order to follow me down the path that I am going to take you, we have to talk about OCLC as I currently see them.  Watching OCLC during the 10 years (I can’t believe it’s actually been 10 years) that I have been in libraries, I have seen a quickening evolution of OCLC from strictly a member-driven organization to more of a hybrid organization.  On the one hand, there is what many would consider the membership side of OCLC, that being WorldCat, ILL, and their research and development office.  On the other hand, there is OCLC’s vendor arm…a good example of this would be WorldCat Local and WorldCat Navigator.  So how do I make these distinctions?  Membership services are those that I would consider core services.  These are services that OCLC has developed to add value to what OCLC likes to refer to as the Library Commons (WorldCat).  OCLC’s vendor services are those tools or programs that OCLC sells on top of the Library Commons, of which I think WorldCat Local/Navigator is a good example.  Now, at this point, I know that folks at OCLC (and likely in the membership) would argue that both WorldCat Local and Navigator provide services that the OCLC membership is currently requesting.  I won’t deny that — however, I would answer that OCLC’s treatment of the Library Commons (WorldCat) as its own closed, personal community has the unintended effect of limiting the library community’s (and I include both commercial and non-commercial entities in my definition of community) ability to develop new service models.  In effect, we become much more dependent on how OCLC envisions the future of libraries.  Let me try and tease this out a little bit more…

Philosophically, the biggest problem that I have with the current situation is the commingling of OCLC’s treatment of the Commons (WorldCat) and their current strategy of being the sole commercial entity with the ability to interact with the Commons.  I’m a firm believer that the more diverse the landscape or ecology, the more likely it is that innovation will take place.  We’ve seen this time and time again both inside (Evergreen and Koha certainly have shaken up the traditional ILS market) and outside (web browsers are a good example of how competition breeds innovation) the library community.  However, by isolating the Commons, OCLC is threatening this diversity of thought.  Now, I have a whole set of different issues with the current library ILS community, but in this case, I think that OCLC’s treatment of the Commons, and their ability to leverage that service, unfairly skews the ability for both commercial and non-commercial entities to provide innovative services on top of the Commons (and before anyone jumps on me about non-commercial use, let me finish my thoughts here).  Commercially, I’m fairly certain that the current crop of ILS vendors would very much like to provide their own WorldCat Local/Navigator interfaces to their customers, and I’m sure they would be able to tie these interfaces closely with services already provided by the user’s ILS.  I could envision things like ERM (electronic resource management), simplified requesting, etc. all being possible if the likes of ExLibris or Innovative Interfaces were allowed to build tools upon the Library Commons (WorldCat).  Maybe I would like to develop my own version of WorldCat Local/Navigator that interacts with the Commons and sell it as a product (kind of the same way ezproxy was sold prior to being acquired by OCLC), or a group of researchers would like to do the same.  As a commercial entity, I’m fairly certain that this type of development model wouldn’t be kosher with OCLC unless I licensed access to WorldCat (and I’m not certain that they would grant it, given that this would compete against one of their services).  Likewise, open source folks like LibLime or Equinox may like to create an open source version of the WorldCat Local interface.  Under the current guidelines, I understand that an open source implementation of WorldCat Local can exist — but as I understand that agreement, I’m not certain whether groups like LibLime or Equinox (or another entity) could take that project and then sell support-based services around it (I’m unclear on that one).  However, it’s very unlikely that the library world will see any of these types of developments (well, maybe the open source WorldCat Local, since I have a group that could use this and a number of people interested in developing it) because OCLC has come to treat what it calls the Commons (WorldCat) as its own personal data store.  There’s that commingling again.

So if it was up to me, how would I resolve this situation?  Well, I see two possible scenarios. 

  1. Open up WorldCat.  OCLC likes to refer to WorldCat as the Library Commons — well, let’s treat it as such.  Remove the barriers to access and allow anyone and everyone the ability to essentially have their own copy of the Library Commons and its data.  Now, rather than specifying terms of transfer and telling libraries under what conditions they can and cannot make their metadata available to other groups, the membership could consider what type of open data license the Commons could be made available under.  Something like the Creative Commons share-alike license, which allows for both commercial and non-commercial usage but requires all parties to contribute all changes to the data back to the community (in essence, this is kind of what Open Library is doing with their metadata), may be appropriate.  OCLC would be free to develop their own products, but the rest of the library community (both the library and vendor communities) would have equal opportunity to develop new services and ways of visualizing the data found in the Commons.  Does this devalue the Commons (WorldCat)?  I don’t think so — look at Wikipedia.  It uses this model of distribution, yet I’ve never heard anyone say that this devalues its content.  Would there be challenges?  For sure.  Probably one of the biggest would be the way that it would change what it means to be a member of OCLC.  If each person could download their own personal copy of the Commons, would libraries stay members?  I’m certain that they would — but I’m sure that what it means to be a member would certainly change.
  2. Split OCLC’s membership services from OCLC’s vendor services.  Under this scenario, WorldCat Local/Navigator development would be spun off from OCLC as a separate business (this happens in academia all the time).  Were this to happen, OCLC would be able to develop license terms that could then be leveraged by all members of the commercial library community, removing the artificial advantage OCLC is currently able to exploit (both in terms of data and in deciding who is allowed to work with the Commons).  In all likelihood, I think that this model represents the smallest change for the membership, and it would continue to allow OCLC to make the Commons more available for non-commercial development without artificially limiting other groups interested in building new services.

One last thought.  In talking to people today, I heard a number of times that OCLC restricting access to the Commons was in fact a good thing, in part because it finally allowed the library community the ability to leverage resources not available to the vendor communities.  In some way, we could finally stick it to them.  That’s fine, I’m all for developing tools and services, but this particular type of thinking I find worrisome.  If we, as a community, feel that we are unable to develop compelling tools and services that can compete with other vendor offerings without an artificial advantage — well, that’s just sad, and it says a little something about how we see ourselves as a community.  And this too is something that I’d like to see change, because if you look around, you will see that there are a myriad of projects (Koha, Evergreen, VuFind, Fedora, DSpace, LibraryFind, XC Catalog, Zotero, etc.) where developers (some library developers, some not) are re-envisioning many of the services within the library and putting their time and effort into realizing those visions.

 

–TR

Oct 08, 2008
 

An interesting question brought up by David Seaman here at the Readex Digital Institute.  David provided the opening keynote for the conference and in it, discussed a process that Dartmouth College went through this year to consider how the library can become more nimble within our networked world.  How the library can be given a license, if you will, to allow the library to release services that aren’t perfect and are iterative in their development cycle.  And while I can certainly get where David is coming from, I don’t think that the logical outcome is a service model that lives within perpetual beta.

I think that part of my objection to this phrase comes from my development background.  As a developer, I’m a firm believer in a very iterative approach to development.  Those that use MarcEdit can probably attest (sometimes to their dismay) that updates can come with varying frequency as the program adapts to changes within the metadata community.  Since MarcEdit doesn’t follow a point release cycle in the traditional sense, could it be a beta application?  I guess that might depend on your definition of beta — it certainly would seem to meet the criteria that Google applies to many of its services.  However, there is a difference, I think.  While I take an iterative approach to development, I also want to convey to users that I will support this resource into the future, and as beta, that’s not the case.  My personal opinion is that services that languish in beta are doing two things:

  1. telling users that these services are potentially not long-term, so they could be gone tomorrow
  2. giving users a weak apology for the problems that might exist in the software (and deflecting blame for those issues, because hey, it’s beta).

So, I don’t see this as the road to nimble development.  Instead, when I heard David’s talk, I heard something else.  I believe libraries fail to innovate because, as a group, we are insecure that our users will leave us if we fail.  And when I hear talks like David’s, that’s what I hear his organization saying.  They are asking the library administrators for permission to fail, because what is technological research but a string of failures in search of a solution?  We learn through our failures, but I think that as a community, our fear of failing before our users can paralyze us.  The fear of failing before an administration that does not support this type of research and discovery can paralyze innovation by smart, energetic people within an organization.  A lot of people, I think, look at Google, their labs, their beta services and say, yes, that is a model that we should emulate.  But I don’t think they fully understand that within this model, Google builds failure into their development plan as an acceptable outcome.  If libraries want to emulate anything from the Googles or the Microsofts of the world, they should emulate that, and engender that type of discovery and research within their libraries.

I am actually very fortunate at Oregon State University in that I have a director and an administration that understands that for our library to meet the needs of our users now and in the future, we cannot be standing still.  And while the path we might take might not always be the correct one, the point is that we are always moving and always learning and refining our understanding of what our users want today and tomorrow.  What I’d like to see for David’s library is that kind of environment — and I wish him luck in it.

–TR
