Top take-aways from Code4lib 2009

By reeset / On / In code4lib09

This is a very abbreviated list that I’ll expand with a longer, more formal report, but this year’s Code4Lib left me with a few specific take-aways:

  1. I need to spend some more serious time considering how Linked Data projects currently being developed could impact my library.  I followed peripherally the work that Ed Summers did with before it was taken down, but with more players (including LC) formally moving into this space, I think that the ability to build linkages to data on the web is going to get a whole lot easier.
  2. I want something like Brown’s Dashboard initiative at my library.  The ability to track statistical information about our library services and serve it in a way that could be used by decision makers at all levels is very attractive.
  3. I’m going to be setting up a djatoka server when I get back to OSU.  djatoka is an open source JPEG2000 server created by the folks at Los Alamos Labs that certainly looks robust and easy to use – and dovetails nicely into a project that I’m personally interested in spending more time on at OSU.  Recently, OSU joined Flickr Commons and has started to release images to the collections.  In Flickr, these images are seeing in a week about the same usage as we see in a year for collections in CONTENTdm.  So, it makes me wonder if CONTENTdm, though a fine platform, is the most efficient way of making our image collections available – especially given that many of the archival functions could be accomplished using something like DSpace, our IR software.
  4. I’m still not a believer in the OLE project (and to some degree, the XC project – though that seems to have more legs to me since they are farther along).  I know that I just came from an open source/project lovefest and generally we don’t criticize interesting projects but… Both of these projects are doing a lot of good work, and I think they will contribute a great deal of research and information regarding functionality and workflows that will help services develop into the future.  However, I think that both of these projects are 3-4 years too late.  I honestly believe that the ILS as we know it is dead (I know, someone should tell libraries – but it’s coming).  I think that if OCLC ever gets their act together and really makes a commitment to open the WorldCat database, then that is a game changer.  But whether it’s through products like WorldCat Local, Summon, etc., I think that the idea of a localized catalog will be an antiquated concept by the time the OLE (and likely the XC) products have finally reached a level of production.  I certainly could be wrong (and would be happy to be so), but the needs that these projects are looking to solve have solutions now (if just a handful of barriers preventing the free flow of data were removed).
  5. I’m ready to talk about something other than Solr.  Yes, it’s cool.  Yes, it is driving many of the next gen. library interfaces and is being used by libraries in lots of exciting ways – and yes, we’ve now spent a good deal of time talking about it for 3 years – so let’s talk about what’s next.
  6. I actually left feeling a little bad for OCLC.  Now, I certainly have no problem letting it be known when I think that they are acting outside of the interests of the membership.  Or letting them know that I think that their current service model for their API sucks.  Or that they screwed up when they tried to modify the records transfer policy.  But at the same time, we do need to give them credit where it is due.  OCLC’s grid services team is doing a lot of good work trying to create a set of APIs that will finally let the membership work with data available in WorldCat.  Yes, there are gaps.  Yes, the current terms of service limit the API’s usefulness tremendously – but at the same time, a year ago these didn’t exist, and from my conversations with OCLC, they are showing a real willingness to listen to comments as to how they can make this service useful for members.  And while I doubt it will be as open as I’d like, I think that in time (and I wouldn’t be surprised if this was in the near term) we will see some significant changes to the service terms to allow libraries to make use of what could be a tremendous resource.  And while there was a general interest in what OCLC was doing, there was also some good-natured, and not so good-natured, fun poked at OCLC’s expense (of which Roy likely bore the brunt).  I say, push them, prod them – disagree with them – but there have been some really impressive gains made over the last year, and I don’t think that OCLC is getting credit for that.  (Though, if they don’t do something this year to make this API more open – well, I’ll have to take some of this back.)


So those are my quick thoughts.  I’ll post a more comprehensive write-up over the weekend (likely).


Yann Martel’s Life of Pi

By reeset / On / In Literature

I find that these days it’s very rare that I get an opportunity to really sit down and read a book simply for the fun of it.  As a former English major and current librarian, this might seem sacrilegious, but I generally don’t have the time.  And when I do, I like to know that the books that I read will be worth reading, which is likely why I tend to fall back to reading the classics – works that I enjoy on each read simply because I take something new away from them on each pass.

However, about a month or so ago (maybe more, I guess, now), my wife read Life of Pi by Yann Martel and really enjoyed it.  She would tell me about it, and honestly, from the descriptions, it didn’t leave me very excited.  However, as I prepared to travel to Code4Lib, I decided I’d take all the free time that I was going to have on planes (close to 10 hours in all) and give the book a read.

For those who haven’t read this book (and may want to), I won’t give it away.  But I will say a few things about it.  This was a book that I found that I really enjoyed, though it is to be endured during the first third or so, much in the same way that Herman Melville’s Moby Dick is endured during the endless chapters on whaling (though, unlike the whaling chapters, the reader that endures is rewarded throughout the book).  The story follows the tale of an Indian boy whose family is lost at sea when their ship goes down in the Pacific.  But he’s not the only survivor – as he finds himself sharing his lifeboat with a zebra, a hyena and a Bengal tiger.  Told as a narrative, the book tells the story of the young boy as he survives on the Pacific Ocean for 277 days and the uneasy bond that he shares with the tiger…maybe.

While this book may sound like a Robinson Crusoe story, it’s not.  The story has many layers, but I think what struck me most about the book was its treatment of perspective and faith.  Throughout the book, Martel tells two stories – though it is not until we reach the end of the book that these two stories start to become clear and we see how each story is reflected in the other.  At the end of the book, we are invited to pick the story that we wish to believe – as well as to try to reconcile for ourselves the story that has been told.

Overall, I found this book to be a very good read.  But more importantly, one that compelled me to go back and re-read sections of the book as well as reconsider my own initial impressions of the stories being told.  So if you are looking to add a book to your reading list, I highly recommend it.


‡ MarcEdit Plug-in

By reeset / On / In code4lib09, LibLime, MarcEdit

So, one of the presentations at Code4Lib this year discussed one of the latest initiatives to come out of the LibLime company, ‡.  ‡ is a repository of ~30 million MARC records released under an Open Data license.

From my perspective, one of the things that I found most interesting about the ‡ platform is the support for developers.  ‡ provides a very simple-to-use push/pull data model over simple HTTP.  It’s something that I’ve been interested in taking a look at for a little while and, after chatting with Joshua Ferraro after his session, I decided to see how difficult it would be to actually work with this service.

So, I decided to create a plug-in.  Much in the same way MarcEdit has a Connexion batch editing plug-in, I created a ‡ helper plug-in.  I’ll make the plug-in available for download with the next version of MarcEdit through the Plugin Manager (so, likely when I get back from Code4Lib), but I have posted the source code (in C#) and a short YouTube video for folks wanting to see how it works.

While I’m not sure if ‡ will catch on as a service, I’m very impressed by LibLime’s effort to create a large, shared cataloging database that comes with a set of APIs that let developers integrate directly with it.  Hopefully, OCLC will follow LibLime’s lead and eventually add push functionality to WorldCat – allowing membership and library developers an opportunity to develop their own cataloging interfaces to the resource.  Until then…

YouTube video (I think YouTube has finished processing it – if not, check back):

C# Source code:


Finally in Providence

By reeset / On / In code4lib09

Whew – made it in early enough to make my afternoon meetings and chat with a few folks.  The flight today was great.  Got upgraded to first class for the lion’s share of the trip, and got to get in some reading (the Life of Pi – so far, so good).  After my meetings, got in a great four mile run and am now just relaxing doing a little bit of coding.  Anyway, looking forward to the rest of the conference. 


On my way to Providence

By reeset / On / In code4lib09

So, while my trip to Providence got off to a rocky start yesterday, I must say that I had a fantastic experience working with the Delta representatives getting it resolved, and as a bonus – this is the first time travelling as a Gold Medallion member – so when I checked in, I was automatically bumped to first class.  Rock on!  Hopefully the changes don’t have an adverse effect on the back end of the trip (I’m flying home on United and have had trouble in the past when tickets for one leg need to be re-issued) – but as of today, I’m looking forward to flying in style.

See you all soon in Providence!


WorldCat API: draft ruby gem

By reeset / On / In OCLC, ruby

Since the WorldCat Grid API became available, I’ve spent odd periods of time playing with the services to see what kinds of things I could and could not do with the information being made available.  This was partly due to a desire to create an open source version of WorldCat Local.  It wouldn’t have been functionally complete (lacking facets, some data, etc.), but it would be good enough, most likely, for folks wanting to switch to a catalog interface that utilized WorldCat as their source database, rather than their local ILS dbs.  And while the project was pretty much completed, the service terms that are attached to the API make this type of work out of bounds at present (though I knew that when I started – it was more experimental, with the hope that by the time I was finished, the API would be more open).

Anyway…as part of the process, I started working on some rudimentary components for ruby to make processing the OCLC provided data easier.  Essentially, creating a handful of convenience functions that would make it much easier to call and utilize that data the grid services provided – my initial component was called wcapi. 

Tomorrow (or this morning if you are in Providence), a few folks from OCLC will be putting on a preconference related to the OCLC API at Code4Lib.  There, folks from OCLC will be giving some background and insight into the development of the API, as well as looking at how the API works and some sample code snippets.  I’d hoped to be there (I’m signed up), but an airline snafu has me stranded in transit until tomorrow – so by the time I get to Providence, all or most of the pre-conference will be over.  Too bad, I know.  Fortunately, there will be someone else from Oregon State in the audience, so I should be able to get some notes.

However, there’s no reason why I can’t participate, even while I’m jetting towards Providence – and so I submit my wcapi gem/code to the preconference folks to use however they see fit.  It doesn’t have a rakefile or docs, but once you install the gem, there is a test.rb file in the wcapi folder that demonstrates how to use all the functions.  The wrapper functions accept the same values as the API – and like the API, you need to have your OCLC developer’s key to make it work.  The wcapi gem will utilize libxml for most xml processing (if it’s on your machine, otherwise it will downgrade to rexml), save for a handful of functions that don’t use named namespaces within the resultset (this confuses libxml for some reason – though there is a way around it, I just didn’t have the time to add it).
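
For anyone curious about the libxml/rexml fallback, here is a minimal sketch of the general Ruby pattern, not the gem’s actual code.  The helper name first_available_library is hypothetical; it simply tries each library in order and remembers the first one that loads.

```ruby
# Hypothetical helper (not part of wcapi): try each library name in order
# and return the first one that can be required, or nil if none load.
def first_available_library(candidates)
  candidates.find do |name|
    begin
      require name
      true
    rescue LoadError
      false
    end
  end
end

# Prefer the libxml bindings when they are installed, otherwise fall back to REXML.
XML_BACKEND = first_available_library(['libxml', 'rexml/document'])
puts "XML backend: #{XML_BACKEND.inspect}"
```

The rest of the code can then branch on XML_BACKEND once, rather than rescuing LoadError in every function that parses XML.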

So, the gem/code can be found at: wcapi.tar.gz.  Since I’ll likely make use of this in the future, I’ll probably move it to SourceForge before long.

*******Example OpenSearch Call*******

require 'rubygems'
require 'wcapi'

# The constructor name was dropped from the original listing;
# WCAPI::Client.new is assumed here.
client = WCAPI::Client.new :wskey => '[insert_your_own_key_here]'

# The wrapper takes the same parameters as the API's OpenSearch call.
response = client.OpenSearch(:q => 'civil war', :format => 'atom', :start => '1', :count => '25', :cformat => 'mla')

puts "Total Results: " + response.header["totalResults"]
response.records.each {|rec|
  puts "Title: " + rec[:title] + "\n"
  puts "URL: " + rec[:link] + "\n"
  puts "OCLC #: " + rec[:id] + "\n"
  puts "Description: " + rec[:summary] + "\n"
  puts "citation: " + rec[:citation] + "\n"
}

********End Sample Call****************

And that’s it.  Pretty straightforward.  So, I hope that everyone has a great pre-conference.  I wish that I was there (and with some luck, I may catch the 2nd half) – but if not, I’ll see everyone in Providence Monday evening.
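
One caveat worth noting about the sample: in Ruby, building output with + raises a TypeError if a field comes back nil.  Below is a minimal sketch of a more forgiving formatter.  It uses a plain hash with placeholder values as a stand-in for a record (the real gem isn’t needed to run it), and format_record is a hypothetical helper, not something in wcapi.

```ruby
# Hypothetical helper: build the display text for one record without
# raising when a field is missing. String interpolation turns nil into "".
def format_record(rec)
  [
    "Title: #{rec[:title]}",
    "URL: #{rec[:link]}",
    "OCLC #: #{rec[:id]}",
    "Description: #{rec[:summary] || '(none)'}"
  ].join("\n")
end

# Stand-in record with placeholder values; with the gem, this would be
# one element of response.records.
sample = { :title => 'Example title', :link => 'http://example.org/1', :id => '1', :summary => nil }
puts format_record(sample)
```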


We must be doing something right

By reeset / On / In Family

Thursday, we sat down and met with Kenny’s teacher for a parent/teacher conference.  He’s doing exceptionally well.  He reads, writes, etc. all above grade level — and I understand is quite the orator during their sharing times. 🙂

Anyway, during the meeting, Kenny was asked to pick out a story that he’d written (a short one) and this is the one that he’d picked.


For those that might have a hard time reading phonetically (this is the first year where he’s really started worrying about spelling – but in general for stories, they are told to just write and not worry about spelling), it says [without correction]:

Wen i grow up i want to be just like my dad.  I will be just like him.  Giving speches all a round the werld.  Doing stuf on the computer and do wat my dad did to the tv.

A sweet story — and one that reminds me that my two boys are always watching what I’m doing.  Oh, and the stuff that he was talking about with the TV is noted here:


Trying out Windows 7 with the TV

By reeset / On / In Media PC, Netbook, Windows 7

One of the benefits of having a large flat screen TV is the wide abundance of video inputs.  At present, there are 8 on our TV, 4 HD,  2 RCA style, 1 cable and 1 PC connection (DVI).  This leaves lots of room for experimentation. 

As noted on Saturday, I picked up a netbook, the MSI Wind.  Officially, the netbook is for my wife – who has taken up blogging and book reviewing as a hobby.  It gives her a chance to work away from the desktop and in the living room with me in the evening.  However, unofficially, it’s also a toy for me to try new things, as well as play with a netbook (and evaluate the feasibility of getting these for work).

Anyway, one of the first things that I did with the netbook was drop Windows XP for the test build of Windows 7.  I’ve been looking for a place to try it, and since my wife has been using Vista for a little over a year (I like the parental controls) – I figured that Windows 7 wouldn’t be too much of a difference.  As I noted in the previous post, I was very pleased with how well it has run on this netbook.  In all honesty, it boots faster than the original XP system (it takes about 10 seconds to go from a cold boot to active) and can run the Aero interface on much lighter hardware than Vista (the netbook, for example, has no problem with it using the integrated video card).  As I spend time working with the system, I have been really impressed with the polish in a beta build (especially when you consider what the first public Vista beta looked like).

So, that was a long explanation for my current project.  I have a lot of computers at home, ranging from the oldest machine that I still use occasionally, which I purchased from Dell in 1998 (I just keep updating the components), up to our current desktop.  Well, yesterday, I was sitting around and wanted to watch an episode of the Simpsons.  I didn’t have the particular episode that I was interested in watching, so I went online and found it (on Hulu, I think).  Anyway, I hooked up the netbook and fed the video directly to the TV, and it worked so well through the DVI cable that I thought I’d set up my own media PC and see how well I could get it to work.

So….I dusted off a 4-year-old media PC that I rarely turn on and decided to wipe the hard drive and install Windows 7 to see how the new media functionality worked on it (as well as see how Windows 7 worked on a really old system that I wouldn’t put Vista on).  Again, as with the netbook, it took approximately 20 minutes to install, apply current updates and create a user account.  As well, Windows 7 recognized all the old hardware, and appropriate drivers were added.  From there, I moved the computer next to the TV and let the experiment begin.  I figure, if I like this setup, I’ll likely end up purchasing a light-weight PC that I can use as a media machine – but for now – this will do.  With the current setup, the machine can stream content directly to the TV (unfortunately, not recording yet – though I might pick up a cheap TV input card so I can set up a simplified DVR).  So right now, I’m hanging out at home watching the last episode of the Office (which I missed), an episode of He-Man and the Masters of the Universe (I love old cartoons) and then an episode of Married…With Children (another one of my guilty pleasures).  I’ve been primarily watching episodes via Hulu – and I’m surprised at the video quality.  It’s not DVD quality – but certainly as good as anything off VHS or cable.  I’m not sure if this is because of Hulu’s quality, my high speed internet (thank you Monmouth-Independence) or a combination of things.

But so far, so good.  It looks like this little setup will work great for testing and seeing why so many people are getting DVR devices.  If anyone has done something similar, give me a shout.  Otherwise, I’ll continue to post updates as I (and my boys) spend time feeding videos through our new test media PC (the boys like watching cartoons online as much as I do, though they seem to be more interested in the current content on Cartoon Network).


MarcEdit update notes

By reeset / On / In MarcEdit

Since a few folks have asked, the update mentioned last week is coming along nicely.  At this point, I have a number of MarcEdit users that are testing new features and the install program (both individually and in an enterprise environment).  Things are looking good, so it will be out the door as soon as all reports have come in.  Some notes again on the update:

  • Extending the Z39.50/SRU client to provide multi-target searching.
    This is a fairly often requested feature, but I’ve always held off adding it as a courtesy to OCLC (a number of years ago, I’d received a request when MarcEdit was set to add a Z39.50 client to keep it simple.)  It’s been a long time and times have changed — so I’ll be lifting this restriction slowly.  To start, the program will allow searching of 3 databases at once.  As I finish rethreading the requesting engine, I’ll likely remove even that restriction. 
  • Improved large UTF-8 file support in MarcEdit:
    MarcEdit has what I call a preview mode.  It essentially opens a snippet of a large file into an editor so users can preview global changes that they might run on a record.  This is how I like to work with large file sets.  However, some people like loading the entire file into a viewable space.  Files under 30 MB are fairly harmless and easy to load into any traditional editor, but files over 30 MB are difficult to view in Windows.  The reason for this is memory.  For every byte loaded, it takes ~4 bytes of RAM to view.  So, a 30 MB file consumes 120 MB of system memory.  A 60 MB file would consume 240 MB of memory, and so on (I’m oversimplifying this, but you get the general idea).  MarcEdit has traditionally relied on a customized version of what is called the RichEdit component.  This is the same component Microsoft leverages in WordPad.  It provides fairly good UTF-8 support, as well as a good deal of formatting support.  The downside is speed.  When loading lots of data, the component tends to get bogged down, as part of the loading process is converting data not in richtext to richtext on the fly.  For smaller documents, the conversion isn’t noticed.  For larger documents, it’s a real time drain.  Likewise, character sets like Hebrew, Arabic and Asian sets cause it all kinds of speed and resource problems.  So after sending emails back and forth to Microsoft’s support, I finally came up with a solution: write my own text rendering built upon the general textbox components in Windows.  It took a while, but I’ve got it working, and early reports from testers are that it’s faster and consumes less memory.  Of course, since I spun my own, I also set a hard limit on the total file size that can be loaded into the editor at a half GB.  In reality, you’d run out of system memory before you were ever able to load a 1/2 GB, so this seemed more than reasonable.  Of course, if you have files larger than a 1/2 GB, you can load them just fine into MarcEdit for editing, so long as you utilize the preview mode.
  • Installation:
    You wouldn’t think that this would be a big deal.  I mean, installing a program is one of the basic parts of program design.  However, when you are talking about a large number of users or an enterprise system, it gets very complicated.  MarcEdit previously utilized an installer that relied on the Nullsoft installer with a number of plug-ins and shims that I’d written myself.  It was great for what it did, but poor for large enterprise-level installations.  However, up until recently, I really didn’t care because most people that use MarcEdit simply download and install it.  But times are changing as more libraries move from in-house IT management to centralized IT management.  The downside of this is that desktop management is all done through a central switch, with users losing the ability to install and manage their own software.  In the past month, I’ve had two system administrators (one from a very large university system) contact me inquiring about changing the install model.  They want to be able to manage the program centrally.  Made sense, but when you have a user community measured in the tens of thousands, making these types of significant changes takes a lot of planning and work because you need to bring everyone along at the same time.  There are a lot of gory details involved in making this work, but after many long nights, I finally was able to slay the beast, which is the Windows installer, and come up with a package that would be usable.  Yay me!  Right now, I’m having users from a wide variety of institutions, that have done various levels of customization, testing the installer within their environments.  So far, the only comments received have been cosmetic, which makes me think that this is close.
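
To make the memory arithmetic from the large-file note above concrete, here is a rough sketch in Ruby.  It just encodes the (admittedly oversimplified) ~4 bytes of RAM per byte of file and the half GB cap.  The method names are hypothetical, treating a half GB as 512 MB is an assumption, and none of this is MarcEdit’s actual code.

```ruby
# Rule of thumb from the notes above: each byte of file needs ~4 bytes of RAM to view.
RAM_BYTES_PER_FILE_BYTE = 4
# Assumed value for the half GB cap on loading a whole file into the editor.
MAX_FULL_LOAD_MB = 512

# Estimated RAM (in MB) needed to view a file of the given size in full.
def estimated_view_memory_mb(file_size_mb)
  file_size_mb * RAM_BYTES_PER_FILE_BYTE
end

# Files above the cap can still be edited, but only through preview mode.
def load_mode(file_size_mb)
  file_size_mb > MAX_FULL_LOAD_MB ? :preview : :full
end

[30, 60, 600].each do |size_mb|
  puts "#{size_mb} MB file: ~#{estimated_view_memory_mb(size_mb)} MB of RAM, mode: #{load_mode(size_mb)}"
end
```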


One additional note.  I’ll be participating with a number of folks in the coming year on examining RDA and how it will require tools to change to support the new models.  As I work on this and make changes, I’ll try to relay any pertinent information back to the user community.



New toys

By reeset / On / In Netbook, Windows 7

So like most folks, I’m hanging out today kind of watching the Super Bowl – but am also playing with a new toy, my MSI Wind netbook.  This is a nifty little device, with a 1.6 GHz processor, 1 GB of RAM, a 160 GB hard drive and a 10.1 inch screen, weighing about 2 lbs.  The machine came installed by default with Windows XP – but part of the reason why I decided to pick up this little device was to have a platform to test Windows 7, and the netbook seemed like as good a place as any to test out the claims that this OS will take fewer resources, be more efficient, etc.  So, occasionally, I may jot down a few notes of my experiences using my little device.

First thoughts are I like this little machine.  It has a full keyboard so I don’t feel cramped typing, has expandable memory (so soon I’ll have 2 GB on the machine), 3 USB ports, an SD card reader – all the stuff that you really need to get things onto and off of the machine.  The device seems fairly rugged and I cannot believe how light it is.  I’ve tried the ASUS, Acer and HP netbooks and I really like the design that the folks at MSI have used here.  Oh, and the 6 cell battery – looks like I’ve been getting about 3 1/2 hours of burn time.  Oh, and this is quiet and cool.  The laptops that I have at home simply run hot (I’m rarely cold during the winter so long as I have my laptop).  This little device simply doesn’t put out much heat and is super quiet.  A nice, unexpected change from previous hardware used.

Windows 7…At home, I use a wide number of operating systems.  I have a couple of flavors of Linux, XP and Vista (no Mac since they won’t let me virtualize it), mostly to test various projects that I work on under different systems.  One of the hallmarks of Windows tends to be scary installs.  I was pleasantly surprised to see that the install took approximately 18 minutes on this netbook – all hardware was recognized, drivers loaded.  Since then, I’ve loaded Google Chrome, Office 2007, Live Writer (which is what I’m using now) and MarcEdit.  Overall, I’ve been pretty impressed with the way the system handles.  At this point, I have Chrome open, streaming a video from YouTube (bruises from chairlift), 5 other tabs, IE with 3 tabs (one my Exchange mail), 3 Word documents, an Excel document (~4 MB in size) and MarcEdit and still have ~200 MB of RAM free and not a hint of sluggishness.  This is something I’ll track as I continue working on this little box.