I’ll write more about Code4Lib, but one of my favorite parts of Code4Lib was seeing David Walker’s session providing a first look at OCLC’s grid services. Very promising. If OCLC can deliver a robust API for both member and non-member libraries, and provide the infrastructure to make this happen, then maybe it will give like-minded libraries the opportunity to build ad hoc consortia around these services. Of course, this involves having a very robust infrastructure (i.e., data centers) and an API flexible and open enough to give non-members compelling functionality so they aren’t left behind, while still providing members access to all the suites of data that they currently license through OCLC.
Fortunately, I was invited to be in the next group of librarians to get their hands on these services. So long as I am able, I’ll periodically post about my experiences and possibly screenshots of things that we try out.
On Tuesday, I mentioned that I’d been spending my evening working through approximately 6 GB of data, and promised an explanation later. Well, it’s later, so I probably should explain.
Some background… Brewster Kahle, the founder of the Internet Archive (among other things), was the opening keynote speaker at Code4Lib. We’d had some trouble thinking of a good speaker’s gift for him, but in early Feb., Jeremy and I had the opportunity to visit the Internet Archive and to speak with Brewster and the folks from the Open Library. While there, he told a great story, and the idea kind of got rolling from there. Basically, Brewster talked about a visit he’d made to Japan, during which he presented as a gift the entire Japanese web domain from 1996-2001ish. Which, of course, gave me an idea: I wanted to do the same with Summit, and give Brewster a snapshot of the 12 million MARC records found within Summit.
So, from the idea, I sent a message out to a number of libraries in the Summit consortium making a request for records. There were questions, some excitement, and in the end a number of libraries contributed close to 5.5 million records (~3 million unique). Specifically, OHSU (and their members), Lewis and Clark, Portland Community College, Washington State University, and Oregon State University provided me with copies of their catalogs. I did some data processing (translating the data to Unicode, removing some records that people didn’t want contributed, etc.) and put the records on a jump drive. We presented the records to Brewster after his talk (a great talk, by the way), and I think it was something that he didn’t expect and genuinely appreciated.
But the story doesn’t stop there. A number of libraries in Summit that were not able to provide their records before Tuesday still want to contribute their data. In fact, tonight I’m processing close to 1.2 GB of data from Western Washington University — removing some requested vendor records and converting the data to Unicode — and will hand-deliver these records to the folks at the Open Library on Feb. 29th at the Open Library developers meeting. Once these records are contributed, it will bring the total number of institutions that have decided to share records to 8, with more coming in the very near future. I’m still hoping that at some point we will be able to contribute the entire Summit database (which would make us perhaps the single largest contributor of bibliographic data to the project), but for now, I’m just grateful to be in a consortium with members willing to be experimental and a little bit ahead of the curve. 🙂 Way to go, Summit!
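For the curious, the container-level mechanics of this kind of batch processing are fairly simple. Below is a minimal Python sketch — not the actual scripts I used, and the vendor tag “949” is purely hypothetical — that walks a raw MARC21/ISO 2709 byte stream, checks each record’s leader for the Unicode flag, and drops records carrying a given field tag. Note that real MARC-8-to-Unicode conversion requires full character mapping tables (a library such as pymarc handles that); this only shows the record-level plumbing.

```python
def iter_marc_records(data: bytes):
    """Split a raw ISO 2709 / MARC21 byte stream into individual records.

    Leader positions 00-04 hold the total record length as 5 ASCII digits.
    """
    pos = 0
    while pos < len(data):
        length = int(data[pos:pos + 5])
        yield data[pos:pos + length]
        pos += length


def record_tags(record: bytes):
    """Return the field tags listed in the record's directory.

    Leader positions 12-16 hold the base address of data; the directory sits
    between the 24-byte leader and that address, in 12-byte entries, ending
    with a one-byte field terminator (0x1E).
    """
    base = int(record[12:17])
    directory = record[24:base - 1]
    return [directory[i:i + 3].decode("ascii")
            for i in range(0, len(directory), 12)]


def filter_records(data: bytes, drop_tag: str):
    """Keep records that do not carry drop_tag; note each record's encoding.

    Leader position 09 is the character coding scheme: 'a' means Unicode,
    a blank means MARC-8 (which would still need real transcoding).
    """
    kept = []
    for rec in iter_marc_records(data):
        is_unicode = rec[9:10] == b"a"
        if drop_tag not in record_tags(rec):
            kept.append((rec, is_unicode))
    return kept
```

In practice you’d stream from disk rather than hold 1.2 GB in memory, but the leader/directory arithmetic is the same.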
And so it begins. Well, actually, it began yesterday. Yesterday Jeremy, Tami Herlocker, and I gave a LibraryFind preconference — kind of an installfest, though it didn’t turn out that way thanks to Parallels on my laptop. That’s OK — lots of good questions afterwards, and now we have a very good install document — but still a little disappointing.
Afterwards, I got to spend a good deal of time chatting with folks before retiring to my room to chew on 6 GB of data (I’ll post why later). Anyway, I think my favorite part of this get-together is catching up with folks that I haven’t seen in a while. I’ll periodically post during the day as the conference unfolds (at least, that’s the intention :))
This is one of those things that I wanted to get done a long time ago, but I’ve finally found the time. I’m working on updating MarcEdit to allow users to pull records directly into the MarcEditor from the Z39.50 client and then post them directly back to a Z39.50 server that supports record updates. The UI may change (I’m going to let a few people try it out first to give me a feel for how it works), but for now, here’s the UI.
Once the MarcEditor is opened, click CTRL+I or follow the menu to Tools/Z39.50/SRU Options/Z39.50 Import Record (see below):
Selecting this option will open the following window:
Initially, this tool will be tied to MarcEdit’s Z39.50/SRU client. You’ll be able to select any database that has been added to your local repository of databases. So, if you don’t see a database in the list, you’ll want to open the Z39.50 tool and select it from the master list.
At this point, it works like any other client. Run a search, then double-click an item in the results list. It will open in the editor. Make your edits, then press CTRL+U or select Record Upload from the Tools menu.
I’ll post a webcast when finalized, but hopefully I’ll get a chance to wrap this up fairly quickly.
For a Rails project, when I want to update vendor/rails, it appears that the easiest way to do this in svn is to delete the directory, refresh the gems, and then re-add the directory to svn. Does this sound right? Seems like it should be easier. This has been our process for about a year, and after going through it again (to update to the 2.0.2 framework), I’m just thinking that it’s got to be easier.
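For what it’s worth, one way around the delete/re-add dance is to stop versioning the framework source yourself and pin vendor/rails with an svn:externals property. A rough sketch follows — the repository URL and tag name here follow the conventions the Rails project used at the time, but double-check both before relying on this:

```shell
# One-time setup: remove the checked-in copy, then track the release tag
# as an external so svn fetches it for you on update.
svn rm vendor/rails
svn propset svn:externals \
    'rails http://svn.rubyonrails.org/rails/tags/rel_2-0-2' vendor
svn commit -m "track Rails 2.0.2 via svn:externals"
svn update    # pulls the tagged framework into vendor/rails
```

Upgrading later is then just editing the property to point at the new tag and running svn update; tools like Piston were built to manage exactly this pattern.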
Apparently, Dr. Adnan Qureshi, a stroke expert at the University of Minnesota, recently performed a study that found cat owners to be less likely to die of cardiovascular disease than dog owners. The Minneapolis Star Tribune ran the story (http://www.startribune.com/lifestyle/health/15858742.html — apparently this page was only available for a little while, but it seems to be carried here as well: http://www.sacbee.com/749/story/730732.html), which was picked up by papers around the country. Apparently, cats provide better stress relief than dogs — though I would bet that labs must have been excluded from the study. I get plenty of stress relief from my 80 lb lap dog, who is loyal, unconditionally trusting, drags me around the neighborhood on our “walks” — oh, and displays the judgment of an 80 lb toddler. And of course, anyone that has watched Cats and Dogs knows that the only thing keeping our feline enemies in their place is our furry friend, the dog. 🙂
Like a number of people, I found the following piece (http://chronicle.com/weekly/v54/i24/24a01101.htm) from the Chronicle of Higher Education on the Open Library fairly interesting — in part, because of the topics the author chose to highlight. I tend to categorize pieces such as this as fluff, in that one rarely gets any content of substance from them. However, in a short article about the Internet Archive’s Open Library initiative, I found it interesting that so much of the article centered around OCLC — or, should I say, the silence coming from OCLC as members seek to clarify OCLC’s position on the Open Library and its members’ potential participation in the project. Two things jump out:
“Librarians are not just uneasy having nonlibrarians edit catalogs; they are also afraid of offending OCLC.”
An exceptional understatement, though one that doesn’t extend just to the Open Library. As a general rule, I find that librarians are way too concerned with offending OCLC, with many feeling that should offense be taken, it could have long-running repercussions for their institution. Are these concerns valid? For OCLC, I think not. While I firmly believe that OCLC occupies the same vendor space as other entities like EBSCOhost, Elsevier, and Serials Solutions, I think that they are much more responsive to their member customers — due in part to the organization’s roots as a large cooperative. Of course, librarians and libraries have been conditioned to believe that consequences will follow if one rocks the boat or steps on a partner’s toes. And unfortunately (and much to my chagrin), I’ve had occasion myself to say or post opinions that have caused pushback from content/software providers currently serving Oregon State. Fortunately, my director doesn’t mind when the pot periodically gets stirred, but not everyone is as lucky. So, I can certainly understand where the nervousness is coming from.

At the same time, I think that OCLC is contributing to this sense of uncertainty. OCLC hasn’t been caught by surprise by the Open Library’s development work, and certainly hasn’t been surprised by the Open Library asking OCLC members to contribute data to the project. For close to a year, OCLC has had the opportunity to provide some form of guidance or position on the Open Library project. Instead, they have been silent. This leaves librarians and libraries to consult their local OCLC representatives, who have been given widely varying information regarding the legality of participating in this project. While I’ve yet to hear of anyone being told that a library could not participate in the project, participation has been quietly discouraged by OCLC’s deafening silence.
“But one OCLC official, speaking on the condition that he not be identified, said Open Library was a waste of time and resources, and predicted it would fail.”

Again, it’s interesting that in a piece like this, this comment would make its way into the article. Whether or not this reflects OCLC’s current position on this particular project, I think that a number of good things may come out of the Open Library project, even if indirectly. First, OCLC’s grid services. While likely not a direct result of the Open Library project, I’d guess that the current desire to accelerate their availability is a response to the growing number of projects currently looking to move into the space that OCLC has traditionally monopolized. Yes, let’s call it what it is: in this space, OCLC functions as a monopoly, because OCLC has essentially been allowed to rely on its position to squeeze out competing projects (RLG) and leverage its data to create services that would otherwise be impossible to create without the metadata that OCLC currently possesses. I think, to some degree, projects like the Open Library give OCLC pause in the sense that, at present, they see their bibliographic and holdings content, WorldCat, as their crown jewel. It represents a body of work that exists nowhere else in the world and gives them a potential advantage over any cloud-based service being developed within the library community. At the same time, as OCLC goes forward and libraries become more interested in building some of their own tools (either individually or as part of a consortium), I think that WorldCat, and the data beneath it, will actually become less important for OCLC; rather, it will be the services that they develop on top of it that will hold the most value. And I think that projects like the Open Library have accelerated this development. As Martha Stewart would say, it’s a good thing.
Secondly, I think that this quote is interesting in a larger sense, in terms of how it relates to OCLC as a whole. They are undergoing big changes — business changes, philosophical changes — and I think this quote represents that to some degree. As the piece notes, OCLC’s public face sees cooperation as a good thing, while privately, maybe that’s not the case. But honestly, I think that this is healthy. OCLC is hiring a lot of bright people (and has traditionally had a lot of bright people on staff), and what we see is that they are thinking about these issues and how they relate to the larger community (even beyond OCLC). Now, whether or not OCLC is particularly happy that these disagreements are being aired publicly (something that hasn’t traditionally happened) — well, that would be something to keep an eye on as well.
[update: Spell check fails me again, sorry Martha]
Finished adding functionality to support different DPI settings. This corrects the OAI harvester, the MARC tools, and some MarcEditor layout functionality.
Batch File processor — fixed a small error in which the program would save MARCXML=>MARC conversions as .xml files (rather than .mrc files)
Save/Save As — when harvesting (and using other options that require temp-file creation), Save/Save As was saving to the temp file. This has been corrected.
Compile Records — the MarcEditor can open MODS and MARCXML files directly, converting them to MARC for editing. When you save, it converts them back to XML files. However, if you run the compile function, it will now compile the data to MARC as well (as one would expect).
A couple of quick updates to the program. Added a new variable to the global variables passed when doing XSLT transforms (currently, these variables are destfile, sourcefile, and pdate). You access these as global parameters in your XSLT file. Two other changes: added Compile Individual Selected Record(s) back into the 5.x branch, and updated a few forms so that they work better when you are running Windows at 120 DPI (rather than the traditional 96 DPI).
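For anyone writing their own transforms, receiving those global parameters just means declaring top-level xsl:param elements with matching names. A hypothetical sketch (the parameter names destfile, sourcefile, and pdate are the ones listed above; the output element is made up for illustration):

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Top-level xsl:param elements receive values passed in by the
       calling application and are visible anywhere in the transform. -->
  <xsl:param name="destfile"/>
  <xsl:param name="sourcefile"/>
  <xsl:param name="pdate"/>

  <xsl:template match="/">
    <!-- e.g., stamp the processing date and file names into the output -->
    <processed on="{$pdate}" from="{$sourcefile}" to="{$destfile}"/>
  </xsl:template>

</xsl:stylesheet>
```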
I’ve been spending some time in Regina, Canada this weekend, and I can’t think of any time that I’ve been in such bitter cold. When my plane landed on Saturday, a little after noon, it was at the high of the day, around -29 F (but it felt like -49 F with the wind). By nightfall, it was registering around -37 F (-60 F with wind chill). This Sunday morning: -34 F (-49 F with wind chill). I spent about 20 minutes outside walking around and tromping in the snow, and when I came in, the bare spots on my face were literally “burned”. Wild. Thank goodness we don’t see this kind of weather in Oregon.