How demoralizing

By reeset / On / In Cycling

The wind this morning was downright brutal.  Not only did it take me longer than normal to get into work — but the gusts were strong enough to nearly knock me off my bike.  I spent the better half of my ride out of my seat and leaning against the wind to keep my balance.

Now if it will just blow that hard tonight, when it will be at my back.


MARBI Proposal 2006-04

By reeset / On / In Digital Libraries, MarcEdit, Programming

So the following was sent out on the MARC Unicode list:

Proposal 2006-04, “Technique for conversion of Unicode to MARC-8,” was
approved at the MARBI meeting on January 21 with the following additions
and changes.
1. It was agreed that a proposal for lossless representation of
unmappable characters using numeric character references will be brought
to the next meeting.  Both the lossy and lossless methods will have
official status, and systems can choose to implement either or both, as
their needs and the needs of their users dictate.

2. Option 1 of the present proposal, which specifies the vertical bar as
the replacement character, was chosen.

3. It was agreed that an explicit list of characters to be decomposed
prior to conversion was required.  RLG and OCLC will assist in the
creation of this list, which will be made available at the MARC 21 web
site.

Gary L. Smith
Software Architect
Product Architecture and Development

So the good news, in my opinion, is that both a lossy and a lossless conversion will be supported — which I think is great…partly because MarcEdit 5.0 already supports the approved format.  In the next couple of days, I’ll also add support for the lossy conversion, where the pipe is utilized as a fill character.
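To make the distinction concrete, here’s a rough sketch of the two behaviors (a toy illustration with a stand-in mapping table — not MarcEdit’s actual code, and the real MARC-8 character tables are far larger):

```python
# Toy sketch of lossy vs. lossless Unicode -> MARC-8 conversion.
# MARC8_MAPPABLE is a hypothetical stand-in for the real mapping tables.
MARC8_MAPPABLE = set()  # pretend only ASCII maps cleanly in this sketch

def to_marc8(text, lossless=True):
    out = []
    for ch in text:
        if ord(ch) in MARC8_MAPPABLE or ord(ch) < 128:
            out.append(ch)
        elif lossless:
            # Numeric character reference, e.g. U+0101 -> &#x0101;
            out.append("&#x%04X;" % ord(ch))
        else:
            # Lossy option: vertical bar as the replacement character
            out.append("|")
    return "".join(out)

print(to_marc8("māori", lossless=True))   # m&#x0101;ori
print(to_marc8("māori", lossless=False))  # m|ori
```

The NCR output can be mechanically converted back to the original Unicode later; once a pipe goes in, the original character is gone for good.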


2006 Midwinter Trip Report

By reeset / On / In Digital Libraries, Uncategorized

I have to write a trip report for my institution — so I thought I’d post it on my blog.  Enjoy.

ALA Midwinter 2006 Trip Report
San Antonio, TX
Terry Reese


Ah, winter in San Antonio…certainly it must be warmer and drier than Oregon (at least, that’s the impression that I’ve cultivated).  Well, it was drier (on most days) and warmer (on most days), but I certainly wish I had remembered to pack my umbrella and maybe a jacket.  Actually, this was an odd trip for me.  I rarely feel out of sorts when I travel, but this time, I really just had a difficult time getting it together.  I think that it started with my airline tickets.  Going through Teels, I ended up with paper tickets.  Paper tickets!  In the 9 years I’ve flown on airplanes, I’ve never had a paper ticket.  I’ve been paranoid for months that I was going to lose these tickets, so when I took off on Friday, I made sure I had those tickets…just not much of anything else. 🙂

Friday, January 20, 2006:
Whew!  For those that read my blog, some of this will be old news, but for those that don’t, it was an interesting trip to San Antonio.  It all started first thing Friday morning when I drove up to Portland to catch my plane.  I’d left early to give myself about a 2-2 ½ hour window at the airport.  However, because of an accident on I-205, I ended up parking on the interstate and got to the airport just about when my plane was scheduled to depart.  I thought for sure I would have to go onto standby (I think that’s how it works), but fortunately for me, the plane was late too, so I made it. 
Well, after the above-mentioned excitement getting to the airport, the flights and arrival in San Antonio were somewhat anticlimactic.  I made a connection in Denver (and was annoyed by the lack of wi-fi…you’d think that by now, airports would do a better job accommodating passengers) and then to San Antonio, where I met up with Joe (Toth).  He picked me up at the airport and we went to this place called Joe Adall’s diner.  The place boasts that they make the best chicken in San Antonio.  A big boast, but for a diner, it was packed with a 40-minute wait.  Joe and I skipped the wait though by eating at the counter.  Great food.  Both Joe and I ate way too much.  After dinner, we drove back to the airport, picked up Kyle and Shirley, and took them to their hotel.  Since it was late, Joe took me back to my place and I started to pull my stuff out to prepare for my talk the next day…and at that point, it hit me.  My USB drives, my prepared committee notes, my slides…they were all at home.  I quickly called Alyce to see what had happened.  Well, per my habit, I stack everything that I want to remember onto the living room coffee table.  On this trip, I had two stacks.  One stack with my airline tickets and ALA badge, one pile with everything else.  Since I was able to get on the plane, it’s probably pretty easy to figure out which pile my wife found still sitting neatly on the coffee table.  Fortunately, I had a copy of my presentation online and on my laptop, so I could fake my way through my talk…but the other information had to be recreated, which made for a long first night in San Antonio.

Saturday, January 21, 2006:
CONTENTdm Success stories
I was speaking at this session and, from what I’ve been told, folks were, in general, very impressed with our CONTENTdm collections and our use of CONTENTdm.  It was actually an interesting session to give.  I’ve gotten pretty used to giving talks to individuals with a fairly high-level understanding of technical issues.  In this case, it was more folks at the department head, AUL, and Library Director level, so this audience was more interested (in my opinion) in a little more flash than substance (of course, had Karyle been there, she would have been the exception.  Right, Karyle…wink, wink.)  My presentation had two parts.  First I discussed why we chose CONTENTdm (and I was honest in saying that we viewed it then, and to some degree view it now, as an interim solution as this area of digitization continues to shake out), our first project, and how our workflow has developed over the years.  During this first part, most in the audience didn’t seem too interested (of course, this was the part of substance).  Then we moved on to the flash, and I discussed some of the experimentation that I’ve been doing at OSU, like the EAD interface and the Flickr-like interface that I’ve been designing for the folks in Archives.  Folks were particularly interested in the EAD interface…lots of questions and requests for my business card, since folks, I think, will want to take a crack at getting this set up at their own institutions.  As the conference wore on, I had OCLC folks grabbing me in the Exhibition area with questions regarding my EAD implementation, because it was all the folks from my session were talking about.

After my session, I spent quite a bit of time down in the vendor booths.  I spent time chatting with OCLC about their new SOAP interfaces, where this is moving, and the potential for building tie-ins to services (secretly, I’d like to integrate OCLC services directly into MarcEdit so I can stop using OCLC’s cataloging software); I spent time chatting with ExLibris to see where they are going with DigiTool and SFX; and I even made time to talk with III (though talking to them is so exhausting).  On a side note, one of the things that I came away with from this meeting is the need to sit down and evaluate why we are using III.  Not with the intention to move away from them (though I think if we took an honest look at some of the services we purchase from III, we may find that they are unnecessary or not meeting our needs), but with the understanding that we spend a lot of money on III and it’s time to look at what they do for us and what they don’t do for us.  The University of Washington recently finished such an evaluation, and while their evaluation recommended remaining on III (as expected), they have been able to compile a very good talking-points list for III and the enhancement process.  I think, given our own desire to marginalize…no, wait, that’s not the right word, to homogenize the use of the ILS system (partly as it relates to our other resources) so that it simply becomes one of many resources within our libraries’ technical infrastructure, that we really need to evaluate how III is helping and hindering these efforts.

Ok, getting back on track…err, wait, one more thing that I’ve found amazing is how many vendors are looking to position their digital content management software as IR solutions.  ExLibris is positioning DigiTool as an IR, III has their IR solution, CONTENTdm wants to be an IR…each vendor I talked to seemed to believe that their IR solution will be the “next big thing” for their revenue stream.  I don’t quite agree, but I think that Dspace potentially could start losing users if they don’t eventually start moving their software forward.  A lot of tech folks I talked with seemed to believe that Dspace has plateau’d and started to stagnate.  In fact, this is something that the vendors have privately told me that they see and are counting on.  Dspace can only be as successful as its community.  At MIT, Dspace will be successful because they are committed to its development, at least for local use.  However, within the larger library community, Dspace has been successful because it is a free, out-of-the-box solution.  Very few libraries have the expertise to support a true R&D effort on the software.  Even at OSU, where we have a developer dedicated to Dspace development, R&D is still limited to a very specific type of customization of the Dspace code.  What Dspace needs, in my personal opinion, is a much more modular structure.  In fact, I actually think it needs to be taken out of Java and moved into PHP/Python or even ASPX (since using Mono, you can run ASPX applications on any platform that can currently run Java).  But that’s just my opinion.  Getting back on topic, given our own reliance on Dspace, I hope that the vendors’ evaluation of the current state of development is not the case.
After chatting with vendors, I went out with Kyle and Joe to a place called Tom’s Ribs.  You knew it was high class when looking at the neon sign of the pig with a knife and a fork, licking his lips at the cars driving by.  But the food was pretty good…but large.  My favorite thing on the menu was the family feast.  That would be 2 full racks of baby back ribs, mashed potatoes, a vegetable, and a whole chicken fried chicken.  A WHOLE chicken!  Oh, and for those that aren’t aware (don’t worry, I had to ask too), a chicken fried chicken is a chicken that is fried once in light batter and then again in heavy batter.  Kyle tells me that it’s really good, but it would probably be a food choice to avoid if you are looking for something healthy.  Apparently, the double-frying process helps to make this a 2,000-2,500 calorie meal with something like 300-400 grams of fat.  Ugh.

Sunday, January 22, 2006:
I ended up staying up too late Saturday, so I slept in a bit on Sunday and went to the Alamo.  It was pretty cool…I never realized how little of the Alamo story we are told in school.  The story of the Alamo is really a very rich story with a lot of background.  Interestingly, we really have nothing like it on the west coast.  It was also neat seeing how many people from different states fought at the Alamo.  Only 11 of the 200 or so individuals were Texans…all the rest were native Texan Mexicans and volunteers from the United States (even though the United States refused to formally help the individuals in Texas because they viewed the issue as an internal issue for the Mexican government).
In the late afternoon I attended the ALCTS Publication committee.  From the title, I think that you can guess what we talked about…ALCTS publication stuff.  Liaisons from the different sections of ALCTS presented their reports, and some business was discussed.
Late in the evening, everyone went to the III dessert banquet.  It was the first time I’d gone to something like this…I had a good time.  Cyril crashed the party, so I got to chat with him and Lori (from the UO), as well as seeing a whole lot of folks from OSU.  This was where I was also introduced to some of the members of the IUG user group leadership.  Apparently, they are looking for a second webmaster to help Kyle and are looking for someone with a strong PHP and MySQL background, so I chatted with them a bit about the role and what it entailed.  Looks like a bit of fun, and given that I use the IUG website, I definitely have a few ideas of some things that could be improved.  Of course, that made Kyle happy.  It would be like Darth Vader joining the Emperor…infer from that what you will. 🙂

Monday, January 23, 2006:
In the morning, I went to the LITA open house and then chatted with the TER editor for next year.  I’ve been given the opportunity to be on the editorial board for the next two years, and it definitely sounds interesting.  It looks like some changes are coming with this publication, and it definitely would be exciting to be a part of that.

Tuesday, January 24, 2006:
Ah, a travel day.  I got to the airport early, really early this time.  My flight left at 4:30; I got here at 12:30.  This, of course, has given me a lot of time to work on this trip report.  On a side note, when I came up to the counter, there was only one person there that actually knew how to handle paper tickets.  And I’m not sure if it’s because of my well-traveled appearance (I’ve definitely gotten a bit scruffy), but I got special attention in the security line as well (yeah).  But it was all good, and with a little luck, all my flights will be on time and I’ll be back at home, tackling my kids around 9 this evening.  Of course, San Antonio doesn’t have free wireless, so I guess if I send this report out tonight before I get home, it will have to be from Salt Lake City.  Oh, and I love flying into Salt Lake.  I wish it was going to be sunny when we land – the mountains this time of year are just gorgeous.  Of course, that’s if the weather cooperates…let’s hope it cooperates.

CONTENTdm and EAD records

By reeset / On / In Digital Libraries

For those of you out there that use CONTENTdm — OSU has been starting to experiment with putting EAD records into CONTENTdm and utilizing the software as our temporary (maybe long-term temporary) solution to making EAD records available electronically to our patrons.  As CONTENTdm 4.0 users know, this version offers the ability to index many of the access point fields in the EAD finding aid.  This allows you to import an EAD record into CONTENTdm, but when you see how it’s rendered…well, it’s sad.  So, I’ve been having some fun hacking the NWDA’s EAD stylesheet to make it compatible with CONTENTdm, and it really wasn’t all that difficult.  So, I thought I’d let folks have a peek at what we are doing.  I’ve placed one finding aid in a sample collection that utilizes the generated stylesheet.  It can be found at: Guide to Alice Kidder Evans Photographic Collection.  The transformation process has been set up so that you can either push the processing to your patrons’ browsers (using the built-in XML objects found in all current-generation browsers) or have it handled server side using SAXON (or whatever XSLT processor you might have).  It was surprisingly simple to make this work — and once I get this finished, I’ll make sure I post the docs to the CONTENTdm user area (and this blog) so that folks can try this out at their own sites as well.
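For anyone curious what the browser-side option looks like, the trick is just attaching an xml-stylesheet processing instruction to the EAD record so the browser’s built-in XSLT engine renders it.  Here’s a small sketch of that step (the stylesheet file name is a placeholder, not our actual path):

```python
# Sketch: attach an xml-stylesheet processing instruction to an EAD record
# so the patron's browser applies the XSLT itself. "ead2html.xsl" is a
# hypothetical file name used purely for illustration.

def add_stylesheet_pi(ead_xml, xslt_href="ead2html.xsl"):
    pi = '<?xml-stylesheet type="text/xsl" href="%s"?>\n' % xslt_href
    # Insert the PI after the XML declaration if present, else at the top.
    if ead_xml.startswith("<?xml"):
        decl_end = ead_xml.index("?>") + 2
        return ead_xml[:decl_end] + "\n" + pi + ead_xml[decl_end:]
    return pi + ead_xml

print(add_stylesheet_pi('<?xml version="1.0"?><ead/>'))
```

The server-side route does the same transform with SAXON (or any XSLT processor) before the page ever reaches the browser, which is friendlier to older clients.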


What happened to the wireless revolution

By reeset / On / In Uncategorized

I’m a wireless junkie, so I’ve been waiting for the so-called wireless revolution…always on, always there.  Well, someone must have forgotten to tell most airports.  Generally, I plan my travel arrangements through airports that offer free wireless to passengers, simply because I like being able to pass the time doing a little work while I wait.  This is one of the reasons I absolutely LOVE the Portland, Ore. airport.  They’ve had free wireless for a number of years — which definitely makes me want to get to the airport a little bit early.  But then once I leave Portland, reality sets in, as it seems like very few airports actually offer their customers free wi-fi.  Denver, for example…I’m sitting here waiting for a connecting flight and what do I find…wireless that costs $9.99 for a 24-hour period…great, but I’m only in the airport for about 40 minutes, so thanks, but no thanks.

$25 for Wi-Fi at ALA in the convention center…come on, ALA, let’s start making sure that Wi-Fi is available at conferences.


Opening the floodgates

By reeset / On / In Family

It was only a matter of time…my 16-month-old, Nathan, has taken his time hitting a number of milestones, partly because we have let him. Well, since November, I guess, my wife and I have pretty much been insisting that he start to try and talk to us — either using hand signs or through chatter. Well, it’s worked…he’s learning new words and phrases almost every day and at this point just won’t be quiet. Over the past month, he’s learned please, more, all done, it’s a truck, it’s a car, dog, ruff, cat, meow, cup, up, special (for his special blanket), etc.
I think we’ve created a little, chatty monster. 🙂


Going to Midwinter….

By reeset / On / In Digital Libraries, Family

What a drive into the Portland Airport this morning.  Geez, this morning we got something like 1/4 inch of rain, so the highways heading out of Independence were flooded all the way to I-5 (not impassable, but lots of high water).  Then I got to Portland and, because of 2 accidents, was only able to drive 9 miles in about 2 hours.  I finally got to the airport about 15 minutes before my flight was scheduled to depart.  Fortunately, they were running about a 1/2 hour late too.


MarcEdit 5.0 minor update

By reeset / On / In MarcEdit

I posted a new update to MarcEdit 5.0 last night relating to how the application handles invalid MARC records. Generally, most MARC tools just spit the records out or stop processing at the offending record. Well, in MarcEdit 4.6 there was a MARC “healer” function that would essentially attempt to evaluate how the record was invalid and then attempt to build data around it. Since I’ve made the default processing engine in MarcEdit 5.0 very strict (much stricter than 4.6), I also wanted to include this healing feature. Last night, I uploaded a new version that included some additional code to handle invalid records. So what does the program handle correctly?

1) Records where the leader and record length don’t match.
2) Tabs, new lines, invalid characters in the MARC record.
3) Record sets translated in unblocked format — i.e., separated by new lines, tabs (believe it or not, I’ve seen this), spaces, etc.

I’m working on adding code to the program that will also handle records where the directory has simply been corrupted beyond repair or is simply missing. I’ve nearly got it finished…so stay tuned.
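As a rough illustration of the first item in the list above — a leader whose stated record length doesn’t match the record — healing amounts to recomputing the length from the actual bytes and writing it back into leader positions 0-4.  A minimal sketch (my own toy code, not MarcEdit’s implementation):

```python
# Sketch: "heal" a MARC record whose leader record length (bytes 0-4,
# a zero-padded 5-digit number) disagrees with the record's actual size.
# Toy illustration only -- real healing also revalidates the directory.

def heal_leader_length(record_bytes):
    actual = len(record_bytes)
    stated = record_bytes[:5]
    if stated != b"%05d" % actual:
        # Rewrite the length field to match reality.
        record_bytes = b"%05d" % actual + record_bytes[5:]
    return record_bytes

rec = b"00000" + b"x" * 95  # a 100-byte record with a zeroed-out length
print(heal_leader_length(rec)[:5])  # b'00100'
```

The same recompute-and-rewrite idea applies to the base address of data (leader positions 12-16) once stray tabs and newlines have been stripped out.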

As always, the download is found at: MarcEdit50_Setup.exe


Gary Smith’s (of OCLC) response to MARBI Proposal 2006-04, Technique for conversion of Unicode to MARC-8

By reeset / On / In Digital Libraries, MarcEdit, Programming

I was glad to see Gary Smith from OCLC finally post OCLC’s official response regarding the current MARBI proposal on techniques for conversion of Unicode to MARC-8.  For those that haven’t seen the proposal, the general gist of the document is that the current recommendation is to have non-transformable characters dropped, replaced by a fill character.  Personally, I was for one of the other options in the report: the generation of NCRs (Numeric Character References), like you see in XML, so that translation between Unicode and MARC-8, and MARC-8 to Unicode, would be a lossless process — a quality that would be lost if a fill character were utilized.  However, Gary sums up a very good reason to give this further thought in his post…he writes that:

 OCLC does not support this proposal.  Our recent experience in dealing
with Unicode data has shown that we require a lossless representation
for our own operations.  We expect that many of our users will have
similar requirements.  The use of a replacement character constitutes a
permanent loss of information.  If we produce and distribute records
containing replacement characters, they will inevitably come back to us
— and to every other system that takes in data from another system —
in a degraded and unrepairable form.

And he’s right…there are a number of toy ILS systems that will continue to require and share data in legacy formats.  Heck, we use Innovative Interfaces here at OSU and our system hasn’t been converted to Unicode (though we could if we asked Innovative to do the conversion — however, there are consequences to this decision that we haven’t worked through yet), so it’s not simply a toy ILS problem at this point.  The fact that OCLC or any system would be ingesting these records at some point would be problematic.  Currently, MarcEdit generates NCRs for unmappable characters when moving between UTF-8 and MARC-8; however, I’ll eventually support whatever MARBI blesses as the desired technique — so I’ll be keeping an eye on this and attending the discussion at Midwinter.
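To see why NCRs make the round trip lossless, consider decoding them back: every &#xXXXX; reference maps to exactly one Unicode character, while a fill character could have been any of thousands.  A quick sketch (toy code, not MarcEdit’s):

```python
# Sketch: restoring Unicode from NCR-encoded MARC-8 data. Each &#xXXXX;
# reference decodes to exactly one character, so no information is lost.
import re

def ncr_decode(text):
    return re.sub(r"&#x([0-9A-Fa-f]+);",
                  lambda m: chr(int(m.group(1), 16)),
                  text)

encoded = "m&#x0101;ori"          # produced by a lossless conversion pass
print(ncr_decode(encoded))        # māori -- original data recovered
print(ncr_decode("m|ori"))        # m|ori -- the pipe can't be reversed
```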