As I’d posted earlier, Kenny (my oldest) has really been looking forward to this launch. He’s been watching all the videos, soaking in the NASA channel and pretty much learning more about the shuttle program than I know. Anyway, I had to ride my bike in this morning, so I didn’t get to stay home with Kenny during the launch…so I asked my wife what he thought. Here was the reply:
Kenny did love it. He’s been watching the replays on the NASA channel.
They’ve been showing all the different camera angles so we’ve been recording that too.
Kenny says, “Launching, blasting and flying into space!”
He joined in on the countdown for the 4-1 part. He didn’t realize at first they were really counting down.
He’s just loving the replays. He thought at first that they were launching more shuttles. And when he looked outside at the moon there was a plane flying by – a big one leaving a trail and he thought for sure it was the shuttle.
He just said, “Mom – they said the ‘Kenny’ space center!”
It sounds like he enjoyed the launch and I’m sure that over the next 12 days, we will be spending a lot of time on the NASA channel seeing what the astronauts are up to at the space station.
I’ve confirmed a bug in the MarcEngine that translates MARC data to MARCXML. The issue occurs when a subfield code is incorrectly added to the end of a field. I’ve corrected this issue and reposted the build to: marcedit50_2005_07_25.zip
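For illustration, here’s a minimal sketch in Python of the kind of field-to-MARCXML translation involved. The function name, the guard against a dangling trailing delimiter, and the structure are my own stand-ins for the sake of the example, not MarcEdit’s actual MarcEngine code:

```python
# Illustrative sketch: translating one MARC data field into MARCXML.
# Delimiter bytes are from the MARC spec; everything else is hypothetical.

SUBFIELD_DELIM = "\x1f"  # MARC subfield delimiter
FIELD_TERM = "\x1e"      # MARC field terminator

def field_to_marcxml(tag, ind1, ind2, body):
    """Translate one MARC data field into a MARCXML <datafield> string."""
    body = body.rstrip(FIELD_TERM)
    out = ['<datafield tag="%s" ind1="%s" ind2="%s">' % (tag, ind1, ind2)]
    # Split on the subfield delimiter; the chunk before the first delimiter is empty.
    for chunk in body.split(SUBFIELD_DELIM):
        if not chunk:
            continue  # skip the empty lead chunk and any dangling delimiter at the end
        code, data = chunk[0], chunk[1:]
        out.append('  <subfield code="%s">%s</subfield>' % (code, data))
    out.append('</datafield>')
    return "\n".join(out)
```

The `if not chunk: continue` guard is the sort of check that prevents a stray delimiter at the end of a field from producing a spurious subfield in the output.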
Well, OSU has finally been able to get its OpenURL resolver added to Google Scholar. (click on the image for full size)
It’s been a long time coming. We originally tried working with Google on their pilot project before they had made the service more public…but we use Innovative Interfaces’ WebBridge, and there had been some incompatibilities between the two products. III has since worked to iron them out so that we could participate and, as you can see from the picture, there we are…
However, I wonder quite often if this is necessarily a good thing. I don’t mind leveraging Google; what concerns me most is three things:
1) As a community, I wonder if we are giving too much away to Google for free. Google has done and is doing some very impressive things, but in large part many of these impressive services will be built on the backs of the library community, either in terms of infrastructure (Google Scholar) or labor (Google Print). Surely we will see some benefit, but I wonder if the benefit our communities receive will be equal to the benefit that Google is getting from libraries.
2) Giving Google the library stamp of approval. Libraries and library vendors have a number of projects that are specifically developed to work with Google and to allow Google to work with their services. In some sense, the library community is pushing its own users to Google and then wondering why they use Google instead of library services…we’ve basically told users that Google’s index is authoritative (in so many words), or at least that’s the impression I think one would get when looking at how quickly the library community is falling over itself to hand over services to Google.
3) Up until a few years ago, privacy advocates and others were very pro-Google. Within a few years, this has changed: Google is becoming the “evil empire”. How did that shift happen? It’s not a question that I can answer, but it is something that has given me some pause as of late.
Ahhhhh…..I’d introduced a bug in the last build that primarily affected European diacritics when moving between MARC-8 and UTF-8. The issue at hand was how to deal with dangling or incorrectly coded diacritics. In coding a solution, I accidentally removed some code that moved combining diacritics. In MARC-8, diacritic placement for most European diacritics is: [diacritic][character being modified]. When moved into UTF-8, the character sequence changes to: [character being modified][diacritic]. Well, in writing code to capture incomplete diacritics, I’d fouled up the logic that moves the characters (internally, I was clearing a tracking variable when I shouldn’t have). Anyway, that should be corrected. The updated download can be found at: marcedit50_2005_07_25.zip.
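The reordering step can be sketched like this. The `COMBINING` set and function name are illustrative stand-ins, not MarcEdit’s actual code (which works on MARC-8 bytes rather than already-mapped code points):

```python
# Sketch: MARC-8 puts combining diacritics *before* the base character;
# UTF-8 convention puts them *after*. COMBINING is a tiny stand-in set;
# real MARC-8 reserves a byte range (0xE0-0xFE) for combining diacritics.

COMBINING = {"\u0300", "\u0301", "\u0308"}  # grave, acute, umlaut (already mapped)

def reorder_diacritics(chars):
    """Move each run of combining marks from before its base char to after it."""
    out, pending = [], []
    for ch in chars:
        if ch in COMBINING:
            pending.append(ch)       # hold the diacritic until we see its base
        else:
            out.append(ch)
            out.extend(pending)      # base first, then its diacritics (UTF-8 order)
            pending = []             # clear the tracking list only AFTER flushing it
    out.extend(pending)              # dangling diacritics: emit rather than drop
    return "".join(out)
```

The comment on the `pending = []` line marks the spot where this kind of logic goes wrong if the tracking variable is cleared at the wrong time, as described above.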
With that said, there are two charactersets that need to be tested to make sure they are working as documented. Both the Extended Arabic and Greek charactersets have combining characters. I’ve added logic to the UTF-8 to MARC-8 converter to accommodate these combinations, but I need to find some sample records to make sure this is working like I intend.
Z39.50 — just about finished. I just need to finish converting my Z39.50 database into XML and add support for multiple connection formats in the metadata.
What a kick. It’s been a long time since my oldest son (4) started to learn how to talk. Kenny, my oldest, was an early talker, which we figured meant that he had a lot of different stuff to say. Well, Nathan (10 months) has finally taken an interest in trying to express himself. He’s been able to say momma and dadda for quite some time, but very recently he’s added babba (bottle), bweeze (please) and, as of yesterday, bye bye. The last one is hilarious. He’ll sit on the floor, watching his hand as he waves to himself saying, bye bye. I got a great laugh. But I think this is just the calm before the storm. He’s very animated and right now seems very determined to let us know what he’s thinking. I’m sure it’s only a matter of time before Kenny and Nathan will be able to talk to each other and start plotting against us :).
On an unrelated note, I know that NASA has at least one new fan — Kenny has been playing space ship and waiting patiently for the Discovery space launch. He was almost in tears the last time that they scrubbed the mission. Hopefully this time he’ll get to watch the shuttle take off.
Oregon State University has recently started what I’m hoping will be a very successful program: a Digitization on Demand service. The idea is to allow subject selectors to choose important materials in our collection for digitization; we then make them available via a collection in our IR. Right now we are still working out kinks in the workflow, but if the first three weeks are any indication, this will be a popular program.
Well, after feeling utterly embarrassed by my weak climb up Mary’s Peak, I’ve decided to start a little training program to get myself into climbing shape. While I don’t have anything as steep as Mary’s Peak to train on regularly, I did find what amounts to a “mini” Mary’s Peak on SoapBox road, which is on my ride to work. While I can’t incorporate this into my normal daily rides, I can make this section part of my weekend road ride. I’m hoping that doing this a few times will get my legs ready for a rim ride around Crater Lake.
When I first started putting this version of MarcEdit together, I had a number of goals. Some of the most prominent were:
1) Improve UTF-8 support to remove adoption barriers for libraries
2) Provide UTF-8 to MARC-8 support to allow for seamless movement of data between XML and MARC sources.
The first was easy; the second has been a pain in the backside. There are a number of issues with providing a round trip from UTF-8 back to MARC-8, some of which are being discussed on LC’s Unicode MARC listserv. However, for me, the two biggest issues deal with how characters can be represented in UTF-8 and how characters are mapped into UTF-8 when utilizing the MARC-8 to UTF-8 conversion specs. The bane of my existence, at least recently, has been the Latin-1 characterset. When moving data from MARC-8 to Unicode, the LC spec recommends that these characters be created as composite characters. So if you have a small e with a grave, the Unicode equivalent would be: e+[U+0300]. The nice thing about this syntax is that it’s easy to bring back into MARC-8: the modified character is still separate from the diacritic, and the diacritic can then be mapped back to its MARC equivalent. However, when typing Latin-1 Unicode characters, an e with a grave is represented by the single character 0xE8. This is much more difficult to break down, since that one character must be split into the two corresponding bytes in correct MARC-8 encoding (which isn’t the same as plain ANSI). I’ll admit it: this has caused me some problems. If you’ve been tracking the developments over the past week, I’ve posted a number of refreshed builds specifically intended to fix various issues relating to the crosswalking of UTF-8 CJK and Latin-1 charactersets. Well, after a lot of work, I think that I’ve finally got it nailed down (knock on wood). I’ve tested this build against 6 different types of files.
1) Plain ASCII, no diacritics
3) Mix of Latin-1 and Cyrillic
4) XML encoded UTF-8 with Latin-1 and CJK
5) CJK, Greek and Latin1
6) Arabic and Latin1
In each test, it appears that the data is being handled correctly. There are only a few characters left in the Latin-1 code range that need to have support added: the superscripts 0-3. For some reason, the Unicode group has left these in the bottom 256 characters of the spec. The difficulty is that in MARC-8, these are special characters that require special handling. I’ll work on adding these to the next build, which, barring the discovery of a bug in the MARCEngine component, will occur on Sunday or Monday and, I’m hoping, will include a first look at the new Z39.50 client.
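Two of the round-trip cases above can be sketched in Python. Splitting a precomposed Latin-1 character back into base character plus combining mark is exactly what Unicode NFD normalization does (MarcEdit’s internal approach may differ); the superscript handling shows my reading of the MARC-8 escape sequences, and the exact bytes are an assumption rather than verified MarcEdit output:

```python
import unicodedata

# Case 1: a precomposed e-with-grave (0xE8) vs. the decomposed form LC recommends.
composed = "\u00e8"                          # LATIN SMALL LETTER E WITH GRAVE
decomposed = unicodedata.normalize("NFD", composed)
assert decomposed == "e\u0300"               # base char + COMBINING GRAVE ACCENT

# Case 2: superscript digits. In MARC-8 these live in a separate graphic set,
# selected with ESC 'p' and exited with ESC 's', with the digit at its plain
# ASCII position. Note that only superscripts 1-3 are actually in the Latin-1
# range; superscript zero is U+2070.
ESC = "\x1b"
SUPERSCRIPTS = {"\u2070": "0", "\u00b9": "1", "\u00b2": "2", "\u00b3": "3"}

def superscript_to_marc8(ch):
    """Wrap a Unicode superscript digit in MARC-8 escape sequences."""
    if ch in SUPERSCRIPTS:
        return ESC + "p" + SUPERSCRIPTS[ch] + ESC + "s"
    return ch
```

Once the precomposed character is normalized to NFD, the combining mark can be mapped back to its MARC-8 diacritic byte the same way a natively decomposed character would be.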