Well, apparently my info pumpkin was good enough for 2nd place in our pumpkin carving contest at work. They got a much cleaner picture than I originally did, so I’ll post it here. For those of you who don’t speak MARC, the pumpkin is a jack-o’-lantern with the MARC record for It’s the Great Pumpkin, Charlie Brown carved into it. Sad, I know, but hey, when you work with metadata all day, some of it rubs off I guess. 🙂
Apparently, search engines are loving Wikipedia. Nicholas Carr and others have taken notice recently that Wikipedia is quickly becoming the default source of information for a number of search engines. Well, now there is some informal research to confirm this trend. Carr writes today that:
Well, now we have such a sample, thanks to a student in Slovenia by the name of Jure Cuhalev. In a research project, Cuhalev gathered a random sample of about 1,000 of the 1.4 million topics covered by Wikipedia. He then ran the terms through the Google, Yahoo, and MSN search engines. He found that Wikipedia did in fact appear with remarkable consistency in the upper reaches of search results. On average, the online encyclopedia appeared in the top-ten search results 65% of the time – and 26% of the time it actually had two results in the top ten. (Cuhalev has posted a summary of his findings on his blog, and the full report can be downloaded here.)
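The tallying step in a study like Cuhalev’s is easy to picture in code: given the top-10 result domains for each sampled topic, count how often Wikipedia shows up at least once, and how often it shows up twice or more. This is just an illustrative sketch of that arithmetic (the sample data below is invented, not Cuhalev’s), not his actual methodology or code.

```python
def wikipedia_stats(results_by_term):
    """results_by_term maps a search term to its list of top-10 result domains.
    Returns (% of terms with at least one Wikipedia hit,
             % of terms with two or more Wikipedia hits)."""
    total = len(results_by_term)
    at_least_one = sum(
        1 for domains in results_by_term.values()
        if any("wikipedia.org" in d for d in domains)
    )
    two_or_more = sum(
        1 for domains in results_by_term.values()
        if sum("wikipedia.org" in d for d in domains) >= 2
    )
    return at_least_one / total * 100, two_or_more / total * 100

# Invented sample: three "topics" and their top result domains.
sample = {
    "great pumpkin": ["en.wikipedia.org", "imdb.com", "en.wikipedia.org"],
    "marc record": ["loc.gov", "en.wikipedia.org"],
    "aquabrowser": ["medialab.nl", "kcls.org"],
}
pct_one, pct_two = wikipedia_stats(sample)
```

Run over a real 1,000-topic sample, the two percentages are exactly the 65% and 26% figures Carr quotes.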
While I find this interesting, it’s not terribly surprising. What did get me thinking, however, was the following comment on Cuhalev’s weblog:
Interesting stuff! It’s nice to see our beloved encyclopedia in the results from a lot of searches, as it generally has nice concises data about the subject. Also, I won’t see Encyclopedia Britannica equal this feat any time soon
And you know what, it’s true, at least in part. It is very unlikely that we’ll see Encyclopedia Britannica showing up at the top of web search results. For academics, this will be a cause for some concern. While Wikipedia generally serves its purpose as a good general information source, the fact is that it isn’t a reviewed resource in the traditional sense. What’s more, within most academic circles (including the classroom), it won’t be treated as an authoritative source of information. Will this be problematic? Sure. We already know that most students use [insert popular search engine here] while doing their research, so if Wikipedia is showing up in the top 10 results, it’s very likely that students will start (or continue) to use it. And here’s the rub: for many, Wikipedia will become their default source of encyclopedic information, assigning to it a measure of authority that quite possibly it has yet to earn. How does that affect the academic community? Personally, I think it affects the professional community very little, since this group will tend to gravitate towards more traditional research publications. But the students that they teach? That could be very different. I’d be curious to know how most faculty on OSU’s campus currently treat citations pulled from Wikipedia.
Well, maybe not the Great Pumpkin, but a pumpkin all the same. 🙂 I spent most of Sunday carving pumpkins for my boys and one for me. Kenny’s and Nathan’s pumpkins turned out well. Kenny’s pumpkin was a picture of Goldie (our dog); Nathan’s pumpkin was of Thomas the Tank Engine. As you can see from the picture, they had a good time cutting, gutting and cleaning the pumpkins.
So as I mentioned, Kenny and Nathan’s pumpkins turned out great (picture below); mine, on the other hand, didn’t. I had a vision of creating an Info Pumpkin: a pumpkin with the MARC record of It’s the Great Pumpkin, Charlie Brown on it. I spent about 2 hours tracing out all the characters only to find out that putting 900 characters on a pumpkin really isn’t a simple process. This is what it looked like:
With three candles inside, you can actually make out a lot of the text, but not all of it. Too bad. So instead I ended up carving a hokey jack-o’-lantern on the back (I just couldn’t spend another few hours doing another pumpkin). So, here are our three pumpkins for the year:
Wow. I was reading MSNBC today and found an article from Newsweek on higher education fundraising campaigns. 4 billion dollar campaigns…wow. OSU is looking at doing a fundraising campaign, though our goals are much more modest; I believe the last stated goal was somewhere in the 400 million dollar range. Still lots of money, but somehow it doesn’t roll off the tongue like 4 billlliiiionnn. 🙂
Ah, I enjoyed the first frosty ride of the morning today. It was 37 degrees at my house and ~31-32 out on the highway (it’s actually interesting; I’ve found that the temperature can vary by almost 10 degrees once you get out into the farmland), making the farmland a very sparkly white. Fortunately, I grabbed some warmer clothes before heading out this morning, but it was still brisk.
I was following a thread today talking about some of the legal wranglings related to Google and their Google Books project. The exchange that made me laugh started when someone commented that Google had long since forgotten their ‘do no evil’ philosophy and had become pure evil. To which someone else replied that it was a tie between Microsoft and Google, and then asked: what would happen if they merged? The end of civilization as we know it? That put a smile on my face.
However, it did get me thinking: why do folks view Google in such a positive light? Or, better yet, how did Google convince libraries (large academic libraries) to essentially give away their content for virtually nothing? Well, I should qualify that: Google is spending a great deal of money digitizing library content, but the costs of digitization pale next to the total value of the collection itself, and to the value of the collection development decisions that went into building a library’s materials. In putting up some capital, Google is able to catch up, as a cultural archive, on 200+ years of purchasing and collection management decisions, and will have surpassed some of our finest academic libraries in terms of content and breadth of collection. Not a bad deal for them.
But I’ll digress, since that’s a different discussion. I’m really fascinated by Google’s image, and how they have been able to maintain their reputation as a socially conscious company that’s open for integration by others and friendly to open source. However, if you really think about it, that image doesn’t fit reality. Google has in effect cultivated this image of “open” by making available bread crumbs into their systems. Folks have often been able to do very cool things with these bread crumbs (Google Maps, the search API, etc.), but these really are only a small part of the Google machine. While Google offers APIs, it is also, without a doubt, one of the most proprietary companies that I’ve ever seen. Their answers in fiscal filings are difficult to pin down (that’s true of just about any large corporation, though) and they vigorously (actually, that’s an understatement) guard their search algorithms. Folks should make no mistake: they are a big business and they act like any other business, but they’ve just somehow been able to wrap themselves in a cloak of openness. In fact, I’ve started to wonder if Google isn’t the ultimate leech. Now, leeches aren’t bad things; they’ve been used in medicine for years. But leeches don’t produce anything, and lately I’ve been starting to wonder what exactly Google has produced. There is their search engine, which, while still wildly popular, is no longer my first choice for all types of searches, and then their ad placement. And while innovative in their time, even these services don’t really produce anything. Outside of that, I can think of a lot of places where Google is taking other people’s content and repackaging it (or using it to sell versions of it), or swallowing technologies and assimilating them into the collective. Maybe that should be Google’s new motto: “Resistance is futile.” Well, maybe not. 😉
Quick update to MarcEdit 5.0. The following has been updated:
Support for .pac files in the Z39.50/SRU client
I think that this works. We don’t use .pac files anymore, so let’s just say that if the documentation I have is correct, it should work.
Corrected problem with Find function. (The program wouldn’t ‘find next’ if the first item found was the first character)
EAD, Dublin Core and the OAIMODs stylesheets have been updated
Tab delimited function — users can now insert the LDR and 008 into a record. Users are responsible for making sure that the LDR and 008 data is in the correct format; the wizard only imports data, it doesn’t validate it.
OAI Harvester — the MODS option has been turned on. It’s been tested against the American Memory project’s MODS output, so hopefully other implementations will be as regular.
There are other changes as well — a couple of things to clean up some older code in the MarcEngine, the Editor, etc.
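On the tab-delimited point above: since the wizard imports but doesn’t validate, a quick length check is worth running on your source data before import. In MARC21, the leader (LDR) is exactly 24 characters and the 008 fixed field is exactly 40. This is just an illustrative sketch of such a check, not MarcEdit’s actual code.

```python
def check_fixed_fields(ldr, f008):
    """Return a list of problems with a leader/008 pair.
    MARC21 lengths: LDR = 24 characters, 008 = 40 characters."""
    problems = []
    if len(ldr) != 24:
        problems.append(f"LDR is {len(ldr)} chars; MARC21 requires 24")
    if len(f008) != 40:
        problems.append(f"008 is {len(f008)} chars; MARC21 requires 40")
    return problems

# A well-formed pair passes; a truncated 008 gets flagged.
ok = check_fixed_fields("00000nam a2200000 a 4500", "0" * 40)
bad = check_fixed_fields("00000nam a2200000 a 4500", "0" * 38)
```

A fuller check would also validate the positional values within each field, but even a bare length test catches the most common cut-and-paste damage.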
BTW, the documentation has been finished; I just need to move it into the Content Management System that I’m going to be storing the docs in from now on. I’m guessing this will take a week or so. Once this is done, I’ll swap the old help for the new and take the program out of beta. At that point, I’ll likely start the process of moving MarcEdit from .NET 1.1 to 2.0 and look at revising the GUI elements of the opening window. One thing that I don’t like about the opening window is the multiple menu items (side menu and menu bar), and I’d like the making and breaking functions to be available from the opening window (since they are used most often). If anyone has suggestions on how this can be cleaned up, give me a holler. I’m going to be asking some folks at OSU who do UI development for suggestions as well, to see if I can’t streamline usability a bit.
Well, I got off to a bit of a slow start today. I stayed with my brother and sister-in-law in Vancouver, WA, and had to make the trip across the river back into Portland. The sessions started at 9:30 am, so I took off around 9, figuring that 30 minutes would be plenty of time. However, I was wrong. It took ~45 minutes just to travel the 4 miles on I-5 to get out of Vancouver and into Portland; final travel time, ~1 1/2 hours. So instead of 9:30, I showed up at 10:30 am, which means I missed the first session of the day, entitled “One-stop shopping for journal holdings: the ideal and the reality.” Fortunately, we had a lot of folks from OSU at the event, so I’m sure one of our group had an opportunity to take in this session.
The rest of the day I spent either speaking or preparing to speak. I did two topics. One was on III’s global update functionality, an Innovative-specific application for database maintenance. The second continued this recent spate of evangelism I’ve been participating in regarding the need to require our vendors to provide open APIs. The talk was entitled “Being innovative without Innovative,” and I thought it went well. I actually recorded the talk, but I’m never sure what I can and can’t post (III’s user groups, both national and regional, follow some courtesy rules when dealing with III topics), so I’ll have to see if I can post it.
A discussion of why and how the King County Library brought up AquaBrowser. AquaBrowser is an interesting application, but to be honest I’m not sure what to think about it. However, what I did find interesting was how they sync data between AquaBrowser and III. I have a lot of methods that I use to extract data, but none of them would scale to exporting our catalog on a nightly basis. So what I really enjoyed was hearing about III’s MarcOut tool. Apparently, this tool provides a simplified method for extracting your MARC data. It’s a tool I’m unfamiliar with, so I’m going to be spending some time chatting with the help desk to figure out what it is and how we can make use of it.
Sion Romaine and Linda Pitts, University of Washington
This session focused on the implementation of MARC holdings within III and UW’s process of converting their free-text holdings into MFHDs. The presentation gave a very quick overview of the MFHD format, as well as some information on the problems they encountered both in moving the free-text data and in dealing with some III quirks in how the holdings information is rendered.
I actually felt a little bad attending this session. About 8 months ago, UW had asked if I would be willing to help them do an automatic conversion of these records. At the time, I had time to work on it, and spent a while talking with them about the various things needed to do the conversion automagically. I’ve done this in the past for other libraries, but it takes a lot of time to get done, and unfortunately their desired start date landed in my busy season (June – August), so I couldn’t dedicate the time to work as closely with them on this as I would have liked.
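For what it’s worth, the lowest-effort starting point for a conversion like this is usually not parsing the free text into 853/863 caption/enumeration pairs at all, but dropping each legacy statement into an 866 (textual holdings) field and refining from there. The sketch below shows that idea in MarcEdit’s mnemonic format; the indicator choices are assumptions for illustration, and this is emphatically not UW’s actual conversion process.

```python
def free_text_to_866(statement):
    """Wrap a free-text holdings statement in an 866 field
    (MarcEdit mnemonic format). Indicators assumed for illustration:
    '4' = holdings level 4, '1' = ANSI/NISO Z39.71 notation --
    real data would need review before asserting either."""
    return "=866  41$a" + statement.strip()

field = free_text_to_866("  v.1-v.10 (1990-2000)  ")
```

Textual 866s sacrifice the machine-actionable prediction you get from 853/863 pairs, which is exactly why the real conversion takes so much time: the payoff is in the parsing, not the wrapping.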