Because I’ve been doing a lot of work with MarcEdit and plug-ins, I thought I’d post some sample code for anyone interested in how this might work. Essentially, the sample project includes 3 parts — a host application, a set of Interfaces and a Shared library. Making this work requires a couple of important parts.
First, the host application (either the form or a class) needs to implement the set of interfaces. So for example, if interaction with a form in the host application were needed, you would configure the form to implement a set of interfaces. This would look like:
public partial class Form1 : Form, HostInterfaces.IHost
This implements the IHost interface (link to msdn) — a generic interface that allows you to pass objects between dynamically loaded libraries. .NET also includes an IScript interface that allows for scripting functionality.
Anyway, the interfaces are a lot like delegates — they define the visible functions/methods that will be accessible to a foreign assembly. This is the simplest file to create. It looks something like this:
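Something along these lines. Note that the member names below (SetLabelText, AddToolbarButton) are my own illustrative stand-ins based on what the sample project does, not the actual MarcEdit plug-in API:

```csharp
// HostInterfaces.cs -- the contract shared by the host and its plug-ins.
// The member names here are illustrative assumptions for this sketch.
using System;

namespace HostInterfaces
{
    public interface IHost
    {
        // Lets a plug-in update a label on the host form.
        void SetLabelText(string text);

        // Lets a plug-in add a button to the host's toolbar and
        // supply the handler for that button's Click event.
        void AddToolbarButton(string caption, EventHandler onClick);
    }
}
```

The host form implements these members, and any dynamically loaded assembly that receives an IHost reference can call them without knowing anything else about the host.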
Finally, the dynamic assembly has the ability to work with any function/object within the host application that has been made public through the interface. For this sample project, I’ve shown how to modify a label (on the host application), add a button to a toolbar and respond to click events from that button.
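To make that concrete, a plug-in’s entry point might look something like the sketch below. Again, the IHost member names are assumptions for illustration; the key point is that the dynamically loaded assembly talks to the host only through the interface:

```csharp
// Plugin.cs -- compiled as a separate class library; the host loads it
// at runtime (e.g., Assembly.LoadFrom + Activator.CreateInstance) and
// passes in its own IHost implementation.
// The IHost member names used here are illustrative assumptions.
using System;
using HostInterfaces;

namespace SamplePlugin
{
    public class Plugin
    {
        public void Load(IHost host)
        {
            // Modify a label on the host application.
            host.SetLabelText("Plug-in loaded");

            // Add a button to the host's toolbar and respond to its
            // click events from inside the plug-in.
            host.AddToolbarButton("Do Something",
                delegate(object sender, EventArgs e)
                {
                    host.SetLabelText("Button clicked");
                });
        }
    }
}
```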
The project is a simple one — but should go a long way towards showing how this works.
As I’d noted previously (http://blog.reeset.net/archives/479), some early testers had found that the Connexion plug-in that I’d written for MarcEdit stripped the 007. I couldn’t originally figure out why — it’s just a control field, and their syntax for control fields is pretty straightforward. However, after looking at a few records with 007 fields, I could see why. In Connexion, OCLC lets folks code the 007 using delimiters, like a normal variable MARC field (when it’s not) — and they save it as such, using delimiters. For example:
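I don’t have one of the test records handy, but the coding looks something like this — a hypothetical videorecording 007, with each byte carried as its own delimited subelement:

```
007     v ‡b d ‡d c ‡e v ‡f a ‡g i ‡h z
```

In a normal MARC record, those same values would simply run together as fixed positional bytes, with no delimiters at all.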
I’ll admit — I have no idea why they went with this format. From my perspective, it’s clunky. The 007, as a single control field, is fairly easy to parse, as it can have up to 13 bytes, with the number of bytes specified by the 0 byte of the data element. In this format, you actually have to create 9 different templates for the different possibilities in order to account for different field lengths, byte combinations and delimiter settings. Honestly, my first impression when looking at this was that it’s a perfect example of how something so simple can become much more difficult than it needs to be. Personally, I would have been happier had they broken from their MARCXML-like syntax for this one field to create a special 007 element. Again, this is something that could have been easily abstracted in the XSLT translation — but to be fair, I don’t think they figured anyone but OCLC’s Connexion team would ever be trying to work with this.
So how am I solving it? Well, one of the cool things about working with XSLT (and .NET in general) is the ability to use extensions to fill in missing functionality in the XSLT language (in my case, the ms:script extension in the msxml library). Since this transformation isn’t one that I’m really sharing (outside the plug-in), I’m not too worried about its portability. So, what I’ve done is create a number of helper C# functions and embedded them within the XSLT document to aid processing. For example:
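Along those lines, here’s a minimal sketch of what one of these embedded helpers can look like. The namespace URI, function name, and logic below are illustrative (not the plug-in’s actual code); the embedding mechanism is the msxsl:script element that .NET’s XslCompiledTransform supports:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt"
    xmlns:me="urn:my-helpers"
    exclude-result-prefixes="msxsl me">

  <!-- Embedded C# helper, compiled along with the stylesheet. -->
  <msxsl:script language="C#" implements-prefix="me">
    <![CDATA[
      // Illustrative: count the delimited subelements in a
      // Connexion-style 007 so one template can branch on field
      // length instead of needing 9 separate templates.
      public int CountElements(string field)
      {
          if (field == null || field.Length == 0) return 0;
          return field.Split('\u2021').Length;  // '\u2021' = double dagger
      }
    ]]>
  </msxsl:script>

  <xsl:template match="//field[@tag='007']">
    <xsl:variable name="cnt" select="me:CountElements(string(.))"/>
    <!-- choose processing based on $cnt ... -->
  </xsl:template>
</xsl:stylesheet>
```

On the C# side, the stylesheet has to be loaded with scripting enabled (an XsltSettings instance with EnableScript set to true), or the msxsl:script block is rejected.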
This is a simple function that I’m using to track the number of elements needed for the processing template. I don’t want to create 9 different XSLT templates, one for each processing type, so I’m using some embedded C# to simplify the process. On the plus side, using these embedded scripts makes the translation process much faster on the .NET side (since .NET compiles XSLT to byte code anyway before running any translation process), and this is a technique that I’d never really had to use before, so I was able to get a little practical experience. Still don’t like it though.
It’s interesting how this has played out in the video game market this year. When the PlayStation 3 came out, my wife and I started to think about getting a gaming console under the guise that it would be for my boys (really, it was for me). The idea, though, was that the gaming experience would be simple enough for my two boys (5 & 2) to be able to play with me. We tried a number of them out — but in each case, the button combinations needed to play games on the Xbox 360 and the PlayStation 3 made them unplayable for my boys. You simply need bigger hands if you want to play these games with these controllers. We’d kind of given up our search, expecting that we’d just have to wait a few more years before something like this would be playable for the boys. Then we visited Jeremy’s house and got to see his Wii.
The Wii, for those that haven’t had a chance to play, is a very different gaming experience. The console sacrifices some high-end graphics for simple gameplay and fun. The controllers are motion controlled, rather than primarily button controlled. It looked fun, the boys seemed to get it, so we got one.
We’ve had our Wii now for about a month, and I can tell you that Nintendo has a real winner here. I’m actually sitting here right now watching my 2 year old play “boxing”. He’s got fists of fury as he swings his arms around trying to “knock out” the cartoon characters on the screen. Periodically, I hear him telling me that he’s going to “knock his socks off”. Funny to watch. But it’s a system that he gets. He doesn’t have to click buttons, just swing the controllers. Simple interface. Of course, we see this happening in other corners, yet we still find it surprising. Game manufacturers, for example, are scrambling to get titles available for the Wii, in part because they assumed it would be a novelty and were surprised by its popularity. I wasn’t — but I had my own little usability crew showing me exactly why the simplest interface almost always wins.
So, let’s start out with a preface to my comments here. First, it’s a little on the long side. Sorry. I got a bit wordy and occasionally wander a little here and there :). Second — these reflect my opinions and observations. So with that out of the way…
This question comes from two experiences recently. First, at Midwinter in Seattle, a number of OSU folks and I met with Innovative Interfaces regarding Encore (III’s “next generation” public interface in development) and the difficulty that we have accessing our data in real-time without buying additional software or access to the system (via an API or, in III’s case, access via a special XML Server). The second experience has been the current eXtensible Catalog meeting here in Rochester, where I’ve been talking to a lot of folks who are currently looking at next generation library tools.
Sitting here, listening to the XC project and other projects currently ongoing, I’m more convinced than ever that our public ILS, which was once the library community’s most visible public success (i.e., getting our library catalogs online), has become one of the library community’s biggest liabilities — an albatross holding back our community’s ability to innovate. The ILS, and how our patrons interact with it, shapes their view of the library. The ILS — at least the part of the system that we show to the public (or would like to show to the public, like web services, etc.) — has simply failed to keep up with library patrons’ or the library community’s needs. The internet and the ways in which our patrons interact with it have moved forward, while libraries have not. Our patrons have become a savvy bunch. They work with social systems to create communities of interest — oftentimes without even realizing it. Users are driving the development and evolution of many services. A perfect example of this has been Google Maps — a service that, in and of itself, isn’t too interesting in my opinion. But what is interesting is the way in which the service has embraced user participation. Google Maps mashups litter the virtual world, to the point that the service has become a transparent part of the world the user is creating.
So what does this have to do with libraries? Libraries up to this point simply are not participating in the space that our users currently occupy. Vendors, librarians — we are all trying to play catch-up in this space by bandying about phrases like “next generation”, though I doubt anyone really knows what that means. During one of my many conversations over the weekend, something that Andrew Pace said really stuck with me: libraries don’t need a next generation ILS; they need a current generation system. Once we catch up, then maybe we can start looking at ways to anticipate the needs of our community. But until the library community creates a viable current generation system and catches up, we will continue to fall further and further behind.
So how do we catch up? Is it with our vendors? Certainly, I think there is a path in which this could happen. But it would take a tremendous shift in the business models utilized by today’s ILS vendors — a shift that nonetheless needs to occur. Too many ILS systems make it very difficult for libraries to access their data outside of a few very specific points of access. As an Innovative Interfaces library, our access points are limited by the types of services we are willing to purchase from our vendor. However, I don’t want to turn this into a rant against the current state of ILS systems. I’m not going to throw stones, because I live in a glass house that the library community created and has carefully cultivated to the present. I think, to a very large degree, the library community…no, I’ll qualify this, the decision makers within the library community — remember the time when moving to a vendor ILS meant better times for a library. This was before my time — but I still hear decision makers within the library community express apprehension about library-initiated development efforts, because the community had “gone down that road” before, when many organizations spun their own ILS systems and were then forced to maintain them over the long term. For these folks, moving away from a vendor-controlled system would be analogous to going back to the dark ages. The vendor ILS has become a security blanket for libraries — it’s the teddy bear that lets everyone sleep at night, because we know that when we wake up, our ILS system will be running, and if it’s not, there’s always someone else to call.
With that said, our ILS vendors certainly aren’t doing libraries any favors. NCIP, SRU/W, OpenSearch, web services — these are just a few standards that could easily be accommodated to standardize the flow of information into and out of the ILS, but that find little support in the current vendor community. RSS, for example — a simple protocol that most ILS vendors now support in one way or another — took years to finally be adopted.
Talking to an ILS vendor, I used the analogy that the ILS business closely resembles the PC business of the late ’80s and early ’90s, when Microsoft made life difficult for 3rd-party developers looking to build tools that competed against it. Three anti-trust cases later (US, EU and Korean), and Microsoft is legally bound to produce specific documentation and protocols that allow 3rd-party developers to compete on the same level as Microsoft itself. At which point, the vendor deftly noted that they have no such requirements — i.e., don’t hold your breath. Until the ILS community is literally forced to provide standard access methods to the data within their systems, I don’t foresee a scenario in which this will ever happen — at least in the next 10 years. And why is that? Why wouldn’t the vendor community want to enable the creation of a vibrant user community? I’ll tell you — we are competitors now. The upswing in open source development within libraryland has placed the library community in the position of competing with our ILS vendors. DSpace, Umlaut, LibraryFind, XC — these projects directly compete against products that our ILS vendors are currently developing or have developed. We are encroaching into their space, and the more we encroach, the more difficult I predict our current systems will become to work with.
A good example is the open source development of not one, but two mainstream open source ILS products. At this point in time, commercial vendors don’t have to worry about losing customers to open source projects like Koha and Evergreen, but this won’t always be the case. And let me just say, this isn’t a knock against Evergreen or Koha. I love both projects and am particularly infatuated with Evergreen right now — but the simple fact is that libraries have come to rely on our ILS systems (for better or worse) as acquisitions systems, serials control systems, ERM systems — and ILS vendors have little incentive to commoditize these functions. This makes it very difficult for an organization to simply move to or interact with another system. For one, it’s expensive. Fortunately, the industrious folks building Evergreen will get it to the point where it is a viable option — and when that happens, will the library community respond? I hope so, but I wonder which large ACRL organization will have the courage to let go of its security blanket and make the move — maybe for the second time — to using an institutionally supported ILS. But get that first large organization with the courage to switch, and I think you’ll find a critical mass waiting, and maybe, just maybe, it will finally breathe some competitive life into what has quickly become a very stale marketplace. Of course, that assumes the concept of an OPAC will still be relevant — but that’s another post, I guess.
Anyway, back to the meeting at Rochester. Looking at the projects currently being described, there is an interesting characteristic shared by nearly all “next generation” OPAC projects: all involve exporting the data out of the ILS. Did you get that? The software that we are currently spending tens or even hundreds of thousands of dollars on to do all kinds of magical things must be cut out of the equation when it comes to developing systems that interact with the public. I think this is the message that libraries, and those making decisions about the ILS within libraries, are missing. A quick look around at the folks recognized for creating current generation OPACs (the list isn’t long), like NCState, shows they have one thing in common — the ILS has become more of an inventory management system, providing information relating to an item’s status, while the data itself is moved outside of the ILS for indexing and display.
What worries me about the current solutions being considered (like Endeca) is that they aren’t cheap and will not be available to every library. NCState’s solution, for example, still requires NCState to run their ILS, as well as pay for an Endeca license. XC, an ambitious project with grand goals, may suffer from the same problem. Even if the program is wildly successful and meets all its goals, implementers may still have a hard time selling their institutions on taking on a new project that likely won’t save the organization any money upfront. XC partners will be required to provide money and time while still supporting their vendor systems. What concerns me most about the current path we are on is the potential to deepen the inequities that already exist between libraries with funding and libraries without.
But projects like XC, and the preconference at Code4lib discussing Solr and Lucene — these are developments that should excite and encourage the library community. As a community, we should continue to cultivate these types of projects and experimentation. In part because that’s what research organizations do — seek knowledge through research. But also to encourage the community to take a more active role in how our systems are developed and how they interact with our patrons.
Interesting…I’m not a big fan of governments legislating services — particularly a service like this, in part because there are available options to work around Apple’s FairPlay DRM and allow playback of downloaded items on other players. So I’m not sure what to think about this. On the one hand, it would be great if the whole DRM concept could be scrapped — or at least unified into a single open format. On the other, I find this kind of meddling to be very disturbing on a number of levels, so I guess we’ll see where it goes from here. Link below.
Uh, oh. Looks like the folks at Blackboard should have just left well enough alone. In a short period of time, a plethora of prior art was located — enough to cause a re-examination of the patent. See the link:
I always wonder why a company would try to enforce a patent that is questionable at best. The whole process of defending the patent is expensive — and the ill will generated certainly can’t be worth it. Ah, well.
I like to track Nicholas Carr’s blog, Rough Type, partly because it deals with a number of media issues — particularly with Wikipedia. So it was interesting to get his take on Wikipedia’s decision this last weekend to make all of its external links no-follow links. Here’s the post:
I’m giving a short 15 minute presentation at ALA (How Catalogers and System Developers Work Creatively with Metadata: 1:30-3:30 pm at the Convention Center) on the process OSU uses to move ETD data going into DSpace directly into MARC/OCLC and our catalog. It’s a pretty simple process. If you want to see how we do it, please attend. Otherwise, I’ll post the slides, script and XSLT stylesheet following the presentation.
Ouch — on Slashdot today, two not-so-flattering articles surrounding Wikipedia. In the first, there are reports of the German version of Wikipedia being used as a platform for spreading a virus. An interesting idea. Given that folks trust Wikipedia, no one seems to think twice about clicking on links that go outside of the tool.
And then the second article — this goes to the trust issue. As Wikipedia pushes itself into the mainstream, questions of plagiarism are sure to come up — and they have. Testing 12,000 articles, a researcher found a number of instances (128) of plagiarism within the encyclopedia. See this article here.
I was playing around with Saxon.NET and I like it. It’s also very well done (not surprisingly). Benchmarking, I found that it’s competitive with my custom processor for XSLT 1.0 while providing 2.0 support. I’ll be integrating this into MarcEdit, likely even this weekend, so folks can start using XSLT 2.0 in the application. I just need to figure out exactly how I’ll implement this engine — partly because my custom processor still outperforms Saxon when using XSLT 1.0. So I’ll likely mix the two processors — but allow users to define which they want to use.
Anyway, here’s a quick sample of how you utilize the XSLT processor in C#.
// Create a Processor instance.
Processor processor = new Processor();

// Read the source document into a string, then wrap it in an XmlTextReader.
System.IO.StreamReader reader = new System.IO.StreamReader(@"c:\OREboyscouts.xml", System.Text.Encoding.UTF8);
System.IO.TextWriter stringWriter = new System.IO.StringWriter();
stringWriter.Write(reader.ReadToEnd());
reader.Close();
System.IO.TextReader stringReader = new System.IO.StringReader(stringWriter.ToString());
System.Xml.XmlTextReader reader2 = new System.Xml.XmlTextReader(stringReader);
reader2.XmlResolver = null;

// Build the source tree, compile the stylesheet and run the transform
// (the stylesheet path here is illustrative).
XdmNode input = processor.NewDocumentBuilder().Build(reader2);
XsltTransformer transformer = processor.NewXsltCompiler().Compile(new Uri(@"file:///c:/transform.xsl")).Load();
transformer.InitialContextNode = input;
Serializer serializer = new Serializer();
serializer.SetOutputWriter(System.Console.Out);
transformer.Run(serializer);