MarcEdit update notes

Since a few folks have asked, the update mentioned last week is coming along nicely. At this point, I have a number of MarcEdit users that are testing new features and the install program (both individually and in an enterprise environment). Things are looking good, so it will be out the door as soon as all reports have come in. Some notes again on the update:

Extending the Z39.50/SRU client to provide multi-target searching.
This is a frequently requested feature, but I’ve always held off adding it as a courtesy to OCLC (a number of years ago, when MarcEdit was about to add a Z39.50 client, I received a request to keep it simple). It’s been a long time and times have changed, so I’ll be lifting this restriction slowly. To start, the program will allow searching of 3 databases at once. As I finish rethreading the requesting engine, I’ll likely remove even that restriction.
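For a sense of what capped multi-target searching looks like, here is a minimal Python sketch. It is not MarcEdit’s code: `search_targets` and the `search_one` callback are hypothetical names, and the callback stands in for whatever Z39.50/SRU request a real client would send. Only the limit of 3 databases comes from the notes above.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_TARGETS = 3  # the initial cap described in the post


def search_targets(query, targets, search_one, max_targets=MAX_TARGETS):
    """Run search_one(target, query) against every target in parallel.

    search_one is a hypothetical callback supplied by the caller.
    Returns a dict mapping each target to its result, or to the
    exception it raised, so one failing server does not sink the rest.
    """
    if len(targets) > max_targets:
        raise ValueError(f"at most {max_targets} targets can be searched at once")

    results = {}
    if not targets:
        return results

    # One worker thread per target, so slow servers do not block fast ones.
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        futures = {target: pool.submit(search_one, target, query) for target in targets}
        for target, future in futures.items():
            try:
                results[target] = future.result()
            except Exception as exc:  # keep the error next to the target that caused it
                results[target] = exc
    return results
```

Raising the cap would then be a matter of changing `max_targets`, although a real client would also need timeouts and result merging, which this sketch leaves out.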
Improved large UTF-8 file support in MarcEdit:
MarcEdit has what I call a preview mode. It essentially opens a snippet of a large file in the editor so users can preview global changes that they might run on a record. This is how I like to work with large file sets. However, some people like loading the entire file into a viewable space. Files under 30 MB are fairly harmless and easy to load into any traditional editor, but files over 30 MB are difficult to view in Windows. The reason for this is memory. For every byte loaded, it takes ~4 bytes of RAM to view. So, a 30 MB file consumes 120 MB of system memory. A 60 MB file would consume 240 MB of memory, and so on (I’m oversimplifying this, but you get the general idea).
MarcEdit has traditionally relied on a customized version of what is called the RichEdit component. This is the same component Microsoft leverages in WordPad. It provides fairly good UTF-8 support, as well as a good deal of formatting support. The downside is speed. When loading lots of data, the component tends to get bogged down, as part of the loading process is converting data that isn’t rich text to rich text on the fly. For smaller documents, the conversion isn’t noticed. For larger documents, it’s a real time drain. Likewise, character sets like Hebrew, Arabic and Asian sets cause it all kinds of speed and resource problems.
So after sending emails back and forth with Microsoft’s support, I finally came up with a solution: write my own text rendering built upon the general textbox components in Windows. It took a while, but I’ve got it working, and early reports from testers are that it’s faster and consumes less memory. Of course, since I spun my own, I also set a hard limit of a half GB on the total file size that can be loaded into the editor. In reality, you’d run out of system memory before you were ever able to load a 1/2 GB, so this seemed more than reasonable.
Of course, if you have files larger than a 1/2 GB, you can load them just fine into MarcEdit for editing, so long as you utilize the preview mode.
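To make the numbers concrete, here is a small Python sketch of the memory estimate and the choice between a full load and a preview. The function names are hypothetical, the ~4x factor is the rough figure from this post rather than a measurement, and treating a half GB as 512 * 1024 * 1024 bytes is an assumption. The 1 MB preview size is the default mentioned in the comments below.

```python
MB = 1024 * 1024

RAM_BYTES_PER_FILE_BYTE = 4    # rough "~4 bytes of RAM per byte loaded" figure
HARD_LIMIT_BYTES = 512 * MB    # assumed reading of the half-GB cap on full loads
PREVIEW_BYTES = 1 * MB         # default preview snippet size


def estimated_ram_bytes(file_size_bytes):
    """Approximate RAM needed to display a whole file in the editor."""
    return file_size_bytes * RAM_BYTES_PER_FILE_BYTE


def choose_load_mode(file_size_bytes, preview_enabled=True):
    """Return 'preview', 'full', or 'refused' for a file of the given size."""
    if preview_enabled:
        return "preview"       # only a snippet is read, so any size is fine
    if file_size_bytes > HARD_LIMIT_BYTES:
        return "refused"       # a full load would exceed the hard limit
    return "full"


def read_preview(path, limit=PREVIEW_BYTES):
    """Read at most `limit` bytes from the start of the file."""
    with open(path, "rb") as handle:
        return handle.read(limit)


print(estimated_ram_bytes(30 * MB) // MB)  # 120
print(choose_load_mode(600 * MB))          # preview
print(choose_load_mode(600 * MB, False))   # refused
```

A real preview would also want to cut the snippet at a record boundary rather than at an arbitrary byte, which this sketch does not attempt.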
Installation:
You wouldn’t think that this would be a big deal. I mean, installing a program is one of the basic parts of program design. However, when you are talking about a large number of users or an enterprise system, it gets very complicated. MarcEdit previously utilized an installer that relied on the Nullsoft installer with a number of plug-ins and shims that I’d written myself. It was great for what it did, but poor for large enterprise-level installations. Up until recently, I really didn’t care, because most people that use MarcEdit simply download and install it. However, times are changing as more libraries move from in-house IT management to centralized IT management. The downside of this is that desktop management is all done through a central switch, with users losing the ability to install and manage their own software.
In the past month, I’ve had two system administrators (one from a very large university system) contact me inquiring about changing the install model. They want to be able to manage the program centrally. Makes sense, but when you have a user community measured in the tens of thousands, making these types of significant changes takes a lot of planning and work because you need to bring everyone along at the same time. There are a lot of gory details involved in making this work, but after many long nights, I finally was able to slay the beast that is the Windows installer and come up with a package that would be usable. Yay me! Right now, I’m having users from a wide variety of institutions, with various levels of customization, test the installer within their environments. So far, the only comments received have been cosmetic, which makes me think that this is close.

One additional note. I’ll be participating with a number of folks in the coming year on examining RDA and how it will require tools to change to support the new models. As I work on this and make changes, I’ll try to relay any pertinent information back to the user community.

–TR

Comments

2 responses to “MarcEdit update notes”

Jonathan Rochkind

February 2, 2009

“Of course, if you have files larger than a 1/2 GB, you can load them just fine into MarcEdit for editing, so long as you utilize the preview mode.”

Did you mean as long as you DON’T utilize the preview mode?

Thanks for continuing development on MarcEdit, it’s a great piece of software.

Administrator

February 2, 2009

Jonathan,

No, if you use the preview mode, MarcEdit will only load a snippet (by default, 1 MB) of your file. Using the preview mode to check edits, you can edit files of any size. It’s when you load the entire file into the MarcEditor (by turning the preview mode off or forcing loading of the entire file) that you run into the hard limit.

–TR