Aug 15 2014
 

So I missed one.  I made some changes to how MarcEdit loads data into the MarcEditor to improve performance, especially on newer equipment, and introduced a bug.  When making multiple global updates, the program will probably lose track of the last page of records.  This slipped through my unit tests because the program was reporting the correct number of changes, but when the program analyzed the file for indexing, it was dropping the last page.  Oops.  This was introduced in the update posted at 1 am on Aug. 15th.  I’ve corrected the problem and updated my unit tests so that this type of regression shouldn’t occur again.

One question that came up privately was why I made this change to begin with.  Primarily, it was about performance.  MarcEdit’s paging process requires the program to index the records within a MARC block.  In the previous approach, this resulted in a lot of disk reads.  Generally, this isn’t a problem, but I’ve had occasions where folks with lower disk speeds have had performance issues.  This change will correct that.  For files under 50 MB, the program will now read the entire file into memory, and process the data in memory to generate paging.  This is a more memory-intensive task (the previous method utilized a small amount of memory, whereas the new process can require allocations of 100-120 MB of system memory for processing) but removes the disk reads, which are the largest bottleneck within the process.

The effect of this change is a large performance gain.  On my development system, which has a solid state drive, the improvement loading a 50 MB file is over a second, going from 3.3 seconds to 1.8 seconds.  That’s a pretty significant improvement, especially on a system where disk reads tend to happen very quickly.  On my secondary systems, the improvements are more noticeable.  On an Intel I-5 with a non-solid state drive and 6 GB of RAM, the old process took between 3.7 and 4.1 seconds, while the new method loaded the file in between 1.6 and 1.8 seconds.  And on a tablet with an older Atom processor and 2 GB of RAM, the old process took approximately 22 seconds, while the new one took only 9 seconds.  These are big gains that I hope users will be able to see and benefit from.
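To make the trade-off concrete, here is a minimal Python sketch of the general idea: read a small file once, index the records in memory, then group them into pages. This is not MarcEdit's actual code. The function names and the 100-records-per-page default are assumptions made up for illustration; only the 50 MB cutoff comes from the description above. The 0x1D byte is the standard end-of-record terminator in MARC (ISO 2709) files.

```python
import os

RECORD_TERMINATOR = b"\x1d"          # ISO 2709 (MARC) end-of-record byte
IN_MEMORY_LIMIT = 50 * 1024 * 1024   # 50 MB cutoff mentioned in the post


def index_records(data):
    """Return (start, end) byte offsets for every terminated record in data.

    Bytes after the last terminator are ignored.
    """
    offsets = []
    start = 0
    while True:
        end = data.find(RECORD_TERMINATOR, start)
        if end == -1:
            break
        offsets.append((start, end + 1))
        start = end + 1
    return offsets


def build_pages(offsets, records_per_page):
    """Group record offsets into pages, keeping the final partial page."""
    return [offsets[i:i + records_per_page]
            for i in range(0, len(offsets), records_per_page)]


def load_pages(path, records_per_page=100):
    """Hypothetical helper: index a small file entirely in memory."""
    if os.path.getsize(path) > IN_MEMORY_LIMIT:
        raise ValueError("file too large for the in-memory path")
    with open(path, "rb") as handle:
        data = handle.read()  # one sequential read instead of many small ones
    return build_pages(index_records(data), records_per_page)
```

Note that build_pages keeps the last, possibly short, page. Dropping that final page is exactly the kind of regression described in this post, so it is worth a dedicated test.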

 

Testing Results: Old Process

Machine Description | File Description | Time to Load (1st / 2nd / 3rd)
I-7 Dell XPS Ultrabook, 8 GB RAM, SSD | 45,922 records; 50 MB | 3.4 s / 3.3 s / 3.2 s
I-5 Dell Workstation, 6 GB RAM, 7200 rpm HD | 45,922 records; 50 MB | 4.1 s / 4.0 s / 3.8 s
Atom 1.5 GHz Acer tablet, 2 GB RAM, SSD | 45,922 records; 50 MB | 27 s / 22 s / 23 s

 

Testing Results: New Process

Machine Description | File Description | Time to Load (1st / 2nd / 3rd) | Diff
I-7 Dell XPS Ultrabook, 8 GB RAM, SSD | 45,922 records; 50 MB | 1.4 s / 1.3 s / 1.3 s | (2 s)
I-5 Dell Workstation, 6 GB RAM, 7200 rpm HD | 45,922 records; 50 MB | 1.8 s / 1.6 s / 1.6 s | (2.3 s)
Atom 1.5 GHz Acer tablet, 2 GB RAM, SSD | 45,922 records; 50 MB | 10.1 s / 9.6 s / 9.7 s | (10-18 s)

 

While the new process appears to provide better performance on many different types of systems, I realize that there may be some system variations that do not benefit from this new method.  To that end, I’ve added a new configuration option in the MarcEditor Preferences that allows users to turn off the new paging method.  By default, this option is selected.


If you update the program via MarcEdit, the download will be offered automatically the next time you use the program.  Otherwise, you can get the update at: http://marcedit.reeset.net/downloads

 

–TR

 Posted at 11:20 pm
Aug 01 2014
 

After three months of work, I’ve posted the latest and greatest version of MarcEdit: MarcEdit 6.0.  This represents a significant change on a number of fronts.  First and foremost, a lot of code has been reformatted to provide better support for non-Windows systems.  For the first time, a MarcEdit packaged App file is being provided to give Mac users one-click installation of the application.  This means that at this point, the installation process for a Mac is as simple as: install Mono, download the MarcEdit app, unzip, and run.  For Linux systems, a zip file can be downloaded, which includes the 3 steps needed to run the application.

Now, while I have spent a significant amount of time and code working on MarcEdit to make it run better on non-Windows systems, I will say up front that this is still a work in progress.  You will occasionally run into UI issues and very likely run into times when the program crashes.  Additionally, at this time, Z39.50 functionality will not work on the Mac.  This should be temporary as I work on implementing a webservice that will simplify how MarcEdit provides these services and remove a dependency.  At this point, I have all the code for the service; I just need to find a place to host the project.  If you are using MarcEdit on a Mac, I’d love to hear how it’s going.  I’ll likely be spending the next year working to translate the over 150 GUI elements to MonoMac, a native Cocoa implementation of the UI elements.  This should solve many of the Mac UI problems and allow MarcEdit to look and feel like a regular Mac application.  However, this is a significant time commitment, so before I consider doing this work, I’ll want to get an idea of how widely used the Mac version of the application is.  So, if you are a Mac user and want to continue to see this improve, send me feedback.  Oh, and I’d be remiss if I didn’t thank the folks at the Ohio State University Libraries for making a Mac system available for me to do testing.  Without it, the Mac app bundle simply wouldn’t exist.

Now the changes: this version saw the deletion of over 10,000 lines of code, and the creation of close to 28,000 lines of new code.  There were a handful of new tools, lots of new functions, and a number of bug and UI fixes.  In the end, I closed 47 tickets.  For the full list of changes, please see the list below.

So how do you get the program?  If you are a current MarcEdit user, the program will prompt you to update the next time you run it.  The application will automatically try to migrate your data from MarcEdit 5.x to the new MarcEdit 6.0 directory, and will prompt you if successful.  If not, you can manually transfer the data by selecting the Help menu, About, and Migrate data.  If you are a Mac or Linux user, go to the download page and select the appropriate download.

For direct links:

MarcEdit 6 is an in-place replacement of MarcEdit 5.  You cannot run these two versions on the same system.  For that reason, I’m leaving the last version of MarcEdit 5.9 available in case users run into issues with 6.0 as everything shakes out.  To use that older version, you will have to uninstall MarcEdit 6, and then go to the download page at: http://marcedit.reeset.net/downloads.

–TR

**** Update List ****

Summary | Priority | Category | Project | Opened Date | Opened By | Due Date | Resolved As
MARC Tools window refresh | (3) Minor | Performance | MarcEdit | 11-Jun-14 | reeset |  | Fixed
Hard to find EULA after install | (4) Trivial | Bug | MarcEdit | 31-Jul-14 | reeset |  | Fixed
Extract by Material type | (3) Major | Enhancement | MarcEdit | 14-Jun-14 | reeset |  | Fixed
Allow tasks to have comments | (3) Major | Enhancement | MarcEdit | 14-Jun-14 | reeset |  | Fixed
Task List: Swap Field not saving Process one per field | (2) Major | Bug | MarcEdit | 12-Jun-14 | reeset |  | Fixed
UI Issue: Validation | (3) Minor | Bug | MarcEdit | 12-Jun-14 | reeset |  | Fixed
Adding multiple fields in the Add/Delete Function | (3) Major | Enhancement | MarcEdit | 12-Jun-14 | reeset |  | Fixed
Script Wizard: add/delete field with modifier | (3) Major | Bug | MarcEdit | 12-Jun-14 | reeset |  | Fixed
RobertCompare/MARCCompare | (3) Major | Enhancement | MarcEdit | 12-Jun-14 | reeset |  | Fixed
RDA Helper | (3) Major | Bug | MarcEdit | 11-Jun-14 | reeset |  | Fixed
Task Manager & Task List tool | (3) Major | Enhancement | MarcEdit | 14-Jun-14 | reeset |  | Fixed
Create a Mac Installer | (3) Major | Bug | MarcEdit | 11-Jun-14 | reeset |  | Fixed
Remove empty field: missing fields that should be removed | (3) Major | Bug | MarcEdit | 16-Jul-14 | reeset |  | Fixed
Refresh the Delimited Text Translator | (3) Minor | Enhancement | MarcEdit | 11-Jun-14 | reeset |  | Fixed
Add/Delete Field: Find What Field | (3) Minor | Bug | MarcEdit | 11-Jun-14 | reeset | 01-Jul-14 | Fixed
Sharing Tasks which refer to other tasks | (3) Minor | Bug | MarcEdit | 11-Jun-14 | reeset | 01-Jul-14 | Fixed
RDA Helper: generating extra blank lines | (4) Trivial | Bug | MarcEdit | 22-Jul-14 | reeset |  | Fixed
Preview File: show file name being previewed | (3) Minor | Bug | MarcEdit | 22-Jul-14 | reeset |  | Fixed
Edit Field Utility: field drop down list | (3) Major | Bug | MarcEdit | 30-Jul-14 | reeset |  | Fixed
Edit Field Utility: can’t edit the ldr | (3) Major | Bug | MarcEdit | 30-Jul-14 | reeset |  | Fixed
OCLC Integration: validation errors | (3) Major | Bug | MarcEdit | 23-Jul-14 | reeset |  | Fixed
On first load, Mac systems not auto detecting font | (3) Major | Bug | MarcEdit | ######## | reeset |  | Fixed
Changing Capital Letters with Diacritics to lower case | (3) Major | Bug | MarcEdit | 11-Jun-14 | reeset |  | Fixed
Allow the RDA Helper to be part of the Task List | (3) Major | Bug | MarcEdit | 11-Jul-14 | reeset |  | Fixed
merge records tool: customizing match points causes merges to not occur | (3) Major | Bug | MarcEdit | 19-Jun-14 | reeset |  | Fixed
Koha Integration: unable to authenticate when using self-signed certificates | (3) Major | Enhancement | MarcEdit | 19-Jun-14 | reeset |  | Fixed
Removing duplicate subject headings based on indicators | (4) Trivial | Enhancement | MarcEdit | 17-Jun-14 | reeset |  | Fixed
Prompt for confirmation of task deletion | (4) Trivial | Enhancement | MarcEdit | 17-Jun-14 | reeset |  | Fixed
Mac Fonts | (3) Major | Bug | MarcEdit | 17-Jun-14 | reeset |  | Fixed
Ability to import RIS format | (3) Major | Enhancement | MarcEdit | 17-Jun-14 | reeset |  | Fixed
Task List: Syncing lists between multiple computers | (3) Major | Enhancement | MarcEdit | 14-Jun-14 | reeset |  | Fixed
Merged Records: Error thrown when dealing with records with more than 10 duplicate control numbers | (1) Critical | Bug | MarcEdit | 15-Jul-14 | reeset |  | Fixed
Edit Subfield and Swap Field Edits | (3) Major | Enhancement | MarcEdit | 14-Jun-14 | reeset |  | Fixed
Find All: Usability Error; won’t jump to first record. | (3) Major | Bug | MarcEdit | 11-Jul-14 | reeset |  | Fixed
Validate Temp File Path Before Generating files | (3) Minor | Bug | MarcEdit | 20-Jun-14 | reeset |  | Fixed
Generate Control Number: invalid lines | (3) Major | Bug | MarcEdit | 10-Jul-14 | reeset |  | Fixed
Plugin Updates: make it easier to track plugin updates | (3) Minor | Enhancement | MarcEdit | 20-Jun-14 | reeset |  | Fixed
Task List: Allowing Description to show in Task Manager and Task Editor | (3) Minor | Enhancement | MarcEdit | 20-Jun-14 | reeset |  | Fixed
Automated update of plugins | (3) Major | Bug | MarcEdit | 20-Jun-14 | reeset |  | Fixed
Add Export to Excel on the OCLC Plugin | (3) Major | Enhancement | MarcEdit | 17-Jul-14 | reeset |  | Fixed
Edit field status reporting fields edited not actual modifications | (3) Major | Bug | MarcEdit | 17-Jul-14 | reeset |  | Fixed
Ability to Undo a Task | (2) Major | Enhancement | MarcEdit | 17-Jul-14 | reeset |  | Fixed
Initial case bug | (3) Major | Bug | MarcEdit | 17-Jul-14 | reeset |  | Fixed
 Posted at 11:27 pm
Oct 31 2013
 

Ever since having kids, one of my favorite parts of Halloween has been the carving of the jack-o-lanterns.  Each year, the family picks out different characters, and I try to see what I can do with each.  This is our 2013 crop of jack-o-lanterns.

Calvin and Hobbes

Sherlock

Snow Leopard (created from a picture)

–tr

 Posted at 3:32 pm
Oct 10 2013
 

I got a reminder from Oregon State University that my old web space will be disabled this weekend.  I’ve since moved all the content to a new location, but be aware that it will likely take the search engines some time to update these redirects since so many people linked to the older URLs. 

I’ve also migrated nearly all of the other content to: http://reeset.net

–tr

 Posted at 1:33 pm
Apr 16 2013
 

Just a couple of notes.  I’ve been spending quite a bit of time as of late testing the current MONO builds against the current MarcEdit codebase.  It looks like the latest stable version works just fine with the program, while the 3.0.x branch has a few rendering issues (not sure if they are mine or not — I’m looking into it).  Anyway, what this means is that I’m looking at working on trying to simplify the installation process for folks using the software on Linux/Mac systems.  I’m starting with Linux (since I run Linux).  Right now, I’m working on using a tool called: makeself (http://megastep.org/makeself/) which will create a self extracting archive and then allow me to execute a cleanup script to take away a number of the necessary steps during the install process.  Ideally, the steps would be:

  1. Run the install script
  2. Run MarcEdit
           OR
  3. Get a report of missing dependencies

This would remove the need to remember to run the bootloader to clean up and set install paths, as well as get the configuration right for Z39.50 use if yaz is installed on the system.  It should also give me a method to check dependencies and provide a list of packages needing to be installed, to make installation as simple as possible (though at this point, the only dependencies are mono-core and the System.Windows.Forms libraries, which sometimes are and sometimes aren’t included in a specific distribution’s definition of “core” files).
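As a rough illustration of the dependency-report step, here is a small Python sketch. It is not the actual install script, and the command names it looks for ("mono" for the Mono runtime, "yaz-client" for yaz) are assumptions about how those packages show up on a given system. It only checks for executables on the PATH, so a library such as System.Windows.Forms would need a separate, distribution-specific check.

```python
import shutil

# Assumed command names, for illustration only.
REQUIRED_COMMANDS = ["mono"]
OPTIONAL_COMMANDS = ["yaz-client"]


def missing_commands(commands, lookup=shutil.which):
    """Return the commands that lookup cannot find on the PATH."""
    return [name for name in commands if lookup(name) is None]


def report(required=REQUIRED_COMMANDS, optional=OPTIONAL_COMMANDS,
           lookup=shutil.which):
    """Build a list of human-readable lines describing what is missing."""
    lines = []
    for name in missing_commands(required, lookup):
        lines.append("missing (required): " + name)
    for name in missing_commands(optional, lookup):
        lines.append("missing (optional): " + name)
    return lines
```

An installer could print the lines returned by report() and stop if any required entry is listed, while treating a missing optional entry as a reason to disable the Z39.50 configuration rather than fail.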

Once I have this build process defined, I’ll find myself a mac and give MONO’s MacPackager a whirl.  It seems like that might be a simple solution to packaging the program, and  at this point, all the Z39.50 components can be disabled to greatly simplify the process.

Ideally, once I have an installer in place, I should be able to tweak MarcEdit’s automated updater so that Mac/Linux users receive notifications and downloads of the current installers.  From there, they would just need to run them within their own specific environments.  Easier?  I hope so.  And as always, I’ll continue to provide a plain zip file for download, for those folks in environments that don’t allow the installation of software.

Questions, comments — let me know.

–tr

 Posted at 10:12 am

Moving domains

Mar 24 2013
 

If you are reading this, then the redirections that I’ve temporarily put into place are working.  I’m currently in the process of moving all of my content off of the oregonstate.edu domain and onto my new personal domain at: reeset.net.  This means everything from my blog: http://blog.reeset.net to MarcEdit: http://marcedit.reeset.net.  It’s been a slow process, and one that I’m still working on – but it’s coming along.

The redirections I’ve put in place should stay live until around June 2013.  I’m pretty sure that’s when my personal OSU web space will be turned off, though all the data should be migrated long before then.  Hopefully, with the new domain, I won’t have to go through this process again if I ever change jobs in the future.

–TR

 Posted at 12:28 am
Jan 26 2013
 

Sunday, I’ll be making an exception to my no ALA rule (well, it’s not a rule, but I’ve yet to find a good reason to get back involved) and will be in Seattle giving a talk as part of the ALCTS CaMMS program.  The focus is around research, and I thought it would give me a chance to talk to folks about the RDA Helper and some of the practical research I’ve been doing with MarcEdit to help librarians support RDA encoding rules in MARC through data mining and automatic data creation.  It’s not all that sexy when you consider some of the more abstract concepts that others (and I) have been working on, but with March 2013 approaching, I’m hoping that this practical work will help to make a real difference while we work out how to make the more abstract concepts real in our production environments.

I’ll post the slides tomorrow, but if you are interested in the RDA Helper, you can find out more about it here:

 

If you are in Seattle and want to say hi, drop me a line or you can find me here: http://connect.ala.org/node/195883

–TR

 Posted at 11:19 am
Nov 30 2012
 

I occasionally need to build small projects (web and command line) to do various things.  A lot of times, these projects could be done in things like Rails or another framework, but honestly, I don’t need that much overhead, so I fall back to using PHP or Perl.  Between the two, I prefer working with PHP and setting it up to work as a traditional shell script.  It’s fairly easy, and here’s how.

First, you need to make sure that PHP CLI is installed.  You can do this by checking the version.
[reeset@enyo home]$ php -v
PHP 5.3.3 (cli) (built: Jul  3 2012 16:53:21)
Copyright (c) 1997-2010 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2010 Zend Technologies

Essentially, you are looking for the text “cli” in the version string.  If you have it, you are good to go.  If not, you’ll have to hunt down and install it, using apt-get for example.

Once you are set, you just need to hunt down the full path to the php binary.  I find that it often can be found at /usr/bin/php.  You’ll want to confirm that path because you’ll need it later.

Because I want to turn this into more of an executable script, I don’t want to have to run the script using the php command line.  So, I’m going to add a shebang to the first line of the script file.  This was new for me – I hadn’t realized that this was possible in PHP until I was looking through the manual, and for CLI scripts, I definitely prefer this method. 
Example script:
#!/usr/bin/php
<?php
      // The shebang above must be the very first line of the file.
      echo "PHP CLI script \n";
?>

The shebang allows the script to be run without referencing the PHP executable in the command line.  Now, all you need to do is make the file executable.
chmod +x yourphpfile.php

With the file set as executable, you can now run the script like any other shell script, i.e., ./yourphpfile.php

–TR

 Posted at 11:36 pm
Nov 28 2012
 

This afternoon, I put the finishing touches on the update/creation components for the ILS direct integration framework in MarcEdit.  As a proof of concept, I’ve continued to demonstrate the processing utilizing Koha’s API.  The Koha API has been abstracted as a separate class so I can push it up to GitHub for those interested in a C# implementation.

I’ve posted the following video to demonstrate the update/creation process utilizing MarcEdit configured for Koha integration. 

–TR

 Posted at 1:53 am