The MarcEdit 7 Song

By reeset / On / In MarcEdit

MarcEdit 7 represents the next generation of the MarcEdit software. And aside from having new features, new options, and better performance, MarcEdit 7 also has its own song. Yes, it was written by Jeff Edmunds, a writer and creator of many cataloging songs (which I can’t seem to find on YouTube any longer, which is definitely a shame). I’d asked Jeff at one point why MarcEdit didn’t have a song, so he wrote one. Seriously though, as faculty, researchers, and librarians, we sometimes take the work that we do a little too seriously. I like to periodically remind myself that not only am I fortunate to hold a position that affords me the opportunity to do research and contribute to a vibrant community; I also have a lot of fun doing it. And so, like all serious software releases, I present to you the MarcEdit 7 song, introducing MarcEdit 7.

Welcome to MarcEdit 7 — the MarcEdit Song

Best,

–tr

MarcEdit 7 Update

By reeset / On / In MarcEdit

It took less than a week for the first bug to show up. I have some UI changes that I’d like to make over the weekend, but I wanted to take the time to close this particular issue. The first bug was found in the field dedup option in the Add/Delete Field function. This option was rewritten to allow field deletion preference. The issue occurred when some data was left empty. This update corrects that issue, as well as adds one feature that I’ve been interested in having since starting the revisions – window transparency in MarcEditor functions.

So what do I mean by window transparency? When you open MarcEdit 6 or 7, load a file into the MarcEditor, and select an option like the Add/Delete Field tool, the tool window covers the Editor. Since the Editor is the owner, the tool window needs to be moved to see the data underneath. That bothers me. Here’s what this looks like today:

To get at the data under the window, I have to move the Add/Delete Field window, and if I use a smaller screen (and I do), this can mean moving it to the edges of my screen. So, I added a new option to the Ease of Access section in the Preferences. You can enable window transparency, and when a window has an owner (not modal – there is a difference: message boxes are modal and stay on top until some input occurs), the window will become transparent when not active. This allows you to see the underlying data. So, let’s look at this same example with transparency enabled.

 

Note that I can now see the underlying window data in the MarcEditor. Select the Add/Delete Field box again, and the window becomes active and solid. I can now shift between the two windows without having to move my dialogs, and that makes me happy.

To enable this new function, you simply need to go to the preferences, and select the ease of access section. There you will find the new transparency options.
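
For readers curious how this kind of behavior can be wired up, here is a minimal WinForms sketch. It is an illustration only, not MarcEdit’s actual code: the helper name TransparencyHelper, the method EnableFadeWhenInactive, and the default opacity of 0.5 are all assumptions. It relies only on the standard Form.Activated and Form.Deactivate events and the Form.Opacity property.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical helper; the names and the default opacity are illustrative assumptions.
public static class TransparencyHelper
{
    // Makes a tool window semi-transparent while it is inactive and solid
    // again as soon as it regains focus.
    public static void EnableFadeWhenInactive(Form toolWindow, double inactiveOpacity = 0.5)
    {
        if (toolWindow == null)
            throw new ArgumentNullException(nameof(toolWindow));
        if (inactiveOpacity < 0.0 || inactiveOpacity > 1.0)
            throw new ArgumentOutOfRangeException(nameof(inactiveOpacity));

        // Deactivate fires when focus moves to another window, such as the owner.
        toolWindow.Deactivate += (sender, e) => toolWindow.Opacity = inactiveOpacity;

        // Activated fires when the user clicks back into the tool window.
        toolWindow.Activated += (sender, e) => toolWindow.Opacity = 1.0;
    }
}
```

A tool window shown with toolWindow.Show(editorForm) is owned but not modal, so the editor stays usable and the tool window fades whenever focus moves back to the editor.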

 

Hopefully other users will find this feature useful as well.

You can download the new update at: http://marcedit.reeset.net/downloads or the program will automatically prompt and download the update for you.

This weekend, I’ll be addressing a couple UI issues I’ve encountered and will likely add a couple features that didn’t make it into the initial release.

Questions, let me know.

–tr

MarcEdit 7 is Here!

By reeset / On / In MarcEdit

After 9 months of development, hundreds of thousands of lines of changed code, and 3 months of beta testing, during which tens of millions of records were processed using MarcEdit 7, the tool is finally ready. Will you occasionally run into issues? Possibly – any time that this much code has changed, I’d say that there is a distinct possibility. But I believe (hope) that the program has been extensively vetted and is ready to move into production. So, what’s changed? A lot. Here’s a short list of the highlights:

  • Native Clustering – MarcEdit implements the Levenshtein Distance and Composite Coefficient matching equations to provide built-in clustering functionality. This lets you group fields and perform batch edits across like items. In many ways, it’s a lightweight implementation of OpenRefine’s clustering functionality designed specifically for MARC data. Already, I’ve used this tool to cluster data sets of over 700,000 records. I believe 1 million to 1.5 million records could be processed with acceptable performance using this method.
  • Smart XML Profiling – A new XML/JSON profiler has been added to MarcEdit that removes the need to know XSLT, XQuery or any other Xlanguage. The tool uses an internal markup language that you create through a GUI-based mapper that looks and functions like the Delimited Text Translator. The tool was designed to lower barriers and make data transformations more accessible to users.
  • Speaking of accessibility, I spent over 3 months researching fonts, sizes, and color options – leading to the development of a new UI engine. This enabled the creation of themes (and a theme creator), the identification of free fonts (and a way to download and embed them for use directly in MarcEdit without the need for administrator rights), and a wide range of other accessibility and keyboard options.
  • New versions – MarcEdit is now available as 4 downloads. Two require administrative access and two can be installed by anyone. This should greatly simplify management of the application.
  • Tasks have been supercharged. Tasks that in MarcEdit 6.x could take close to 8 hours can now process in 10-20 minutes or less. New task functions have been added, tasks have been extended, and more functions can be added to tasks.
  • Linked data tools have been expanded. From the new SPARQL tools to the updated linked data platform, the tools have been updated to support better and faster linked data work. Coming in the near future will be direct support for HDT and linked data fragments.
  • A new installation wizard was implemented to make installation fun and easier. Users follow Hazel, the setup agent, as she guides them through the setup process.
  • Languages – MarcEdit’s interface has been translated into 26+ languages
  • .NET Language update – this seems like a small thing, but it enabled many of the design changes
  • MarcEdit 7 *no* longer supports Windows XP
  • Consolidated and improved Z39.50/SRU Client
  • Enhanced COM support, with legacy COM namespaces preserved for backward compatibility
  • RDA Refinements
  • Improved Error Handling and expanded knowledge-base
  • The new Search box feature to help users find help
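
As background for the clustering item above, the Levenshtein distance between two strings is the minimum number of single-character insertions, deletions, and substitutions needed to turn one into the other. The sketch below is a textbook version in C#, not MarcEdit’s implementation; the class and method names are assumptions made for illustration, and it does not cover the Composite Coefficient measure.

```csharp
using System;

// Hypothetical helper; a textbook Levenshtein distance, not MarcEdit's own code.
public static class EditDistance
{
    // Dynamic-programming edit distance that keeps only two rows of the table.
    public static int Levenshtein(string a, string b)
    {
        if (a == null) throw new ArgumentNullException(nameof(a));
        if (b == null) throw new ArgumentNullException(nameof(b));

        int[] previous = new int[b.Length + 1];
        int[] current = new int[b.Length + 1];

        // Turning the empty prefix of a into a prefix of b takes only insertions.
        for (int j = 0; j <= b.Length; j++)
            previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i; // deleting the first i characters of a

            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                int insertion = current[j - 1] + 1;
                int deletion = previous[j] + 1;
                int substitution = previous[j - 1] + cost;
                current[j] = Math.Min(Math.Min(insertion, deletion), substitution);
            }

            // The row just computed becomes the previous row for the next pass.
            int[] swap = previous;
            previous = current;
            current = swap;
        }

        return previous[b.Length];
    }
}
```

For example, EditDistance.Levenshtein("kitten", "sitting") returns 3. A clustering tool could treat field values whose distance falls under some threshold as candidates for the same group; the threshold itself would be a tuning choice.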

With these new updates, I’ve updated the MarcEdit Website and am in the process of bringing new documentation online. Presently, the biggest changes to the website can be seen on the downloads page. Rather than offering users four downloads, the webpage provides a guided user experience. Go to the downloads page, and you will find:

If you want to download the 64-bit version, clicking on the link presents the following modal window:

Hopefully this will help users, because I think that for the lion’s share of MarcEdit’s user community, the non-Administrator download is the version that most users should use. This version simplifies program management, sandboxes the application, and can be managed by any user. But the goal of this new downloads page is to make the process of selecting your version of MarcEdit easier to understand and empower users to make the best decision for their needs.

Additionally, as part of the update process, I needed to update the MarcEdit MSI Cleaner. This file was updated to support MarcEdit 7’s GUID keys created on installation. And finally, the program was developed so that it could be installed and used side by side with MarcEdit 6.x. The hope is that users will be able to move to MarcEdit 7 as their schedules allow, while still keeping MarcEdit 6.x until they are comfortable with the process and able to uninstall the application.

Lastly, this update is seeing the largest single creation of new documentation in the application’s history. This will start showing up throughout the week as I continue to wrap up documentation and add new information about the program. This update has been a long time coming, and I will be posting a number of tidbits throughout the week as I complete updating the documentation. My hope is that the wait will have been worth it, and that users will find the new version, its new features, and the improved performance useful within their workflows.

The new version of MarcEdit can be downloaded from: http://marcedit.reeset.net/downloads

As always, if you have questions, let me know.

–tr

MarcEdit MSI Cleaner Changes for MarcEdit 7

By reeset / On / In MarcEdit

The MarcEdit MSI Cleaner was created to help fix problems that occasionally occur when using the Windows Installer. When they do, it can become impossible to install or update MarcEdit. MarcEdit 7, from a programming perspective, is much easier to manage (I’ve removed all data from the GAC, the global assembly cache, and limited data outside of the user data space), but things can still occur that prevent the program from being updated. When that happens, this tool can be used to remove the registry keys that are preventing the program from updating or reinstalling.

In working on the update for this tool, there were a couple significant changes made:

  1. I removed the requirement that you had to be an administrator in order to run the tool. You will need to be an administrator to make changes, but I’ve enabled the tool so users can now run the application to see if the cleaner would likely solve their problem.
  2. Updated UI – I updated the UI so that you will know that this tool has been changed to support MarcEdit 7.
  3. I’ve signed the application…it has now been signed with a security certificate and now is identified as a trusted program.

I’ve posted information about the update here: https://youtu.be/HLnG8bczypQ.

If you have questions, let me know.

–tr

MarcEdit 7 staging

By reeset / On / In MarcEdit

As of 12 am, Nov. 27 – I’ve staged all the content for MarcEdit 7. Technically, if you download the current build from the Release Candidate page – you’ll get the new code. However – there’s a couple things I need to test and finish prepping, so I’m just staging content tonight. Things left to test:

  1. Automated update – I need to make sure that the update mechanism has switched gracefully to the new code-base, and I can’t test that without staging an update. So, that’s what I’m going to do tomorrow. Currently, I have the code running; tomorrow, I’ll update the build number and stage an update for testing purposes.
  2. I need to update the Cleaner program – while MarcEdit is easier to clean when installed only in the user space, the problem is that if an update becomes corrupted, you still have to remove a registry key. Those keys are hard to find, and the cleaner just needs to be updated to automatically find them and remove them when necessary.
  3. I want to update the delivery mechanism on the website. With MarcEdit 7, there are 4 Windows installers – 2 that install without administrative permissions, 2 that require them. I’d recommend that users install the versions that do not require administrative permissions, but there may be times when the other version is more appropriate (like if you have more than one user signing in on a machine). I’m working on a mechanism that will enable users to select 32 or 64 bit, and then get the 2 appropriate download links, with information about which version would be recommended and the use cases each version is designed for.

Questions, let me know.

–tr

MarcEdit 7: Release Candidate

By reeset / On / In MarcEdit

A new milestone was reached this past weekend, in that the MarcEdit 7 release candidate was posted. Over this next week, I’ll be working on tests, prepping final installation packages, writing documentation, and getting a package together for Linux installation. As I noted, the Mac version of MarcEdit will come later, as there are a number of UI changes that will need to be accommodated due to some differences with the new MacOS install. My guess at this point is that I should complete most of the Mac work by Christmas.

Keep an eye out for more information on the final release. At this point, it should happen on Nov. 26th.

-tr

MarcEdit 7 alpha weekly build

By reeset / On / In MarcEdit

The following changes were made:

  • Bug Fix: Export Tab Delimited Records (open file and save file buttons not working)
  • Enhancement: XML Crosswalk wizard — enabled root element processing
  • Bug Fix: XML Crosswalk wizard — some elements not being picked up, all descendants should now be accounted for
  • Bug Fix: Batch Process Function – file collisions in subfolders would result in overwritten results.
  • Enhancement: Batch Processing Function: task processing uses the new task manager
  • Enhancement: Batch Processing Function: tasks can be processed as subdirectories

I had intended to move the program into beta this Sunday, but the above issues made me decide to keep it in alpha for one more week while I finish checking legacy forms/code.

Downloads can be retrieved from: http://marcedit.reeset.net/marcedit-7-alphabeta-downloads-page

–tr

MarcEdit Delete Field by Position documentation

By reeset / On / In MarcEdit

I was working through the code and found an option that quite honestly, I didn’t even know existed.  Since I’m creating new documentation for MarcEdit 7, I wanted to pin this somewhere so I wouldn’t forget again.

A number of times on the list, folks have asked if they can delete, say, the second field in a field group. Apparently, you can. In the MarcEditor, select the Add/Delete Field tool. To delete by position, you would enter {#} in the find to denote the position to delete.

Obviously, this is pretty obscure – so in MarcEdit 7, this function is exposed as an option.

To delete multiple field positions, you just add a comma. So, say I wanted to delete fields 2-5, I would enter: 2,3,4,5 into the Field Data box and check this option. One enhancement that I would anticipate a request for is the ability to delete just the last field – this is actually harder than you’d think, in part because it means I can’t process data as it comes in, but have to buffer it first and then process it, and there are some reasons why this complicates things due to the structure of the function. So for now, it’s by direct position. I’ll look at what it might take to allow for more abstract options (like last).
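
The direct-position behavior described above can be sketched roughly as follows. This is not MarcEdit’s code: the class and method names are hypothetical, and the sketch assumes each field of a record arrives as one mnemonic-style line beginning with "=" followed by a three-character tag (for example "=650"). It illustrates why direct positions work in a single streaming pass, since a running counter is enough, whereas deleting the last occurrence would require knowing the total count first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of delete-by-position; the names and input format are assumptions.
public static class PositionDelete
{
    // Turns a specification such as "2,3,4,5" into a set of 1-based positions.
    public static HashSet<int> ParsePositions(string spec)
    {
        if (spec == null) throw new ArgumentNullException(nameof(spec));
        return new HashSet<int>(
            spec.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
                .Select(part => int.Parse(part.Trim())));
    }

    // Streams over one record's field lines and drops the occurrences of the
    // given tag whose 1-based position within that tag group is listed.
    public static IEnumerable<string> DeleteByPosition(
        IEnumerable<string> fields, string tag, ISet<int> positions)
    {
        int seen = 0; // occurrences of the tag encountered so far
        foreach (string field in fields)
        {
            if (field.StartsWith("=" + tag, StringComparison.Ordinal))
            {
                seen++;
                if (positions.Contains(seen))
                    continue; // skip this occurrence
            }
            yield return field;
        }
    }
}
```

Calling PositionDelete.DeleteByPosition(fields, "650", PositionDelete.ParsePositions("2,3")) on a record with four 650 fields keeps the first and fourth. Supporting "last" in this shape would mean buffering the whole record before emitting anything.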

–tr

MarcEdit 7 weekly build

By reeset / On / In MarcEdit

Below are the issues completed as part of the MarcEdit 7 weekly update. A couple of things to highlight:

  • I’ve integrated a check and download of a Unicode Font into the Startup Wizard. This will enable users to retrieve and install the Noto Fonts set into a private fonts collection for use by the application.
  • Clustering tools are now available as a stand-alone tool
  • New Translations
  • Lots of bug fixes

Thanks to all the folks that are downloading the alpha and trying it out.  Most of the bug reports are directly related to user testing.

  1. Enhancement: All processes: Updated Temp file management
  2. Bug Fix: Plugin Manager failing because it’s missing a column for MarcEdit version (note, none of the current plugins will work with MarcEdit 7)
  3. Enhancement: Added new languages for Croatian, Estonian, Indonesian, Hungarian, and Vietnamese
  4. Enhancement: Offer to download the Noto fonts into a private font collection when no Unicode font is present. This will make the fonts *only* available for use with MarcEdit.
  5. Bug Fix: When editing a task list, the list might not refresh. This occurs when you have a theme defined.
  6. Bug Fix: Update all the Z39.50/SRU databases (specifically — the lc databases point to the old voyager endpoint that I believe is turned off)
  7. Bug Fix: Working with Saxon, XSLT transformations that link to files with spaces or special characters fail
  8. Bug Fix: Clustering Tool — selecting a top level cluster would include # of records in the cluster, not just the data to copy
  9. Enhancement: Clustering Tools — Add to the Main Window as a stand-alone tool
  10. Bug Fix: On install, the file types are not associated
  11. Enhancement: New Fonts dialog to support private fonts collections
  12. Bug Fix: Fonts not sticking when using the startup wizard
  13. Enhancement: Added Unicode Font download to help
  14. Bug Fix: Z39.50/SRU downloads were only downloading as .mrk formatted data, not as binary MARC. The Tool has been updated to select download type by extension.
  15. Enhancement: Updated the Icon a bit so that it’s not so transparent on the desktop.

Finally, I recorded and uploaded a video demonstrating the new startup wizard options related to the Unicode fonts. Please see: https://youtu.be/7GWZ_UDUf00

The download can be retrieved from the MarcEdit 7 alpha/beta downloads page: http://marcedit.reeset.net/marcedit-7-alphabeta-downloads-page

Questions, let me know.

–tr

Saxon.NET and local file paths with special characters and spaces

By reeset / On / In C#

I thought I’d post this here in case it can help other folks. One of the parsers that I like to use is Saxon.NET, but within the .NET platform at least, it has problems doing XSLT or XQuery transformations when the files in question have paths with special characters or spaces (or if they reference files via xsl:include statements that live inside paths with special characters or spaces). The question comes up a lot on the Saxon support site, and it sounds like Saxon is actually processing the data correctly. Saxon is expecting valid URIs, and a URI can’t contain spaces. Internally, the URI is escaped, but when you process those escaped paths against a local file system, accessing the file will fail. So, what do I mean? Here are two different types of problems I encounter:

  • Path 1: c:\myfile\C#\folder1\test.xsl
  • Path 2: c:\myfile\C#\folder 1\test.xsl
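
To make the escaping problem concrete, here is a small sketch that percent-encodes each path segment with Uri.EscapeDataString. It only approximates what a URI-aware library does internally (it is not Saxon’s code, and PathEscapingDemo and EscapeSegments are hypothetical names), but it shows why the escaped form no longer matches anything on disk.

```csharp
using System;

// Hypothetical demo; it approximates URI escaping one path segment at a time.
public static class PathEscapingDemo
{
    // Percent-encodes every segment after the drive letter and joins the
    // segments with forward slashes, as they would appear in a URI.
    public static string EscapeSegments(string windowsPath)
    {
        if (windowsPath == null) throw new ArgumentNullException(nameof(windowsPath));

        string[] parts = windowsPath.Split('\\');
        for (int i = 1; i < parts.Length; i++) // index 0 is the drive, e.g. "c:"
            parts[i] = Uri.EscapeDataString(parts[i]);

        return string.Join("/", parts);
    }
}
```

For Path 2 this yields c:/myfile/C%23/folder%201/test.xsl. Windows has no folder named C%23 or folder%201, so opening that string as a file path fails, while Uri.UnescapeDataString recovers the original folder names.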

When setting up a transformation using Saxon, you set up an XSLT transformer. You can set this up using either a stream, like an XmlReader, or a URI. But here’s the problem. If you create the statement like this:

System.Xml.XmlReader xstream = System.Xml.XmlReader.Create(filepath);
transformer = xsltCompiler.Compile(xstream).Load();

The program can read Path 1, but will always fail on Path 2, and will fail on Path 1 if the stylesheet pulls in secondary files (for example, via xsl:include). If rather than using a stream, I use a URI class like:

transformer = xsltCompiler.Compile(new Uri(sXSLT, UriKind.Absolute)).Load();

Both paths will break. On the Saxon list, there was a suggestion to create a sealed class, and to wrap the URI in that class. So, you’d end up with code that looked more like:

using System;

transformer = xsltCompiler.Compile(new SaxonUri(new Uri(sXSLT, UriKind.Absolute))).Load();

// Wraps a Uri and controls the string form that is handed back to callers.
public sealed class SaxonUri : Uri
{
    public SaxonUri(Uri wrappedUri)
        : base(GetUriString(wrappedUri), GetUriKind(wrappedUri))
    {
    }

    // Absolute URIs are passed on in their escaped form; relative ones as typed.
    private static string GetUriString(Uri wrappedUri)
    {
        if (wrappedUri == null)
            throw new ArgumentNullException("wrappedUri", "wrappedUri is null.");
        if (wrappedUri.IsAbsoluteUri)
            return wrappedUri.AbsoluteUri;
        return wrappedUri.OriginalString;
    }

    private static UriKind GetUriKind(Uri wrappedUri)
    {
        if (wrappedUri == null)
            throw new ArgumentNullException("wrappedUri", "wrappedUri is null.");
        if (wrappedUri.IsAbsoluteUri)
            return UriKind.Absolute;
        return UriKind.Relative;
    }

    // Prefer the original string when it is already well formed.
    public override string ToString()
    {
        if (IsWellFormedOriginalString())
            return OriginalString;
        else if (IsAbsoluteUri)
            return AbsoluteUri;
        return base.ToString();
    }
}

And this gets closer. Using this syntax, Path 1 doesn’t work, but Path 2 will. So, you could use an if…then statement to look for spaces in the XSLT file path: if there are no spaces, open the stream, and if there are, wrap the URI. Unfortunately, that doesn’t work either, because if you include a reference (like xsl:include) in your XSLT, both Path 1 and Path 2 fail. Internally, the BaseURI is set to an escaped version of the URI, and Windows will fail to locate the file. At this point, you end up feeling like you might be pretty much screwed, but there are still other options; they just take more work. In my case, the solution that I adopted was to create a custom XmlResolver. This allows me to handle all the URI processing myself, and in the case of the two path statements, I’m interested in handling all local file URIs. So how does that work?

using System;
using System.Xml;

xsltCompiler.XmlResolver = new CustomResolver();
transformer = xsltCompiler.Compile(new Uri(sXSLT, UriKind.Absolute)).Load();

internal class CustomResolver : XmlUrlResolver
{
    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        // Non-file URIs, and file URIs that already resolve on disk, use the default behavior.
        if (!absoluteUri.IsFile || System.IO.File.Exists(absoluteUri.LocalPath))
            return base.GetEntity(absoluteUri, role, ofObjectToReturn);

        // The path may still contain escaped characters (such as %20 or %23); unescape and retry.
        string filename = Uri.UnescapeDataString(absoluteUri.LocalPath);
        if (System.IO.File.Exists(filename))
            return new System.IO.FileStream(filename, System.IO.FileMode.Open, System.IO.FileAccess.Read);

        // Nothing was found locally, so let the base resolver report the failure.
        return base.GetEntity(absoluteUri, role, ofObjectToReturn);
    }
}

By creating your own XmlResolver, you can fix the URI problems and allow Saxon to process both use cases above.

–tr