I hope that this note finds everyone in good spirits. We are in the midst of the holiday season and I hope that everyone this reaches has had a happy one. If you are like me, the past couple of days have been spent cleaning up. There are boxes to put away, trees to un-trim, decorations to store away for another year. But one thing has been missing, and that has been my annual Christmas Eve update. Hopefully, folks won’t mind it being a little belated this year.
Enhancement: Clustering Tools: Added the ability to extract records via the clustering tooling
Enhancement: Clustering Tools: Added the ability to search within clusters
Enhancement: Linux Build created
Bug Fix: Clustering Tools: Numbering at the top wasn’t always correct
Bug Fix: Task Manager: Processing number count wouldn’t reset when run
Enhancement: Task Broker: Various updates to improve performance and address some outlier formats
Bug Fix: Find/Replace Task Processing: Task Editor was always incorrectly checking the conditional option. This shouldn’t affect how tasks run, but it was messy.
Enhancement: Copy Field: Added a new field feature
Enhancement: Startup Wizard — added tools to simplify migration of data from MarcEdit 6 to MarcEdit 7
One thing I specifically want to highlight is the presence of a Linux build. I’ve posted a quick video documenting the installation process at: https://www.youtube.com/watch?v=EfoSt0ll8S0. The MarcEdit 7 Linux build is much more self-contained than previous versions, something I’m hoping to do with the MacOS build as well. I’ll tell folks upfront, there are some UI issues with the Linux version – but I’ll keep working to resolve them. However, I’ve had a few folks asking about the tool, so I wanted to make it ready and available.
Throughout this week, I’ll be working on updating the MacOS build. I’ve fallen a little behind, so this build may take an extra week to complete (I was targeting Jan. 1; it might slip a few days past). In terms of functionality, I think folks will be happy, as it fills in a number of gaps while still integrating the new MarcEdit 7 functionality (including the new clustering tools).
As always, if you have questions, please let me know. Otherwise, I’d like to wish everyone a Happy New Year, filled with joy, love, friendship, and success.
As is my habit – there will be an update coming out around Christmas. And while it won’t be a large update (since MarcEdit 7 was just made available) – I think there will be a couple of new features that will make the changes worth it.
I’m continuing to enhance the clustering functionality – and for the Christmas update, I will be adding the ability to search within the clusters, as well as the ability to extract records from selected clusters (rather than just providing the ability to change the data in the cluster). By allowing the extraction of records within a cluster, this will give users the ability to use the clustering tools to extract record sets, and then run specific reports or perform selected edits against very targeted data.
New Search Functionality:
These two new clustering options should, I hope, give users some additional control over how they search for and interact with clustered data within MarcEdit. They also provide new functionality that continues to give all catalogers, regardless of their technical background, the ability to utilize the power that clustering data can provide.
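To make the idea of searching within clusters and then extracting records a little more concrete, here is a minimal Python sketch. It is not MarcEdit's code. The data layout (a dictionary of cluster labels pointing at record positions) and the function names search_clusters and extract_records are assumptions made up for illustration.

```python
def search_clusters(clusters, term):
    """Return the cluster labels that contain the search term, ignoring case."""
    needle = term.lower()
    return [label for label in clusters if needle in label.lower()]


def extract_records(records, clusters, selected_labels):
    """Collect, in file order, every record that belongs to a selected cluster."""
    wanted = set()
    for label in selected_labels:
        wanted.update(clusters[label])
    return [records[i] for i in sorted(wanted)]


# Hypothetical data: four records and three clusters that point back at them.
records = ["record A", "record B", "record C", "record D"]
clusters = {
    "Twain, Mark": [0, 2],
    "Austen, Jane": [1],
    "Twain, Shania": [2, 3],
}

hits = search_clusters(clusters, "twain")
subset = extract_records(records, clusters, hits)
print(hits)
print(subset)
```

The extracted subset is an ordinary list of records, so a report or a targeted edit could then be run against just that set.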
Copy Field Changes:
One common question, which usually involves utilizing a fairly complicated regular expression in MarcEdit’s multi-line replacement mode (which can be terrifying to use for some), is how to move fields from the same field group into a new field group. For example, when converting data from a non-MARC metadata format, it might not be possible to set up a process that distinguishes between first and second authors. So, the final transformation may look something like this:
=100 \\$aLast Name, First Name
=100 \\$aLast Name2, First Name
=245 10$aTitle
In this instance, it would be desirable to be able to move the data from the second 100 field into a 700 field. As noted above, this was previously accomplished with a regular expression. However, this update introduces a new option in the Copy Field Function: Move Field Data.
The Move Field Data option allows users to identify a field group, and then set the positions that shouldn’t be changed. So, in my example, I would set the preserve position to 1, which would then update field #2 (or 3 or 4 or 5, etc.). Currently the tool does not allow you to preserve a range of values, but I may try to flesh out that functionality in anticipation of the request, assuming that the process is straightforward. If it’s not, then I’ll wait for the request.
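As a rough illustration of the preserve-position idea, here is a short Python sketch that works on the mnemonic lines of a single record. It is not how MarcEdit implements Move Field Data. The function name move_field_data and its parameters are hypothetical, and the sketch only retags the field (it leaves indicators and subfields untouched).

```python
def move_field_data(lines, source_tag, target_tag, preserve=1):
    """Retag every occurrence of source_tag after the first `preserve` ones.

    `lines` holds the mnemonic lines of ONE record; the counter would need
    to be reset for each record when processing a whole file.
    """
    prefix = "=" + source_tag
    seen = 0
    result = []
    for line in lines:
        if line.startswith(prefix):
            seen += 1
            if seen > preserve:
                # Swap only the tag; the rest of the line is kept as-is.
                line = "=" + target_tag + line[len(prefix):]
        result.append(line)
    return result


record = [
    r"=100 \\$aLast Name, First Name",
    r"=100 \\$aLast Name2, First Name",
    r"=245 10$aTitle",
]

moved = move_field_data(record, "100", "700", preserve=1)
for line in moved:
    print(line)
```

With preserve set to 1, the first 100 field stays put and the second one becomes a 700 field, which matches the example above.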
MarcEdit 6 to MarcEdit 7 Migration
I’m working on a set of common questions that have come up, but one of the most common has been related to moving tasks from MarcEdit 6 into MarcEdit 7. By default, the tool attempts to make that transition for you – but in many cases, the process isn’t able to automatically transfer the data. So, I’ve been spending some time and adding this into the initial startup wizard. Now, when you first install MarcEdit 7 – the tool will attempt to determine if you have a copy of MarcEdit 6 installed on your machine. If you do, a new Wizard page will show up to walk you through the data migration process.
The new Wizard page looks like the following:
If you click on Select Data to Migrate
At this point, you can select the classes of data that you want to import into MarcEdit 7. For some users, they might want to pull all the data into MarcEdit 7; while others may just want tasks. Select Export – and then wait for the tool to complete migrating your data.
Linux Version of MarcEdit 7
Finally, on Christmas, I will post a zip file with instructions for running MarcEdit 7 on Linux. I’m still wrapping up the “build” but I’m hoping that this version of MarcEdit 7 will require zero configuration work to make it run – though, I will be updating the ReadMe file to match the new install/run information.
And I think that is mostly it. I may include some additional help information, a couple new videos/documentation pages – and the MacOS version of MarcEdit is still on target for Jan. 1, 2018.
If you have any questions, feel free to let me know.
This past weekend, I spent a bit of time doing some additional work around the task broker. Since the release in November, I’ve gotten a lot of feedback from folks that are doing all kinds of interesting things in tasks. And I’ll be honest, many of these task processes are things that I could never, ever, have imagined. These processes have oftentimes tripped the broker, which evaluates each task prior to running to determine whether the task actually needs to be run. So, I spent a lot of time this weekend with a number of very specific examples, working on updating the task broker to identify the outliers and let them through. This may mean running a few additional cycles that are going to return zero results, but I think that it makes more sense for the broker to pass items through with these edge cases, rather than attempting to accommodate every case (also, some of these cases would be really hard to accommodate). Additionally, within the task broker, I’ve updated the process so that it no longer looks only at the task to decide how to process the file. The tool now also reads the file to be processed and, based on file size, auto scales the buffer of records processed on each pass. This way, smaller files are processed closer to 1 record at a time, while larger files are buffered in record groups of 1500.
Anyway, there were a number of changes made this weekend; the full list is below:
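The buffer scaling described above could be sketched roughly like this in Python. The 1 and 1500 endpoints come from the description. Everything else is an assumption for illustration only, including the byte thresholds, the linear scaling between them, and the names choose_buffer_size and batches.

```python
MAX_BUFFER = 1500  # largest record group mentioned above


def choose_buffer_size(file_size_bytes,
                       small_file_bytes=1_000_000,      # assumed threshold
                       large_file_bytes=100_000_000):   # assumed threshold
    """Map a file size to a records-per-pass buffer between 1 and MAX_BUFFER."""
    if file_size_bytes <= small_file_bytes:
        return 1
    if file_size_bytes >= large_file_bytes:
        return MAX_BUFFER
    # Scale linearly between the two thresholds (an assumption).
    span = large_file_bytes - small_file_bytes
    fraction = (file_size_bytes - small_file_bytes) / span
    return max(1, round(1 + fraction * (MAX_BUFFER - 1)))


def batches(records, buffer_size):
    """Yield successive groups of at most buffer_size records."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == buffer_size:
            yield batch
            batch = []
    if batch:
        yield batch


print(choose_buffer_size(500_000))
print(choose_buffer_size(2_000_000_000))
print(list(batches(range(5), 2)))
```

A caller would pick the buffer once per file, for example from os.path.getsize, and then hand each batch to the task being run.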
Enhancement: Task Broker — added additional preprocessing functions to improve processing (specifically, NOT elements in the Replace task, updates to the Edit Field)
Enhancement: Task Broker — updated the process that selects the by record or by file approach to utilize file size in selecting the record buffer to process.
Enhancement: New Option — added an option to offer preview mode if the file is too large.
Enhancement: Results Window — added an option to turn on word wrapping within the window.
Enhancement: Main Window — More content allowed to be added to the most recent programs run list
** Added a shortcut (CTRL+P) to immediately open the Recent Programs List
Bug Fix: Constant Data Elements: if the source file wasn’t present, an error was thrown rather than automatically generating a new file. This has been corrected.
Update: Task Debugger — Updated the task debugger to make stepping through tasks easier.
Bug Fix: Task Processor — commented tasks were sometimes running — this has been corrected in both the MarcEditor and console program.
Enhancement: Status Messages have been added to many of the batch edit functions in the MarcEditor to provide more user feedback.
Enhancement: Added a check so that if you use “X” in places where the tool allows for field select with “x” or “*”, the selection is case insensitive (that has not been the default; it worked that way in MarcEdit 6, but this was technically not a supported format).
Updated Installer: .NET Requirements set to 4.6 as the minimum. This was done because there are a few problems when running only against 4.5.
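For the case-insensitive field select item in the list above, the matching rule can be pictured with a small Python sketch. This is only an illustration of treating “x”, “X”, and “*” as single-position wildcards, not MarcEdit’s actual matching code, and the function name field_matches is made up.

```python
WILDCARDS = {"x", "X", "*"}


def field_matches(pattern, tag):
    """Return True when a tag such as '650' matches a pattern such as '6xx', '6XX', or '6**'."""
    if len(pattern) != len(tag):
        return False
    for p, t in zip(pattern, tag):
        if p in WILDCARDS:
            continue  # a wildcard matches any single character
        if p != t:
            return False
    return True


tags = ["100", "245", "650", "651", "700"]
print([t for t in tags if field_matches("6XX", t)])
```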
With MarcEdit 7 out, I’ve turned my focus to completing the MarcEdit MacOS update. Right now, I’m hoping to have this version available by the first of the year. I won’t be doing a long, extended beta, in part, because this version utilizes all the business code written for MarcEdit 7. And like MarcEdit 7, the Mac version will be able to be installed alongside the current Mac build (it won’t replace it) – this way, you can test the new build while continuing to use the previous software.
MarcEdit 7 represents the next generation of the MarcEdit software. And aside from having new features, new options, and better performance – MarcEdit 7 also has its own song. Yes, really. Jeff Edmunds is a writer and creator of many cataloging songs (which I can’t seem to find on YouTube any longer – which is definitely a shame). I’d asked Jeff at one point why MarcEdit didn’t have a song, so he wrote one. Seriously though, as faculty, researchers, librarians – we sometimes take the work that we do a little too seriously. I like to periodically remind myself that not only am I fortunate to have a position that affords me the opportunity to do research and contribute to a vibrant community; I have a lot of fun doing it. And so, like all serious software releases, I present to you the MarcEdit 7 song, introducing MarcEdit 7.
It took less than a week for the first bug to show up. I have some UI changes that I’d like to make over the weekend, but I wanted to take the time to close this particular issue. The first bug was found in the field dedup option in the Add/Delete Field function. This option was rewritten to allow field deletion preference. The issue occurred when some data was left empty. This update corrects that issue, as well as adds one feature that I’ve been interested in having since starting the revisions – window transparency in MarcEditor functions.
So what do I mean by window transparency? When you open MarcEdit 6 or 7 and load a file into the MarcEditor – if you select an option like the Add/Delete Field tool – the tool window covers the Editor. Since the Editor is the owner, the tool window needs to be moved to see the data underneath. That bothers me. Here’s what this looks like today:
To get at the data under the window – I have to move the Add/Delete Field window – and if I use a smaller screen (and I do), this can mean moving to the edges of my PC. So, I added a new option to the Ease of Access section in the Preferences. You can enable window transparency, and when a window has an owner (not Modal – there is a difference – messageboxes are modal and stay on-top until some input occurs), the window will become transparent when not active. This allows you to see the underlying data. So, let’s look at this same example with transparency enabled.
Note that I can now see the underlying window data in the MarcEditor. Select the Add/Delete Field box again, and the window becomes active and solid. I can now shift between the two windows without having to move my dialogs, and that makes me happy.
To enable this new function, you simply need to go to the preferences, and select the ease of access section. There you will find the new transparency options.
Hopefully other users will find this feature useful as well.
After 9 months of development, hundreds of thousands of lines of changed code, 3 months of beta testing over which time, tens of millions of records were processed using MarcEdit 7, the tool is finally ready. Will you occasionally run into issues…possibly – any time that this much code has changed, I’d say that there is a distinct possibility. But I believe (hope) that the program has been extensively vetted and is ready to move into production. So, what’s changed? A lot. Here’s a short list of the highlights:
Native Clustering – MarcEdit implements the Levenshtein Distance and Composite Coefficient matching equations to provide built-in clustering functionality. This will let you group fields and perform batch edits across like items. In many ways, it’s a lightweight implementation of OpenRefine’s clustering functionality designed specifically for MARC data. Already, I’ve used this tool to provide clustering of data sets over 700,000 records. For performance’s sake, I believe 1 million to 1.5 million records could be processed with acceptable performance using this method.
Smart XML Profiling – A new XML/JSON profiler has been added to MarcEdit that removes the need to know XSLT, XQuery or any other Xlanguage. The tool uses an internal markup language that you create through a GUI based mapper that looks and functions like the Delimited Text Translator. The tool was designed to lower barriers and make data transformations more accessible to users.
Speaking of accessibility, I spent over 3 months researching fonts, sizes, and color options – leading to the development of a new UI engine. This enabled the creation of themes (and a theme creator), identification of free fonts (and a way to download them directly and embed fonts for use directly in MarcEdit without the need for administrator rights), and a wide range of other accessibility and keyboard options.
New versions – MarcEdit is now available as 4 downloads: two that require administrative access and two that can be installed by anyone. This should greatly simplify management of the application.
Tasks have been super charged. Tasks that in MarcEdit 6.x could take close to 8 hours now can process in under 10-20 minutes. New task functions have been added, tasks have been extended, and more functions can be added to tasks.
Linked data tools have been expanded. From the new SPARQL tools, to the updated linked data platform, the resource has been updated to support better and faster linked data work. Coming in the near future will be direct support for HDT and linked data fragments.
A new installation wizard was implemented to make installation fun and easier. Users follow Hazel, the setup agent, as she guides them through the setup process.
Languages – MarcEdit’s interface has been translated into 26+ languages
.NET Language update – this seems like a small thing, but it enabled many of the design changes
MarcEdit 7 *no* longer supports Windows XP
Consolidated and improved Z39.50/SRU Client
Enhanced COM support, with legacy COM namespaces preserved for backward compatibility
Improved Error Handling and expanded knowledge-base
The new Search box feature to help users find help
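For readers curious what Levenshtein-based clustering looks like in practice, here is a minimal Python sketch tied to the Native Clustering highlight above. It is not MarcEdit’s implementation and it leaves out the Composite Coefficient matching entirely. The greedy grouping strategy, the max_distance threshold of 2, and the function names are all assumptions chosen to keep the example short.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def cluster(values, max_distance=2):
    """Greedy grouping: add each value to the first cluster whose first member is close enough."""
    clusters = []
    for value in values:
        for group in clusters:
            if levenshtein(value.lower(), group[0].lower()) <= max_distance:
                group.append(value)
                break
        else:
            clusters.append([value])
    return clusters


headings = ["Twain, Mark", "Twain, Marc", "Twain Mark", "Austen, Jane", "Austin, Jane"]
for group in cluster(headings):
    print(group)
```

Comparing every value against every cluster is fine for a toy list like this one, but a real tool working at the record counts mentioned above would need something smarter than this pairwise loop.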
With these new updates, I’ve updated the MarcEdit Website and am in the process of bringing new documentation online. Presently, the biggest changes to the website can be seen on the downloads page. Rather than offering users four downloads, the webpage provides a guided user experience. Go to the downloads page, and you will find:
When the user clicks on a link (for example, to download the 64-bit version), the following modal window is presented:
Hopefully this will help users, because I think that for the lion’s share of MarcEdit’s user community, the non-Administrator download is the version that most users should use. This version simplifies program management, sandboxes the application, and can be managed by any user. But the goal of this new downloads page is to make the process of selecting your version of MarcEdit easier to understand and empower users to make the best decision for their needs.
Additionally, as part of the update process, I needed to update the MarcEdit MSI Cleaner. This file was updated to support MarcEdit 7’s GUID keys created on installation. And finally, the program was developed so that it could be installed and used side by side with MarcEdit 6.x. The hope is that users will be able to move to MarcEdit 7 as their schedules allow, while still keeping MarcEdit 6.x until they are comfortable with the process and able to uninstall the application.
Lastly, this update is seeing the largest single creation of new documentation in the application’s history. This will start showing up throughout the week as I continue to wrap up documentation and add new information about the program. This update has been a long time coming, and I will be posting a number of tidbits throughout the week as I complete updating the documentation. My hope is that the wait will have been worth it, and that users will find the new version, its new features, and the improved performance useful within their workflows.
The MarcEdit MSI cleaner was created to help fix problems that would occasionally happen when using the Windows Installer. Sometimes, problems happen, and when they do, it becomes impossible to install or update MarcEdit. MarcEdit 7, from a programming perspective, is much easier to manage (I’ve removed all data from the GAC (global assembly cache) and limited data outside of the user data space), but things could occur that might cause the program to be unable to be updated. When that happens, this tool can be used to remove the registry keys that are preventing the program from updating/reinstalling.
In working on the update for this tool, there were a couple significant changes made:
I removed the requirement that you had to be an administrator in order to run the tool. You will need to be an administrator to make changes, but I’ve enabled the tool so users can now run the application to see if the cleaner would likely solve their problem.
Updated UI – I updated the UI so that you will know that this tool has been changed to support MarcEdit 7.
I’ve signed the application…it has now been signed with a security certificate and now is identified as a trusted program.
As of 12 am, Nov. 27 – I’ve staged all the content for MarcEdit 7. Technically, if you download the current build from the Release Candidate page – you’ll get the new code. However – there’s a couple things I need to test and finish prepping, so I’m just staging content tonight. Things left to test:
Automated update – I need to make sure that the update mechanism has switched gracefully to the new code-base, and I can’t test that without staging an update. So, that’s what I’m going to do tomorrow. Currently, I have the code running; tomorrow, I’ll update the build number and stage an update for testing purposes.
I need to update the Cleaner program – while MarcEdit is easier to clean when installed only in the user space, the problem is that if an update becomes corrupted, you still have to remove a registry key. Those keys are hard to find, and the cleaner just needs to be updated to automatically find them and remove them when necessary.
I want to update the delivery mechanism on the website. With MarcEdit 7, there are 4 Windows installers – 2 that install without administrative permissions, 2 that do. I’d recommend that users install the versions that do not require administrative permissions, but there may be times when the other version is more appropriate (like if you have more than one user signing in on a machine). I’m working on a mechanism that will enable users to select 32 or 64 bit, and then get the 2 appropriate download links, with information related to which version would be recommended and the use cases each version is designed for.
A new milestone was reached this past weekend, in that the MarcEdit 7 release candidate was posted. Over this next week, I’ll be working on tests, prepping final installation packages, writing documentation, and getting a package together for Linux installation. As I noted, the Mac version of MarcEdit will come later, as there are a number of UI changes that will need to be accommodated due to some differences with the new MacOS install. My guess at this point is that I should complete most of the Mac work by Christmas.
Keep an eye out for more information on the final release. At this point, it should happen on Nov. 26th.