MarcEdit 7 Update: Regular Expression Store, Thread Pooling, and Task Manager Updates

I’ve posted a new MarcEdit 7 update. This includes the following changes:

  • Verify URLs: There is now an option to set the number of threads used. This is the first time this tool uses a thread pool to speed up its queries. I wouldn’t recommend more than 10 threads (3-5 is a good number), as your checks could start to look like a denial-of-service attack to the sites you are checking.
  • Verify URLs: I’ve updated the stylesheet that generates the results. The report is easier to read, and the record numbers can now be copied, so you could put them in a file and select those records through the Extract Selected Records tool.
  • Build Links: I’m still experimenting with this, but I’m making it available because it works. This tool now uses threads to build links. I’m generally seeing a 15-20% improvement in the time it takes to process files. Not a big change, but I’m happy for it. I have some ideas I’m working on to improve speed further; they might show up in tonight’s release.
  • Regular Expression Store: You can now add new metadata to stored expressions and search across multiple metadata fields.
  • Replace Function: When using external file criteria, the tool had trouble if your files started with a byte order mark (BOM). I’ve added code to filter these out.
  • Task Management: I’ve added an option in the Task Manager that lets a task override the broker’s assessment and run using the older task method. Generally, the broker’s assessment results in significant speed gains, but when a task will touch every record, the older method may be faster. This gives users control to use either.
  • Component updates: I’ve updated core components for the linked data tooling, the Saxon XSLT processor, and the JSON processing tools.
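
To make the BOM issue in the Replace Function item a bit more concrete, here’s a minimal sketch of the general idea. It’s written in Python for illustration and is not MarcEdit’s actual code; `strip_bom` is a hypothetical helper, and the list of marks it checks is my assumption about which encodings matter.

```python
import codecs

# Byte order marks that commonly appear at the start of text files.
# UTF-32 marks are listed before UTF-16 because the UTF-32 LE mark
# begins with the same two bytes as the UTF-16 LE mark.
BOMS = (
    codecs.BOM_UTF8,
    codecs.BOM_UTF32_LE,
    codecs.BOM_UTF32_BE,
    codecs.BOM_UTF16_LE,
    codecs.BOM_UTF16_BE,
)


def strip_bom(data: bytes) -> bytes:
    """Return data without a leading byte order mark, if one is present."""
    for bom in BOMS:
        if data.startswith(bom):
            return data[len(bom):]
    return data
```

Without a step like this, the first search term read from a criteria file carries invisible leading bytes and never matches anything in the record data.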

 

Regular Expression Store changes:

 

One of the new features added to MarcEdit 7 is the Regular Expression Store, and I’ve significantly enhanced it in this update. Users can now attach much richer metadata to their stored expressions, and can search for expressions by metadata or by the expression itself. This update also includes all the client-side work necessary to enable public sharing of expressions. I’m currently putting together the server-side components to make this a reality. I’ll start by putting my library of expressions into the public share. Hopefully, folks will find these useful.
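
As a rough illustration of what searching by metadata or expression means, here’s a small Python sketch. It is not how MarcEdit stores expressions; the `StoredExpression` class and its field names (`name`, `description`, `tags`) are hypothetical, picked only to show a search that looks across several fields at once.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StoredExpression:
    # All field names here are hypothetical, chosen for illustration.
    name: str
    pattern: str
    description: str = ""
    tags: List[str] = field(default_factory=list)


def search(store, term):
    """Return entries whose metadata or pattern contains term (case-insensitive)."""
    needle = term.lower()
    hits = []
    for entry in store:
        # Look in every metadata field and in the expression itself.
        haystacks = [entry.name, entry.description, entry.pattern, *entry.tags]
        if any(needle in text.lower() for text in haystacks):
            hits.append(entry)
    return hits
```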

New Expression window:

Notice the Actions button. This is a drop-down button that provides access to options for creating a new expression or saving an expression.

Thread Pooling

One of the new enhancements in MarcEdit 7 is the introduction of a thread pool. This has been implemented in the Verify URLs tool and the Build Links tool. The Verify URLs tool provides an interface for users who want to customize the number of threads used to check URLs:

 

The default is set to 3, but users can make this as small or large as they like. I would caution, however, against setting this value above 10, as the traffic will start to look like a denial-of-service attack if you are querying the same domain over and over.
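
For readers who want to see the shape of the idea, here’s a minimal Python sketch of checking URLs with a bounded thread pool. This is an illustration under my own assumptions, not MarcEdit’s implementation: `check_url` and `verify_urls` are hypothetical names, the HEAD request and timeout are choices made for the example, and only the default of 3 threads comes from the tool itself.

```python
from concurrent.futures import ThreadPoolExecutor
import urllib.error
import urllib.request


def check_url(url, timeout=10):
    """Return the HTTP status for url, or an error string if the request fails."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except (urllib.error.URLError, OSError, ValueError) as err:
        return f"error: {err}"


def verify_urls(urls, max_threads=3, checker=check_url):
    """Check every URL using at most max_threads worker threads."""
    urls = list(urls)
    max_threads = max(1, max_threads)
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # pool.map returns results in the same order as the input URLs.
        return dict(zip(urls, pool.map(checker, urls)))
```

The pool size is the knob that matters here: a larger `max_threads` finishes sooner but sends more simultaneous requests, which is exactly what starts to look abusive when most of the URLs point at the same host.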

Task Management Changes:

The last change I want to highlight is in the Task Manager. Occasionally, folks will provide me with files and tasks that the task broker has difficulty profiling. In these cases, the new profiled process may be slightly slower than the older record-by-record approach. To give users more control over the process, I’ve added the ability to override the task broker’s recommendations and push data through the legacy task processing method. You’ll see this option on the task editor window.

 

Please note that if a task list is embedded in another task list, the tool will respect the override request of any of the combined tasks and use that option to process all items in the list.
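
Here’s one way to picture that rule, as a short Python sketch. It reflects my reading of the behavior described above (an override anywhere in the combined list applies to everything) and is not MarcEdit’s code; the `TaskList` class, its `use_legacy` flag, and `wants_legacy` are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskList:
    # Hypothetical structure: a task list that may embed other task lists.
    name: str
    use_legacy: bool = False
    embedded: List["TaskList"] = field(default_factory=list)


def wants_legacy(task_list):
    """Return True if this list, or any list embedded in it, requests the override."""
    if task_list.use_legacy:
        return True
    return any(wants_legacy(child) for child in task_list.embedded)
```

So if a cleanup list with the override set is embedded in a larger nightly list that does not set it, `wants_legacy` on the nightly list is still true and every item would go through the legacy method.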

The update is available at the MarcEdit website: http://marcedit.reeset.net/downloads or via the automatic update mechanism. If you have questions, let me know.

–tr

