reeset

Sep 27, 2016
 

Please note, MarcEdit’s automated update tool will notify you of the update, but you will need to manually download the file from: http://marcedit.reeset.net/downloads.  My web host, Bluehost, has made a change to their server configuration that makes no sense to me and that ultimately dumps traffic sent from non-web browsers (connections without a user-agent).  Right now, users will get this message when they attempt to download using the automatic update:

update_bluehost_error

I can accommodate the requirements that they have set up now, but it will mean that users will need to do manual downloads for the current update posted 9/27/2016 and the subsequent update, which I’ll try to get out tonight or tomorrow.

I apologize for the inconvenience, but after spending 8 hours yesterday and today wrangling with them and trying to explain what this breaks (because I have some personal tools that this change affects), I’m just not getting anywhere.  Maybe something will magically change, maybe not, but for now I’ll be rewriting the update process to try to protect against these kinds of unannounced changes in the future.

So again, you’ll want to download MarcEdit from http://marcedit.reeset.net/downloads since the automatic update download connection is currently being dumped by my web host.
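For anyone scripting their own downloads from the site, the practical upshot is that the request needs to carry a User-Agent header. Here is a minimal Python sketch using the standard library's urllib. The agent string and the helper names are made up for illustration, and this is not the code MarcEdit's updater uses.

```python
import urllib.request

DOWNLOADS_URL = "http://marcedit.reeset.net/downloads"


def build_request(url, user_agent="example-updater/1.0"):
    """Return a Request that identifies itself with a User-Agent header.

    The agent string is a placeholder (an assumption); use one that
    describes your own tool.
    """
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url):
    """Download the URL and return the raw bytes (performs network I/O)."""
    with urllib.request.urlopen(build_request(url), timeout=30) as response:
        return response.read()


if __name__ == "__main__":
    request = build_request(DOWNLOADS_URL)
    # urllib stores header names capitalized as "User-agent".
    print(request.get_header("User-agent"))
```

Setting the header explicitly, rather than relying on whatever a library sends by default, keeps the client's identity under your control if the host changes its rules again.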

Posted at 1:45 pm
Sep 27, 2016
 

I’ve posted a new set of updates.  The initial set is for Windows and Linux.  I’ll be posting Mac updates later this week.  Here’s the list of changes:

  • Behavior Change — Windows/Linux: Intellisense turned off by default (this is the box that shows up when you start to type a diacritic) for new installs. As more folks use UTF8, this option makes less sense. Will likely make plans to remove it within the next year.
  • Enhancement: Select Extracted Records: UI Updates to the import process.
  • Enhancement: Select Extracted Records: Updates to the batch file query.
  • Behavior Change: Z39.50 Client: Added an override that lets the Z39.50 client exceed its search limits. Beware, overriding these limits is potentially problematic.
  • Update: Linked Data Rules File: Rules file updated to add databases for the Japanese Diet library, 880 field processing, and the German National Library.
  • Enhancement: Task Manager: Added a new macro/delimiter. {current_file} will print the current filename if set.
  • Bug Fix: RDA Helper – Abbreviation expansion is failing to process specific fields when config file is changed.
  • Bug Fix: MSXML Engine – In an effort to allow the xsl:strip-whitespace element, I broke this process. The workaround has been to use the Saxon.NET engine. However, I’ll correct this. Information on how you emulate the xsl:strip-whitespace element will be here: http://marcedit.reeset.net/xslt-processing-xslstrip-whitespace-issues
  • Bug Fix: Task Manager Editing – when adding the RDA Helper to a new task, it asks for file paths. This was due to some enhanced validation around files. This didn’t impact any existing tasks.
  • Bug Fix: UI changes – I’m setting default sizes for a number of forms for usability
  • Bug Fix/Enhancement: Open Refine Import – OpenRefine’s release candidate changes the tab delimited output slightly. I’ve added some code to accommodate the changes.
  • Enhancement: MarcEdit Linked Data Platform – adding enhancements to make it easier to add collections and update the rules file
  • Enhancement: MarcEdit Linked Data Platform – updating the rules file to include a number of new endpoints
  • Enhancement: MarcEdit Linked Data Platform – adding new functionality to the rules file to support the recoding of the rules file for UNIMARC.
  • Enhancement: Edit Shortcut – Adding a new edit shortcut to find fields missing words
  • Enhancement: XML Platform – making it clearer that you can use either XQuery or XSLT for transformations into MARCXML
  • Enhancement: OAI Harvester – code underneath to update user agent and accommodate content-type requirements on some servers.
  • Enhancement: OCLC API Integration – added code to integrate with the validation. Not sure this makes its way into the interface yet, but code will be there.
  • Enhancement: Saxon.NET version bump
  • Enhancement: SPARQL Explorer – Updating the sparql engine to give me more access to low level data manipulation
  • Enhancement: Autosave option when working in the MarcEditor. Saves every 5 minutes to protect against losing data in a crash.

Downloads are available from the downloads page (http://marcedit.reeset.net/downloads).

–tr

Posted at 12:10 am
Sep 10, 2016
 

Three significant changes will be coming as part of MarcEdit’s Sunday update.  These impact the MARCEngine, Regular Expression Processing, and the Linked Data Platform.

MARCEngine Changes:

In May 2016, I had to make some changes to the MARCEngine to remove some of the convenience functions that allowed mnemonic processing to occur on UTF8 data or HTML entities on MARC8 data.  Neither of these should happen, and there were issues that came up related to titles that actually included HTML entities in the titles.  So, this kind of processing had to be removed.

Over the past few months, I’ve been re-evaluating how these functions used to work, and have been bringing them back.  Sunday will mark the reintroduction of many of these functions (though, not the HTML entity translation when it’s not specifically appropriate).  The upside is that, coupled with the new encoding work, the tool will be able to support more mixed encoding use-cases.

Regular Expression Processing

The most significant change to the Regular Expression engine is how multi-line processing works in the Replace Function.  To protect users, I’d set up the match-any-character “.” to match all characters except a newline.  This was done to keep users from accidentally deleting data.  However, this meant that when using the multi-line processing option, it really only worked when fields were side by side.  By removing this limitation, users working with the multi-line option will now have full access to the entire record with the Regular Expression processing.  And with that, a word of warning…be careful.  The multi-line processing is the easiest way to accidentally delete bibliographic data through greedy matches.

Additionally, I’ve added an option to the Replace Function dialog that makes it easier to know that MarcEdit has this option.  Right now, users have to know that you need to add a /m to the end of your expression to initiate the multi-line mode.  You can still do that – but for users that don’t know that this is the case, a new option has been added to turn on the Multiline option (see below).

image

The MultiLine Evaluation will be enabled and selectable when the Use regular expressions option is checked.
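To make the greedy-match warning concrete, here is a small sketch in Python rather than in MarcEdit's own regular expression engine. Python's re.DOTALL flag stands in for the "." matching across line breaks, the record is made up, and the exact syntax MarcEdit expects may differ.

```python
import re

# A made-up record in mnemonic format; the field contents are illustrative only.
RECORD = (
    "=245  10$aFirst title /$cSmith.\n"
    "=500  \\\\$aA general note.\n"
    "=650  \\0$aCats.\n"
    "=650  \\0$aDogs.\n"
)


def remove_span(pattern, text, flags=0):
    """Delete everything the pattern matches and return what is left."""
    return re.sub(pattern, "", text, flags=flags)


# "." stops at the end of the line, so the 500 line never reaches a 650: no match.
no_dotall = remove_span(r"=500.*=650[^\n]*\n", RECORD)

# "." crosses line breaks and ".*" is greedy: it runs to the LAST 650 field.
greedy = remove_span(r"=500.*=650[^\n]*\n", RECORD, re.DOTALL)

# ".*?" is lazy: it stops at the FIRST 650 field.
lazy = remove_span(r"=500.*?=650[^\n]*\n", RECORD, re.DOTALL)

if __name__ == "__main__":
    for label, result in (("no dotall", no_dotall), ("greedy", greedy), ("lazy", lazy)):
        print(f"--- {label} ---")
        print(result, end="")
```

The greedy version removes the 500 field and both 650 fields, while the lazy version leaves the second 650 in place. When in doubt, reach for the lazy quantifier and test against a copy of the file.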

Linked Data Platform Changes

The Linked Data Platform will be seeing two significant changes.  The first change is occurring in the code to support linking fields like the 880.  This has meant adding a new special processing instruction “linking”, which can now be used to perform reconciliation against data in these fields.  This is particularly important for Asian languages, Arabic languages, Hebrew, etc.

The second change is in the rules file itself.  I’ve profiled the 880 field, as well as a wide range of other collections.

Finally, I’ve added a note to the Main Window that helps users find their rules file for edit, as well as points to the knowledge-base articles and videos explaining the process. (see below)

image

 

Other Changes:

Other changes that will be made in this update:

  1. ISSN Report – tweaked the process due to a bug.
  2. MARCCompare – added a new output type; in addition to the HTML output, there will be a diff file output
  3. Preferences Window/Language – Added a help icon that points to the knowledge-base articles related to fonts and font recommendations.

These changes will be part of the Sunday update, and at this point, look like they will be applicable to all versions of MarcEdit (though, I do have a few UI tweaks that I need to complete on the mac side).

If you have questions, let me know.  Otherwise, these changes will be made available on 9/12/2016 [evening] (originally planned for 9/11/2016).

–tr

Posted at 9:33 pm
Sep 6, 2016
 

Happy belated Labor Day (to those in the US) and happy Tuesday to everyone else.  On my off day Monday, I did some walking (with the new puppy), a little bit of yard work, and some coding.  Over the past week, I’ve been putting the plumbing into the Mac version of MarcEdit to allow the tool to support the OCLC integrations — and today, I’m putting out the first version.

Couple of notes — the implementation is a little quirky at this point.  The process works much like the Windows/Linux version — once you get your API information from OCLC, you enter the info and validate it in the MarcEdit preferences.  Once validated, the OCLC options will become enabled and you can use the tools.  At this point, the integration work is limited to the OCLC records downloader and holdings processing tools.  Screenshots are below.

Now, the quirks – when you validate, the Get Codes button (to get the location codes), isn’t working for some reason.  I’ll correct this in the next couple days.  The second quirk — after you validate your keys, you’ll need to close MarcEdit and open it again to enable the OCLC menu options.  The menu validation isn’t refreshing after close — again, I’ll fix this in the next couple of days.  However, I wanted to get this out to folks.

Longer term: right now the direct integration between WorldCat and the MarcEditor isn’t implemented.  I have most of the code written, but it isn’t complete.  Again, over this week, I hope to have time to get this done.

You can get the Mac update from: http://marcedit.reeset.net/downloads

–tr

Screenshots:

Screen Shot 2016-09-06 at 8.34.54 AM

 

Screen Shot 2016-09-06 at 8.39.44 AM

Posted at 6:04 am
Sep 1, 2016
 

Posted Sept. 1, this update resolves a couple issues.  Particularly:

Windows/Linux:

* Bug Fix: Custom Field Sorting: Fields without the sort field may drop the LDR.  This has been corrected.
* Bug Fix: OCLC Integration: regression introduced with the engine changes when dealing with diacritics.  This has been corrected.
* Bug Fix: MSI Installer: AUTOUPDATE switch wasn’t being respected.  This has been corrected.
* Enhancement: MARCEngine: Tweaked the transformation code to provide better support for older processing statements.

Mac:

* Bug Fix: Custom Field Sorting: Fields without the sort field may drop the LDR.  This has been corrected.
* Bug Fix: MARCEngine: Regression introduced with the last update that caused one of the streaming functions to lose encoding information.  This has been corrected.
* Enhancement: MARCEngine: Tweaked the transformation code to provide better support for older processing statements.

Special Notes:

I’ll be adding a knowledge-base article, but I updated the Windows MSI to fix the admin command-line switch that lets administrators turn off the auto-update feature.  Here’s an example of how this works:
>>MarcEdit_Setup64.msi /qn AUTOUPDATE=no

I don’t believe the AUTOUPDATE key is case sensitive – but the documented use pattern is upper-case and what I’ll test against going forward.

Downloads are available via the downloads page: http://marcedit.reeset.net/downloads/

–tr

Aug 28, 2016
 

One of the gaps in the Mac version of MarcEdit has been the lack of a console mode.  This update should correct that.  However, a couple things about how this works…

1) Mac applications are bundles, so in order to run the console program you need to run against the application bundle.  What does this look like?   From the terminal, one would run
>>/Applications/MarcEdit.app/Contents/MacOS/MarcEdit -console

The -console flag initializes the terminal application and prompts for file names.  You can pass the filenames (these must be fully qualified paths at this point) via command-line arguments rather than running in an interactive mode.  For example:
>>/Applications/MarcEdit.app/Contents/MacOS/MarcEdit -s /users/me/Desktop/source.mrc -d /users/me/Desktop/output.mrk -break

The above would break a MARC file into the mnemonic format.  For a full list of console commands, enter:
>>/Applications/MarcEdit.app/Contents/MacOS/MarcEdit -help

In the future, the MarcEdit install program will be setting an environment variable ($MARCEDIT_PATH) on installation.  At this point, I recommend opening your .bash_profile and adding the following line:
export MARCEDIT_PATH=/Applications/MarcEdit.app/Contents/MacOS
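Once MARCEDIT_PATH is set, a script can find the console program without hard-coding the bundle location. The following Python sketch builds the break command from the example above. The function names are hypothetical, the flags are written with a plain ASCII hyphen (an assumption about how they should be typed), and the fallback path is simply the default install location mentioned in this post.

```python
import os
import subprocess

# Default bundle location from the post; used when MARCEDIT_PATH is not set.
DEFAULT_BUNDLE_DIR = "/Applications/MarcEdit.app/Contents/MacOS"


def marcedit_binary(env=None):
    """Locate the MarcEdit executable inside the application bundle."""
    env = os.environ if env is None else env
    bundle_dir = env.get("MARCEDIT_PATH", DEFAULT_BUNDLE_DIR)
    return os.path.join(bundle_dir, "MarcEdit")


def break_command(source, destination, env=None):
    """Build the argument list for breaking a MARC file into mnemonic format."""
    for path in (source, destination):
        # The post notes that file names must be fully qualified for now.
        if not os.path.isabs(path):
            raise ValueError(f"fully qualified path required: {path}")
    return [marcedit_binary(env), "-s", source, "-d", destination, "-break"]


def run_break(source, destination):
    """Run the break; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(break_command(source, destination), check=True)


if __name__ == "__main__":
    command = break_command("/users/me/Desktop/source.mrc",
                            "/users/me/Desktop/output.mrk")
    print(" ".join(command))
```

Passing the arguments as a list means paths containing spaces need no extra quoting.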

You can get this download from: http://marcedit.reeset.net/downloads 

–tr

Aug 8, 2016
 

I’ve been putting together some information to help a few non-catalogers understand what MarcEdit is and how much it gets used around the world.  A couple of fun stats for 2015, based only on usage information from users that make use of the automated update tool.

2015 Usage Information:

  • ~3 million unique program executions (times the program checked for an update [was started])
  • 192 unique countries/political regions
  • ~18,000 unique/active users
  • around 105 print/online citations in 2015 related to MarcEdit (combination of Google Scholar, WOS data – though, this wasn’t rigorously scrubbed for dups)

Where is MarcEdit used?

image

Each of these points (save the couple in the US where Google got a little confused with Georgia, Jersey, etc.) represents an individual country with users working with MarcEdit.  In 2015, the US represented ~55 percent of the unique usage.  That means that ~45 percent of the MarcEdit community is outside of the United States.

–tr

Posted at 11:51 am
Jul 11, 2016
 

The past 3 weeks, I’ve been doing a lot of work on MarcEdit.  These initial changes impact just the Windows and Linux versions of MarcEdit.  I’ll be taking some time tomorrow and Wed. to update the Mac version.  The current changes are as follows:

6.2.190
* Enhancement: Language files have been updated
* Enhancement: Command-line tool: -task option added to support tasks being run via the command-line.
* Enhancement: Command-line tool: -clean and -validate options updated to support structure validation.
* Enhancement: Alma integration: Updated version numbers and cleaned up some windowing in the initial release.
* Enhancement: Small update to the validation rules file.
* Enhancement: Update to the linked data rules file around music headings processing.
* Enhancement: Linked Data Platform: collections information has been moved into the configuration file.  This will allow local indexes to be added so long as they support a json return.
* Enhancement: Merge Records: 001 matching now looks at the 035 and includes OCLC numbers by default.
* Enhancement: MARCEngine: Updated the engine to accommodate invalid data in the LDR.
* Enhancement: MARC SQL Explorer: added an option to allow MySQL databases to be created as UTF8.
* Enhancement: Handful of odd UI changes.
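On the Merge Records item above: OCLC numbers tend to show up in several spellings, such as "ocm00012345" in an 001 and "(OCoLC)12345" in an 035, so matching across the two fields generally means normalizing first. The Python sketch below shows one way to do that. It is an illustration of the idea only, not MarcEdit's actual matching logic, and the function names are hypothetical.

```python
import re

# Optional "(OCoLC)" qualifier, optional ocm/ocn/on prefix, optional leading
# zeros, then the digits. Note that a bare number with no qualifier is
# accepted too, which may be too permissive for 035 data.
_OCLC_PATTERN = re.compile(
    r"^\s*(?:\(OCoLC\))?\s*(?:ocm|ocn|on)?0*(\d+)\s*$",
    re.IGNORECASE,
)


def normalize_oclc(value):
    """Return the bare OCLC number as a string, or None if value isn't one."""
    match = _OCLC_PATTERN.match(value)
    return match.group(1) if match else None


def same_record(control_001, field_035_values):
    """True if the 001 value matches any OCLC number found in the 035 values."""
    key = normalize_oclc(control_001)
    if key is None:
        return False
    return any(normalize_oclc(value) == key for value in field_035_values)


if __name__ == "__main__":
    print(same_record("ocm00012345", ["(DLC)  2016012345", "(OCoLC)12345"]))
```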

You can get the update from the downloads page (http://marcedit.reeset.net/downloads) or via the automated update tools.

–tr

Posted at 8:39 pm

MarcEdit: Running Additional Validation processes from the command-line

Jul 11, 2016
 

MarcEdit’s command-line function has always had the ability to run validation tasks against the MarcEdit rules file.  However, the program hasn’t included access to the cleaning functions of the validator.  As of the last update, this has changed.  If the -validate command is invoked without a rules file defined, the program will validate the structure of the data.  If the -clean option is passed, the program will remove invalid structural data from the file.

Here’s an example of the command:

>> cmarcedit.exe -s "C:\Users\rees\Desktop\CLA_UCB 2016\Data File\sample data\bad_sample_records.mrc" -validate

–tr

Posted at 8:19 pm

MarcEdit: Running Tasks via the command-line

Jul 11, 2016
 

MarcEdit’s task list functionality has made doing repetitive tasks in MarcEdit a fairly simple process.  But one limitation has always been that the tasks must be run from within MarcEdit.  Well, that limitation has been lifted.  As of the last update, a new option has been added to the command-line tool: -task.  When run with a path to a task to run, MarcEdit will perform the task from the command-line.

Here’s an example of a command:

cmarcedit.exe -s "C:\Users\rees\Desktop\withcallnumbers.mrk" -d "C:\Users\rees\Desktop\remote_task.mrk" -task "C:\Users\rees\AppData\Roaming\marcedit\macros\tasksfile-2016_06_17_190213223.txt"

This functionality is only available in the Windows and Linux version of the application.
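Running tasks from the command-line also makes it possible to apply the same task to a whole folder of files. Here is a rough Python sketch of that idea. The folder layout and function names are assumptions, and only the -s, -d and -task options shown in the example above are used.

```python
import subprocess
from pathlib import Path


def task_commands(source_dir, output_dir, task_file, exe="cmarcedit.exe"):
    """Yield one cmarcedit argument list per .mrk file in source_dir."""
    for source in sorted(Path(source_dir).glob("*.mrk")):
        # Write each result under the same file name in the output folder.
        destination = Path(output_dir) / source.name
        yield [exe, "-s", str(source), "-d", str(destination),
               "-task", str(task_file)]


def run_all(source_dir, output_dir, task_file):
    """Run the task against every file, stopping at the first failure."""
    for command in task_commands(source_dir, output_dir, task_file):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the commands for the current folder instead of running them.
    for command in task_commands(".", "processed", "task.txt"):
        print(command)
```

Generating the commands separately from running them makes it easy to review what will happen before anything is touched.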

–tr

Posted at 8:15 pm