MarcEdit Updates

By reeset / On / In MarcEdit

I’ve been working on a few updates over the past couple of weeks, the most time-consuming of which were the Alma-related ones.  Here’s the full list:

Windows/Linux:
SourceURL:http://marcedit.reeset.net/software/update.txt

6.2.501
* Enhancement: OpenRefine Import/Export Formatting Updates [tested on 2.7 rc 1 & 2]
* Enhancement: MarcEditor -- Right-to-Left and Left-to-Right language improvements when reading in the Editor [see: http://blog.reeset.net/?p=2103 for more info]


6.2.500
* Enhancement: ILS Integrations (Alma): Alma Holdings Editing is now available.
* Enhancement: Validator Updates - added some code to tighten the duplication reporting and improved responsiveness when running [i.e., statuses, etc.]
* Enhancement/Bug Fix: MarcEngine Updates: Specifically, in the XML space, I was running into some Chinese characters that were not being recognized as UTF8.  This can happen for characters that sit on the fringes of a character set or that have alternative forms.  I made some edits to ensure that any character checking falls through to a secondary process.
* Enhancement: Task Management -- I cleaned up a few odds and ends to make managing tasks a bit easier.  This includes exposing error messages when they pop up, enabling multiple task deletion, etc.
* Enhancement: Task Management Preferences -- This came up where MarcEdit was having trouble identifying a file path as valid.  MarcEdit now checks these paths as soon as they are selected and reports right away whether they are valid.  This way, if there is a problem, you know immediately.

MacOS:
SourceURL:http://marcedit.reeset.net/software/mac.txt

2.3.5
* Enhancement: ILS Integrations (Alma): Alma Holdings Editing is now available.
* Enhancement: Validator Updates - added some code to tighten the duplication reporting and improved responsiveness when running [i.e., statuses, etc.]
* Enhancement/Bug Fix: MarcEngine Updates: Specifically, in the XML space, I was running into some Chinese characters that were not being recognized as UTF8.  This can happen for characters that sit on the fringes of a character set or that have alternative forms.  I made some edits to ensure that any character checking falls through to a secondary process.
* Enhancement: Task Management -- I cleaned up a few odds and ends to make managing tasks a bit easier.  This includes exposing error messages when they pop up, enabling multiple task deletion, etc.
* Enhancement: Task Management Preferences -- This came up where MarcEdit was having trouble identifying a file path as valid.  MarcEdit now checks these paths as soon as they are selected and reports right away whether they are valid.  This way, if there is a problem, you know immediately.
* Enhancement: OpenRefine Import/Export Formatting Updates [tested on 2.7 rc 1 & 2]
* Enhancement: MarcEditor -- Right-to-Left and Left-to-Right language improvements when reading in the Editor [see: http://blog.reeset.net/?p=2103 for more info]

I believe that a couple of additional updates will be needed around the Alma work, particularly when creating new holdings, as I’m not seeing the 001/004 pairs added to the records, but this is the start of that work.  Also, I did some work around right-to-left and left-to-right rendering when working with mixed languages.  See: http://blog.reeset.net/archives/2103

Questions?  Let me know.

–tr

 

MarcEditor Changes and Right-to-left displays

By reeset / On / In character encodings, MarcEdit

So, this was a tough one.  MarcEdit has a right-to-left data entry mode that was created primarily for users who create bibliographic records mostly in a right-to-left language.  But what happens when a record mixes data from left-to-right languages like English and right-to-left languages like Hebrew?  Well, in the display, odd things happen.  This is because of what the operating system does when rendering the data.  The operating system assumes certain data belongs to the right-to-left string, and then moves data in the way that it thinks it should render.  Here’s an example:

In this example, the $a and $0 are displayed side by side, but this is just a display issue.  Underneath, the data is correct.  If you compiled this data or loaded it into an ILS, the data would parse correctly (though how it displayed would depend on the ILS’s support for the language).  This is confusing, but unfortunately it is one of the challenges of working with records in a notepad-like environment.

Now, there is a way to solve the display problem.  There are two Unicode characters, 0x200E and 0x200F: the left-to-right mark and the right-to-left mark.  These can be embedded in the display to render characters more appropriately.  They only show up in the display (i.e., they are added when reading into the display) and are not preserved in the MARC record.  They help to alleviate some of these problems.
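To illustrate the "not preserved in the MARC record" part, here is a minimal Python sketch of removing the two marks from editor text before it is saved.  This is not MarcEdit's actual code; the function name `strip_display_marks` is made up for the example, and it assumes that every U+200E/U+200F in the text was added by the display layer.

```python
# Hypothetical sketch, not MarcEdit's code: drop the display-only
# direction marks before the text is written back to a record.
LRM = "\u200e"  # LEFT-TO-RIGHT MARK (0x200E)
RLM = "\u200f"  # RIGHT-TO-LEFT MARK (0x200F)

# Translation table that deletes both marks.
_STRIP_TABLE = str.maketrans("", "", LRM + RLM)


def strip_display_marks(text: str) -> str:
    """Return text with every LRM/RLM character removed."""
    return text.translate(_STRIP_TABLE)
```

One caveat with this approach: if a record legitimately contained one of these marks before it was loaded, a blanket strip like this would remove it as well.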

 

The way that this works: when the program identifies that it’s working with UTF8 data, it screens the text for characters with a byte that indicates they should be rendered RTL.  The program then embeds an RTL marker at the beginning of the string and an LTR marker at the end of the string.  This gives the operating system instructions as to how to render the data, and I believe it helps to solve this issue.
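The steps above can be sketched roughly as follows in Python.  This is only an illustration, not MarcEdit's actual code: the function names are made up, the text is assumed to be already decoded from UTF8, and the RTL check uses Unicode bidirectional classes from the standard `unicodedata` module rather than inspecting bytes.  What counts as "the string" (a whole field, a subfield, or a run of characters) is also an assumption here; the sketch simply wraps whatever string it is given.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK (0x200E)
RLM = "\u200f"  # RIGHT-TO-LEFT MARK (0x200F)

# Bidirectional classes used by right-to-left scripts:
# "R" covers Hebrew letters, "AL" covers Arabic letters.
RTL_CLASSES = {"R", "AL"}


def contains_rtl(text: str) -> bool:
    """Return True if any character belongs to a right-to-left class."""
    return any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in text)


def mark_for_display(text: str) -> str:
    """Wrap text that contains RTL characters as RLM + text + LRM.

    Text without RTL characters is returned unchanged.
    """
    if contains_rtl(text):
        return RLM + text + LRM
    return text
```

For example, `mark_for_display("Title")` returns the string unchanged, while a string containing Hebrew letters comes back with U+200F in front and U+200E at the end.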

–tr