MarcEdit 5.9 update

Dec 24, 2012
 

[Please note: the OSU network appears to be a little wonky right now, so if you have trouble downloading the update, try again later.  Unfortunately, due to the holidays, the people.oregonstate.edu domain will continue to be up and down until after Christmas.]

This MarcEdit update spans a lot of different areas of the program, but largely represents refinements to specific functions.  Also, with the end of the year, I’m taking this opportunity to move the MarcEdit version number from 5.8 to 5.9. 

Updates:

  • Enhancement: Jump to Record in the Find All now takes you to the searched-for string in the record, not just the head of the record.
  • Enhancement: Koha Direct Integration: From search you can now download multiple selected records or all records (sorry for the oversight there)
  • Enhancement: Koha Direct Integration: A prompt now appears on update/create that asks if item data should be parsed.  By default, the Koha API ignores this information, but if you indicate you want to edit item data, it will make the necessary call.  This function only works in Koha 3.8+ (due to API limitations).
  • Enhancement:  Edit Shortcut/Find Missing Field – I added an option that will allow you to choose the level of granularity when reporting fields missing specific data.  In the past, if you asked for records missing, say, a 650$z – it would return records where the 650 field was missing the $z as well as records missing a 650.  A new option will allow you to scope the search so that it includes all records missing data (as it currently does), or limit the scope so that only records with the specified field present, but missing the specified subfield are counted and displayed.
  • Bug Fix:  MarcEditor – the keyboard shortcut between file and font overlapped, so that was corrected.
  • RDA Helper – Enhancement – phonogram imprint marks found in the 260 will now be moved to the 264 when present.
  • RDA Helper: Bug fix – When removing the GMD from a record, some of the ISBD punctuation could be dropped if trailing the $h directly.  This has been corrected.
  • RDA Helper:  Bug Fix – When creating 336 tags for content that should be marked as spoken word, the program was incorrectly printing “spoken work”.  This has been corrected.
  • RDA Helper: Bug Fix – When creating a 344, corrected a misspelling of the word “analog” that occurred under certain conditions.
  • RDA Helper – Bug Fix – Added stereo as an output in the 344$g when 007[4] equals ‘s’.
  • RDA Helper – Bug Fix – When processing 260 fields where the data is in brackets, like [1961, c1960] – some of the punctuation isn’t cleaned up properly.  This is corrected.
  • Bug Fix – When I was simplifying the main screen interface, I had removed access to the tutorials.  I added the link to the tutorials back under the Help menu in the main window.
  • Bug Fix – When generating the tutorials listing, MarcEdit utilizes an XSLT to create the list from a generated XML file.  On newer (after XP) systems, if Internet Explorer was your primary browser, the security settings would often prevent the XSLT transformation from occurring.  I modified the conversion process so this is no longer going to be an issue.
  • Bug Fix:  Add/Delete Field, Add If Present option.  This option was designed so that when selected, the user could conditionally add new fields if specific data was found in a record.  However, an error in the code was preventing this option from functioning as designed.  This has been corrected.
  • Enhancement:  New MarcEdit API – BatchZ3950SearchEx.  Returns results as an array of objects, rather than in a file.  You can read more about it here: http://blog.reeset.net/archives/1139
  • Enhancement:  Exposed the validate functionality via the Command-Line.  See: http://blog.reeset.net/archives/1141
  • Enhancement:  RDA Helper:  Updated the Abbreviation Replacement algorithm to make it a bit more flexible to deal with the myriad of real-world examples.
  • Enhancement:  RDA Helper:  The abbreviations list is configurable, and an edit link has been added to the RDA Helper to facilitate this process.
  • Bug Fix:  Merge Records:  When processing data, the tool reads identifiers and normalizes them to make it easier when working with ISBN data.  However, there are times when this normalization can cause some issues.  I’ve made a few changes to the process that should prevent these kinds of issues from occurring.
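
The two scopes of the Find Missing Field option described above can be pictured with a small Python sketch.  This is not MarcEdit’s code: the record layout (a list of tag/subfield pairs), the function name `is_missing`, and the `require_field` flag are all assumptions made for illustration.  It also assumes a record counts as missing the subfield only when none of its matching fields carries it.

```python
def is_missing(record, tag, code, require_field=False):
    """Return True if the record counts as missing tag$code.

    record: list of (tag, {subfield_code: value}) pairs (hypothetical layout).
    require_field=False: broad scope; a record with no such field at all
        is reported as well.
    require_field=True: narrow scope; only records that contain the field
        but never the subfield are reported.
    """
    fields = [subs for t, subs in record if t == tag]
    if not fields:
        # No field with this tag: reported only under the broad scope.
        return not require_field
    # Assumption: missing means no occurrence of the field has the subfield.
    return all(code not in subs for subs in fields)


no_650 = [("245", {"a": "Title"})]
no_z = [("650", {"a": "Libraries"})]
has_z = [("650", {"a": "Libraries", "z": "Oregon"})]

for name, rec in [("no 650", no_650), ("650 without $z", no_z), ("650 with $z", has_z)]:
    broad = is_missing(rec, "650", "z")
    narrow = is_missing(rec, "650", "z", require_field=True)
    print(name, broad, narrow)
```

Only the record with no 650 at all is treated differently by the two scopes; the other two records get the same answer either way.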

Updates can be downloaded through the automatic updater, or if you need to download the updates directly, from:

–TR

 Posted by at 7:51 pm
Dec 21, 2012
 

One of the tools that MarcEdit has built into it is a light-weight validator.  For a long time, there has been an Easter Egg of sorts that allowed folks to access the validator via the command-line, if they knew the syntax.  It was something that I primarily used for my own purposes and hadn’t expected anyone else to be interested in it.  But due to a request, I’ve formalized the syntax and will expose this functionality in the next update.  In MarcEdit 5.9, you will be able to access the Validator via the command-line using the following syntax:

C:\Users\reeset>%MARCEDIT%\cmarcedit -validate -s c:\users\reeset\desktop\output.mrk -d c:\users\reeset\desktop\report.txt -rules %APPDATA%\marcedit\configs\marcrules.txt

The important elements of the argument are as follows:

  • %MARCEDIT% – this is an environment variable that I’ve configured on my own machine.  It stands in for the path to the cmarcedit program.  I highly recommend creating your own local environment variable if you will be using the command-line tool.
  • -validate:  This is the switch that tells the command-line tool that you will be invoking the validator.
  • -s:  File to validate.  Can be either in MARC or mnemonic formats.
  • -d:  Path to where you want the Validation report to be saved.
  • -rules: Path to the Rules file.  The default rules file is found in the user apps directory.  You can use %APPDATA% to represent that value – this is a Windows environment variable that stands in as an alias for the actual path.

That’s it – internally, the command-line tool will ensure that your file paths exist, and as long as they do, it will try to run the process and save the output to your specified destination.
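
If you want to call the validator from a script rather than typing the command by hand, a wrapper along these lines is one way to do it.  This is a minimal Python sketch, not part of MarcEdit: the helper names `build_validate_command` and `run_validation` are hypothetical, and only the switches described above (-validate, -s, -d, -rules) are used.  The tool’s exit codes aren’t covered here, so the sketch doesn’t try to interpret them.

```python
import os
import subprocess


def build_validate_command(cmarcedit, source, report, rules):
    """Assemble the argument list using only the switches described above."""
    return [cmarcedit, "-validate", "-s", source, "-d", report, "-rules", rules]


def run_validation(cmarcedit, source, report, rules):
    """Check the input files, then invoke the validator (hypothetical helper)."""
    for path in (source, rules):
        if not os.path.exists(path):
            raise FileNotFoundError(path)
    # No assumptions are made about what the return code means.
    return subprocess.run(build_validate_command(cmarcedit, source, report, rules))


if __name__ == "__main__":
    # On Windows, expandvars resolves %MARCEDIT% and %APPDATA%.
    cmd = build_validate_command(
        os.path.expandvars(r"%MARCEDIT%\cmarcedit"),
        r"c:\users\reeset\desktop\output.mrk",
        r"c:\users\reeset\desktop\report.txt",
        os.path.expandvars(r"%APPDATA%\marcedit\configs\marcrules.txt"),
    )
    print(cmd)
```

Passing the arguments as a list means paths containing spaces don’t need to be quoted by hand.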

Questions?  Let me know.

–tr

 Posted by at 5:05 pm
Dec 21, 2012
 

In early December 2012, I added a new function to the MarcEdit API that allowed a single Z39.50 search to provide the results back as an array of objects (rather than just a big string dump).  You can read more about that at: http://blog.reeset.net/archives/1133.  I’ve now added the complementary function for BatchZ3950Search: BatchZ3950SearchEx.  This function accepts an array of search terms, and will return a multi-dimensional array of results.  You can see an example of its use below.

' Configure the Z39.50 connection
Dim lobj_Z3950
Set lobj_Z3950 = CreateObject("MARCEngine5.Query")
With lobj_Z3950
   .Database = "VOYAGER"
   .Host = "z3950.loc.gov"
   .Port = 7090
   .Syntax = "MARC21"
   .Start = 0
   .Limit = 5
End With

' Two search terms: Dim search_string(1) allocates indexes 0 and 1
Dim search_string(1)
search_string(0) = "digital"
search_string(1) = "libraries"

' Run the batch search across all terms
Dim lstring
lstring = lobj_Z3950.BatchZ3950SearchEx(search_string, 4)
Set lobj_Z3950 = Nothing

' lstring(x, 0) holds the search term; lstring(x, 1) holds the array of results for it
MsgBox CStr(UBound(lstring, 1))
For x = 0 To UBound(lstring, 1)
   For y = 0 To UBound(lstring(x, 1))
      MsgBox "Search: " & lstring(x, 0) & vbCrLf & lstring(x, 1)(y)
   Next
Next

 

–tr

 Posted by at 1:32 pm
Dec 05, 2012
 

Every now and again, I get messages on the list noting that when trying to translate Excel files with longer fields or with numerical data – MarcEdit’s extracted data doesn’t always reflect what’s in the file.  It can be confusing, and understanding the why of it all probably isn’t something folks want/need to understand – but I’ll try to explain what is happening here and then what I’ve been working on to try and make this go away.

So why does this sometimes happen?  Well, to understand why it happens, you have to understand a little bit about how MarcEdit views Excel.  MarcEdit doesn’t actually know anything about Excel (or Access, for that matter), but rather interacts with them abstractly using Microsoft’s built-in JET engine.  This is an ODBC layer that essentially allows applications to interact with these file types as databases.  This definitely simplifies the process of working with these data files, but it introduces some quirks as well.  When MarcEdit converses with an Excel file via the ODBC layer, it knows nothing about the Excel file.  As you might or might not be aware, Excel has the ability to type columns or cells.  By default, Excel attempts to type your data based on the contents of that data.  Sometimes it gets it right, sometimes wrong; oftentimes, it just leaves it marked as General.  MarcEdit interacts with this data abstractly through a generic late-bound object, so it doesn’t know the data type of any field in the database.  Instead, it relies on the ODBC connection to negotiate that with Excel, and sometimes the connection negotiates the data type incorrectly (or at least falls back to primitive data types rather than more complex types like TEXT).  This occurs most frequently in two specific cases.

  1. You have a long data field.  If the data field is marked as type General, the ODBC connection will default to treating that data as a VARCHAR.  That means MarcEdit can only receive 255 characters from the field.  This problem goes away if the data is typed as TEXT.
  2. You have alphanumeric data like an ISBN that Excel wants to treat as scientific notation (or some sort of numeric value).  When extracting data, the ODBC connection will mistype the information and oftentimes return the data in scientific or numeric notation.  Again, the problem goes away if the data is specifically typed.
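
To get a feel for what these two cases do to your data, here is a small Python sketch.  It is only an analogy: it doesn’t touch Excel, JET, or ODBC, and the function names are made up for illustration.  It simply shows what a 255-character cutoff and a numeric reading of an ISBN look like.

```python
def as_varchar(value, limit=255):
    """Mimic a 255-character VARCHAR column: anything longer is cut off."""
    return value[:limit]


def as_general_number(value):
    """Mimic an ISBN being read as a number and printed in compact form."""
    return format(float(value), "g")


long_note = "x" * 300
print(len(as_varchar(long_note)))           # 255
print(as_general_number("9780306406157"))   # 9.78031e+12
```

In both cases the original value can’t be recovered from what comes back, which is why typing the column as TEXT before the data is read is the reliable fix.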

Obviously, this is confusing when it occurs (though I’m not actually sure how often people encounter this problem since it isn’t reported often).  Since the workaround is a relatively simple one (i.e., changing the cell/column formatting), I haven’t spent too many cycles looking into simplifying this process.  However, with a few large enhancements completed, I’m going to be dedicating some cycles to investigating automated field typing within the Jet engine to see if I can come up with a solution that will virtually eliminate the problem.  The tricky part, of course, is making sure that any changes to the ODBC code-base don’t introduce regressions, but that’s just a normal part of development.

At this point in time, I’m looking to continue what has become a somewhat annual tradition of releasing a MarcEdit update right around Christmas, so I’m tentatively working to have a solution to this issue as part of this release.

–TR

 Posted by at 1:59 pm
Dec 05, 2012
 

Sorry this one came in late – I had planned on posting this one on Sunday but wasn’t able to get it finished due to an error in the build process.  However, those issues have been resolved and the update has been posted. 

This update includes a new feature and a few enhancements:

  1. [New Feature] Direct ILS Integration Component – This is a framework developed to support direct integration with many of the new ILS projects.  At this point, I’ve added support for Koha, but I hope to add others as I can get access to their APIs.  If you need information about this new feature, please see the following:
    * MarcEdit Direct ILS Integration Setup [Koha Example]
    * MarcEdit Direct ILS Integration – Searching for data [single and batch] — [Koha Example]
    * Create records [Koha Example]
    * C# Koha API library – code repository
  2. [Enhancement] Delimited Text Translator – Auto generation arguments list – provides a format for auto generating the record creation data.  See:
    * MarcEdit 5.8 Delimited Text Translator – Auto Generate Arguments List
  3. [Enhancement] API Addition – Z3950SearchEx – allows for the return of a Z39.50 Search as an array of objects.  See:
    * MarcEdit 5.8 API Addition- Z3950SearchEx

You can download the latest update using MarcEdit’s built-in Automatic update tool, or by downloading the new version from:

–TR

 Posted by at 1:33 am
Dec 05, 2012
 

MarcEdit has two COM and .NET APIs for providing Z39.50 searching: one provides access to a single query and one to a batch query [Z3950Search & BatchZ3950Search].  Both of these options deposit returned data into a file.  However, I’ve been asked if the single search can also provide an array of objects.  COM doesn’t quite work that way when dealing with VBScript, but I have added a new API to provide an array of string objects.  This way, you can enumerate through the results, rather than having to work with data stored in a file.  The new API is called Z3950SearchEx.  Here’s an example:

' Configure the Z39.50 connection
Dim lobj_Z3950
Set lobj_Z3950 = CreateObject("MARCEngine5.Query")
With lobj_Z3950
   .Database = "VOYAGER"
   .Host = "z3950.loc.gov"
   .Port = 7090
   .Syntax = "MARC21"
   .Start = 0
   .Limit = 5
End With

' Run a single search; the results come back as an array of strings
Dim lstring
lstring = lobj_Z3950.Z3950SearchEx("digital", 4)
Set lobj_Z3950 = Nothing

' Show each returned record in turn
For x = 0 To UBound(lstring)
   MsgBox lstring(x)
Next

–TR

 Posted by at 1:05 am