Aug 23 2015
 

Last week, I posted an update that included the early implementation of the Validate Headings tool.  After a week of testing, feedback and refinement, I think that the tool now functions in a way that will be helpful to users.  So, let me describe how the tool works and what you can expect when the tool is run.

Background:

The Validate Headings tool was added as a new report to the MarcEditor to enable users to take a set of records and get back a report detailing how many records had corresponding Library of Congress authority headings.  The tool was designed to validate data in the 1xx, 6xx, and 7xx fields.  The tool has been set to only query headings and subjects that utilize the LC authorities.  At some point, I’ll look to expand to other vocabularies.

How does it work

Presently, this tool must be run from within the MarcEditor – though at some point in the future, I’ll extract this out of the MarcEditor and provide a standalone function and an integration with the command line tool.  Right now, to use the function, you open the MarcEditor and select the Reports/Validate Headings menu.

image

Selecting this option will open the following window:

image

Options – you’ll notice 3 options available to you.  The tool allows users to decide which values they would like to have validated.  They can select names (1xx, 600, 610, 611, 7xx) or subjects (6xx).  Please note, when you select names, the tool does look up the 600, 610, and 611 as part of the process because the validation of these subjects occurs within the name authority file.  The last option deals with the local cache.  As MarcEdit pulls data from the Library of Congress, it caches the data that it receives so that it can use it on subsequent headings validation checks.  The cache will be used until it expires in 30 days; however, a user can check this option at any time and MarcEdit will delete the existing cache and rebuild it during the current data run.

Couple things you’ll also note on this screen. There is an extract button and it’s not enabled.  Once the Validate report is run, this button will become enabled if there are any records that are identified as having headings that could not be validated against the service. 

Running the Tool:

Couple notes about running the tool.  When you run the tool, what you are asking MarcEdit to do is process your data file and query the Library of Congress for information related to the authorized terms in your records.  As part of this process, MarcEdit sends a lot of data back and forth to the Library of Congress utilizing the http://id.loc.gov service.  The tool attempts to use a light touch, only pulling down headings for a specific request – but do realize that a lot of data requests are generated through this function.  You can estimate approximately how many requests might be made on a specific file by using the following formula: (number of records x 2) + (number of records), assuming that most records will have 1 name to authorize and 2 subjects per record.  So a file with 2500 records would generate ~7500 requests to the Library of Congress.  Now, this is just a guess; in my tests, I’ve had some sets generate as many as 12,000 requests for 2500 records and as few as 4000 requests for 2500 records – but the 7500 estimate tended to be within 500 requests of the actual count for most test files.
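As a rough illustration, the rule of thumb above can be written as a tiny Python helper.  The function name estimate_lc_requests is hypothetical and is not part of MarcEdit; it only encodes the formula from this post, so treat the result as a ballpark figure rather than a prediction.

```python
def estimate_lc_requests(record_count: int) -> int:
    """Rule-of-thumb estimate from this post: (records x 2) + records."""
    if record_count < 0:
        raise ValueError("record_count must be non-negative")
    return (record_count * 2) + record_count


if __name__ == "__main__":
    # The 2500-record example used in this post.
    print(estimate_lc_requests(2500))
```

For the 2500-record file discussed here this gives the same ~7500 figure; real files can land well above or below it.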

So why do we care?  Well, this report has the potential to generate a lot of requests to the Library of Congress’s identifier service – and while I’ve been told that there shouldn’t be any issues with this – I think the real answer won’t be known until people start using it.  At the same time – this function won’t come as a surprise to the folks at the Library of Congress – as we’ve spoken a number of times during the development.  At this point, we are all kind of waiting to see how popular this function might be, and if MarcEdit usage will create any noticeable up-tick in the service usage.

Validation Results:

When you run the validation tool, the program will go through each record, making the necessary validation requests of the LC ID service.  When the service has completed, the user will receive a report with the following information:

Validation Results:
Process completed in: 121.546001431667 minutes. 
Average Response Time from LC: 0.847667984420415
Total Records: 2500
Records with Invalid Headings: 1464
**************************************************************
1xx Headings Found: 1403
6xx Headings Found: 4106
7xx Headings Found: 1434
**************************************************************
1xx Headings Not Found: 521
6xx Headings Not Found: 1538
7xx Headings Not Found: 624
**************************************************************
1xx Variants Found: 6
6xx Variants Found: 1
7xx Variants Found: 3
**************************************************************
Total Unique Headings Queried: 8604
Found in Local Cache: 1001
***************************************************************

This represents the header of the report.  I wanted users to be able to quickly, at a glance, see what the Validator determined during the course of the process.  From here, I can see a couple of things:

  1. The tool queried a total of 2500 records
  2. Of those 2500 records, 1464 had at least one heading that was not found
  3. Within those 2500 records, 8604 unique headings were queried
  4. Within those 2500 records, there were 1001 duplicate headings across records (these were not duplicate headings within the same record, but for example, multiple records with the same author, subject, etc.)
  5. We can see how many Headings were found by the LC ID service within the 1xx, 6xx, and 7xx blocks
  6. Likewise, we can see how many headings were not found by the LC ID service within the 1xx, 6xx, and 7xx blocks.
  7. We can see the number of Variants as well.  Variants are defined as names that resolved, but where the preferred name returned by the Library of Congress didn’t match what was in the record.  Variants will be extracted as part of the records that need further evaluation.
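If you copy the report into another program, the summary block is regular enough to read with a few lines of Python.  This is only a sketch: parse_summary is a hypothetical helper, the sample text is a shortened excerpt of the header shown above, and only the whole-number "Label: value" lines are picked up (the timing lines are skipped).

```python
import re

# Shortened excerpt of the report header shown above.
SAMPLE_HEADER = """\
Total Records: 2500
Records with Invalid Headings: 1464
1xx Headings Found: 1403
1xx Headings Not Found: 521
Total Unique Headings Queried: 8604
Found in Local Cache: 1001
"""


def parse_summary(text: str) -> dict:
    """Collect 'Label: integer' lines from the report header into a dict."""
    summary = {}
    for line in text.splitlines():
        match = re.fullmatch(r"\s*([^:]+):\s*(\d+)\s*", line)
        if match:
            summary[match.group(1).strip()] = int(match.group(2))
    return summary


summary = parse_summary(SAMPLE_HEADER)
invalid_share = summary["Records with Invalid Headings"] / summary["Total Records"]
print(f"{invalid_share:.1%} of records had at least one heading not found")
```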

After this summary of information, the Validation report returns information related to the record # (record number count starts at zero) and the headings that were not found.  For example:

Record #0
Heading not found for: Performing arts--Management--Congresses
Heading not found for: Crawford, Robert W

Record #5
Heading not found for: Social service--Teamwork--Great Britain

Record #7
Heading not found for: Morris, A. J

Record #9
Heading not found for: Sambul, Nathan J

Record #13
Heading not found for: Opera--Social aspects--United States
Heading not found for: Opera--Production and direction--United States

The current report format includes specific information about the heading that was not found.  If the value is a variant, it shows up in the report as:

Record #612
Term in Record: bible.--criticism, interpretation, etc., jewish
LC Preferred Term: Bible. Old Testament--Criticism, interpretation, etc., Jewish
URL: http://id.loc.gov/authorities/subjects/sh85013771
Heading not found for: Bible.--Criticism, interpretation, etc

Here you see – the report returns the record number, the normalized form of the term as queried, the current LC Preferred term, and the URL to the term that’s been found.
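The per-record section can be handled the same way if you want to work with it outside of MarcEdit.  A minimal sketch, assuming the report body keeps the "Record #" and "Heading not found for:" line prefixes shown above; headings_not_found is a hypothetical helper, and it ignores the variant lines (Term in Record, LC Preferred Term, URL).

```python
from collections import defaultdict

# Excerpt of the per-record section shown above.
SAMPLE_BODY = """\
Record #0
Heading not found for: Performing arts--Management--Congresses
Heading not found for: Crawford, Robert W

Record #5
Heading not found for: Social service--Teamwork--Great Britain
"""

NOT_FOUND_PREFIX = "Heading not found for: "


def headings_not_found(report_body: str) -> dict:
    """Map each zero-based record number to its list of unmatched headings."""
    results = defaultdict(list)
    current = None
    for line in report_body.splitlines():
        line = line.strip()
        if line.startswith("Record #"):
            current = int(line[len("Record #"):])
        elif line.startswith(NOT_FOUND_PREFIX) and current is not None:
            results[current].append(line[len(NOT_FOUND_PREFIX):])
    return dict(results)


for record_number, headings in headings_not_found(SAMPLE_BODY).items():
    print(record_number, headings)
```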

The report can be copied and placed into a different program for viewing or can be printed (see buttons).

image

To extract the records that need work, minimize or close this window and go back to the Validate Headings Window.  You will now see two new options:

image

First, you’ll see that the Extract button has been enabled.  Click this button, and all the records that have been identified as having headings in need of work will be exported to the MarcEditor.  You can now save this file and work on the records. 

Second, you’ll see the new link – save delimited.  Click on this link, and the program will save a tab delimited copy of the validation report.  The report will have the following format:

Record ID [tab] 1xx [tab] 6xx [tab] 7xx [new line]

Multiple values within a column will be delimited by a colon, so if two 1xx headings appear in a record, the current process would create a single column, but with the headings separated by a colon like: heading 1:heading 2. 
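Reading the saved file back is straightforward.  The sketch below assumes the four-column layout described above; the sample row and the read_delimited_report helper are made up for illustration.  Note that a naive split on the colon would break any heading that itself contains a colon, so check your data before relying on it.

```python
import csv
import io

# Hypothetical sample row in the layout described above.
SAMPLE = "12\tSmith, John:Doe, Jane\tOpera--Social aspects--United States\t\n"


def read_delimited_report(handle):
    """Yield one dict per row, splitting each heading column on colons."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:
            continue
        # Pad short rows so that unpacking always sees four columns.
        record_id, f1xx, f6xx, f7xx = (row + [""] * 4)[:4]
        yield {
            "record_id": record_id,
            "1xx": [h for h in f1xx.split(":") if h],
            "6xx": [h for h in f6xx.split(":") if h],
            "7xx": [h for h in f7xx.split(":") if h],
        }


rows = list(read_delimited_report(io.StringIO(SAMPLE)))
print(rows[0])
```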

Future Work:

This function required making a number of improvements to the linked data components – and because of that, the linking tool should work better and faster now.  Additionally, because of the variant work I’ve done, I’ll soon be adding code that will give the user the option to update headings for Variants as the report or the linking tool is running – and I think that is pretty cool.  If you have other ideas or find that this is missing a key piece of functionality – let me know.

–tr

 Posted by at 7:16 pm
Aug 09 2015
 

Over the last year, I’ve spent a good deal of time looking for ways to integrate many of the growing linked data services into MarcEdit.  These services, mainly revolving around vocabularies, provide some interesting opportunities for augmenting our existing MARC data, or enhancing local systems that make use of these particular vocabularies.  Examples like those at the Bentley (http://archival-integration.blogspot.com/2015/07/order-from-chaos-reconciling-local-data.html) are real-world demonstrations of how computers can take advantage of these endpoints when they are available.

In MarcEdit, I’ve been creating and testing linking tools for close to a year now, and one of the areas I’ve been waiting to explore is whether libraries can utilize linking services to build their own authorities workflows.  Conceptually, it should be possible – the necessary information exists…it’s really just a matter of putting it together.  So, that’s what I’ve been working on.  Utilizing the linked data libraries found within MarcEdit, I’ve been working to create a service that will help users identify invalid headings and records where those headings reside.

Working Wireframes

Over the last week, I’ve prototyped this service.  The way that it works is pretty straightforward.  The tool extracts the data from the 1xx, 6xx, and 7xx fields, and if they are tagged as being LC controlled, I query the id.loc.gov service to see what information I can learn about the heading.  Additionally, since this tool is designed for work in batch, there is a high likelihood that headings will repeat – so MarcEdit is generating a local cache of headings as well – this way it can check against the local cache rather than the remote service when possible.  The local cache will keep growing – with entries set to expire after a month.  I’m still toying with what to do with the local cache, expirations, and what the best way to keep it in sync might be.  I’d originally considered pulling down the entire LC names and subjects headings – but for a desktop application, this didn’t make sense.  Together, these files, uncompressed, consumed GBs of data.  Within an indexed database, this would continue to be true.  And again, this file would need to be updated regularly.  So, I’m looking for an approach that will give some local caching, without the need to make the user download and manage huge data files.
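To make the caching idea concrete, here is a minimal Python sketch of a lookup cache whose entries expire after 30 days.  It is not MarcEdit’s implementation: the HeadingCache class, its in-memory dictionary, and the lookup callable that stands in for the remote query are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple

THIRTY_DAYS = 30 * 24 * 60 * 60  # cache lifetime in seconds


class HeadingCache:
    """In-memory cache that ignores entries older than max_age seconds."""

    def __init__(self, lookup: Callable[[str], Optional[str]],
                 max_age: float = THIRTY_DAYS,
                 clock: Callable[[], float] = time.time) -> None:
        self._lookup = lookup      # stand-in for the remote query
        self._max_age = max_age
        self._clock = clock        # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, Optional[str]]] = {}
        self.remote_calls = 0

    def get(self, heading: str) -> Optional[str]:
        now = self._clock()
        cached = self._entries.get(heading)
        if cached is not None and now - cached[0] < self._max_age:
            return cached[1]       # fresh local hit, no remote request
        self.remote_calls += 1
        result = self._lookup(heading)
        self._entries[heading] = (now, result)
        return result


if __name__ == "__main__":
    # A fake lookup so the sketch runs without any network access.
    cache = HeadingCache(lambda heading: heading.upper())
    cache.get("Opera--Social aspects--United States")
    cache.get("Opera--Social aspects--United States")
    print(cache.remote_calls)
```

Repeated headings in a batch are answered from the dictionary, so only the first occurrence costs a remote request until the entry ages out.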

Anyway – the function is being implemented as a Report.  Within the Reports menu in the MarcEditor, you will eventually find a new item titled Validate Headings.

image

When you run the Validate Headings tool, you will see the following window:

image

You’ll notice that there is a Source file.  If you come from the MarcEditor, this will be prepopulated.  If you come from outside the MarcEditor, you will need to define the file that is being processed.  Next, you select the elements to authorize.  Then click Process.  The Extract button will be disabled until after the data run.  Once completed, users can extract the records with invalid headings.

When completed, you will receive the following report:

image

This includes the total processing time, average response from LC’s id.loc.gov service, total number of records, and the information about how the data validated.  Below, the report will give you information about headings that validated, but were variants.  For example:

Record #846
Term in Record: Arnim, Bettina Brentano von, 1785-1859
LC Preferred Term: Arnim, Bettina von, 1785-1859

This would be marked as an invalid heading, because the data in the record is incorrect.  But the reporting tool will provide back the Preferred LC label so the user can then see how the data should be currently structured.  Actually, now that I’m thinking about it – I’ll likely include one more value – the URI to the dataset so you can actually go to the authority file page, from this report.

This report can be copied or printed – and as I noted, when this process is finished, the Extract button is enabled so the user can extract the data from the source records for processing.

Couple of Notes

So, this process takes time to run – there just isn’t any way around it.  For this set, there were 7702 unique items queried.  Each request from LC averaged 0.28 seconds.  In my testing, depending on the time of day, I’ve found that response times can run from 0.20 seconds to 1.2 seconds per request.  None of those times are that bad when done individually, but when taken in aggregate against 7700 queries – it adds up.  If you do the math, 7702*0.2 = 1540 seconds to just ask for the data.  Divide that by 60 and you get 25.6 minutes.  The total processing time means that there are 11 minutes of “other” things happening here.  My guess is that the other 11 minutes is being eaten up by local lookups, character conversions (since LC requests UTF-8 and my data was in MARC-8) and data normalization.  Since there isn’t anything I can do about the latency between the user and the LC site – I’ll be working over the next week to try and remove as much local processing time from the equation as possible.
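The same back-of-the-envelope math can be run for any file size and response time.  A small sketch (request_minutes is a hypothetical helper, and it only counts time spent waiting on requests, not local processing):

```python
def request_minutes(query_count: int, seconds_per_request: float) -> float:
    """Total time spent waiting on requests, in minutes."""
    return query_count * seconds_per_request / 60


if __name__ == "__main__":
    # The fast and slow ends of the response times seen in testing.
    for latency in (0.20, 1.2):
        minutes = request_minutes(7702, latency)
        print(f"{latency:.2f} s/request -> {minutes:.1f} minutes")
```

For 7702 queries that is a little under 26 minutes at the fast end and roughly 154 minutes at the slow end, which is why time of day matters so much for large files.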

Questions – let me know.

–tr

 Posted by at 7:44 am
Aug 02 2015
 

MarcEdit Mac users, a new preview update has been made available.  This is getting pretty close to the first “official” version of the Mac version.  And for those that may have forgotten, the preview designation will be removed on Sept. 1, 2015.

So what’s been done since the last update?  Well, I’ve pretty much completed the last of the work that was scheduled for the first official release.  At this point, I’ve completed all the planned work on the MARC Tools and the MarcEditor functions.  For this release, I’ve completed the following:

****************************
** 1.0.9 ChangeLog
****************************

  • Bug Fix: Opening Files – you could not select any files without a .mrc extension. I’ve changed this so the open dialog can open multiple file types.
  • Bug Fix: MarcEditor — when resizing the form, the filename in the status can disappear.
  • Bug Fix: MarcEditor — when resizing, the # of records per page moves off the screen.
  • Enhancement: Linked Data Records — Tool provides the ability to embed URI endpoints to the end of 1xx, 6xx, and 7xx fields.
  • Enhancement: Linked Data Records — Tool has been added to the Task Manager.
  • Enhancement: Generate Control Numbers — globally generates control numbers.
  • Enhancement: Generate Call Numbers/Fast Headings – globally generates call numbers/fast headings for selected records.
  • Enhancement: Edit Shortcuts – added back the tool to enable Record Marking via a comment.

Over the next month, I’ll be working on trying to complete four other components prior to the first “official” release Sept. 1.  This means that I’m anticipating at least 1, maybe 2 more large preview releases before Sept. 1, 2015.  The four items I’ll be targeting for completion will be:

  1. Export Tab Delimited Records Feature — this feature allows users to take MARC data and create delimited files (often for reporting or loading into a tool like Excel).
  2. Delimited Text Translator — this feature allows users to generate MARC records from a delimited file.  The Mac version will not, at least initially, be able to work with Excel or Access data.  The tool will be limited to working with delimited data.
  3. Update Preferences windows to expose MarcEditor preferences
  4. OCLC Metadata Framework integration…specifically, I’d like to re-integrate the holdings work and the batch record download.

How do you get the preview?  If you have the current preview installed, just open the program and as long as you have the notifications turned on – the program will notify you that an update is available.  Download the update, and install the new version.  If you don’t have the preview installed, just go to: http://marcedit.reeset.net/downloads and select the Mac app download.

If you have any questions, let me know.

–tr

 Posted by at 4:42 pm
Jul 29 2015
 

I hadn’t planned on putting together an update for the Windows version of MarcEdit this week, but I’ve been working with someone putting the Linked Data tools through their paces and came across instances where some of the linked data services were not sending back valid XML data – and I wasn’t validating it.  So, I took some time and added some validation.  However, because the users are processing over a million items through the linked data tool, I also wanted to provide a more user friendly option that doesn’t require opening the MarcEditor – so I’ve added the linked data tools to the command line version of MarcEdit as well. 

Linked Data Command Line Options:

The command line tool is probably one of those under-used and unknown parts of MarcEdit.  The tool is a shim over the code libraries – exposing functionality from the command line, and making it easy to integrate with scripts written for automation purposes.  The tool has a wide range of options available to it – and users unfamiliar with the command line tool can get information about the functionality offered by querying help.  For those using the command line tool – you’ll likely want to create an environment variable pointing to the MarcEdit application directory so that you can call the program without needing to navigate to the directory.  For example, on my computer, I have an environment variable called %MARCEDIT_PATH% which points to the MarcEdit app directory.  This means that if I wanted to run the help from my command line for the MarcEdit Command Line tool, I’d run the following and get the following results:

C:\Users\reese.2179>%MARCEDIT_PATH%\cmarcedit -help
***************************************************************
* MarcEdit 6.1 Console Application
* By Terry Reese
* email: reeset@gmail.com
* Modified: 2015/7/29
***************************************************************
Arguments:
        -s:     Path to file to be processed.
                        If calling the join utility, source must be files
                        delimited by the ";" character
        -d:     Path to destination file.
                        If call the split utility, dest should specify a folder
                        where split files will be saved.
                        If this folder doesn't exist, one will be created.
        -rules: Rules file for the MARC Validator.
        -mxslt: Path to the MARCXML XSLT file.
        -xslt:  Path to the XML XSLT file.
        -batch: Specifies Batch Processing Mode
        -character:     Specifies character conversion mode.
        -break: Specifies MarcBreaker algorithm
        -make:  Specifies MarcMaker algorithm
        -marcxml:       Specifies MARCXML algorithm
        -xmlmarc:       Specifics the MARCXML to MARC algorithm
        -marctoxml:     Specifies MARC to XML algorithm
        -xmltomarc:     Specifies XML to MARC algorithm
        -xml:   Specifies the XML to XML algorithm
        -validate:      Specifies the MARCValidator algorithm
        -join:  Specifies join MARC File algorithm
        -split: Specifies split MARC File algorithm
        -records:       Specifies number of records per file [used with split command].
        -raw:   [Optional] Turns of mnemonic processing (returns raw data)
        -utf8:  [Optional] Turns on UTF-8 processing
        -marc8: [Optional] Turns on MARC-8 processing
        -pd:    [Optional] When a Malformed record is encountered, it will modify the process from a stop process to one where an error is simply noted and a stub note is added to the result file.
        -buildlinks:    Specifies the Semantic Linking algorithm
                        This function needs to be paired with the -options parameter
        -options        Specifies linking options to use: example: lcid,viaf:lc,oclcworkid,autodetect
                lcid: utilizes id.loc.gov to link 1xx/7xx data
                autodetect: autodetects subjects and links to know values
                oclcworkid: inserts link to oclc work id if present
                viaf: linking 1xx/7xx using viaf.  Specify index after colon. If no index is provided, lc is assumed.
                        VIAF Index Values:
                        all -- all of viaf
                        nla -- Australia's national index
                        vlacc -- Belgium's Flemish file
                        lac -- Canadian national file
                        bnc -- Catalunya
                        nsk -- Croatia
                        nkc -- Czech.
                        dbc -- Denmark (dbc)
                        egaxa -- Egypt
                        bnf -- France (BNF)
                        sudoc -- France (SUDOC)
                        dnb -- Germany
                        jpg -- Getty (ULAN)
                        bnc+bne -- Hispanica
                        nszl -- Hungary
                        isni -- ISNI
                        ndl -- Japan (NDL)
                        nli -- Israel
                        iccu -- Italy
                        LNB -- Latvia
                        LNL -- Lebannon
                        lc -- LC (NACO)
                        nta -- Netherlands
                        bibsys -- Norway
                        perseus -- Perseus
                        nlp -- Polish National Library
                        nukat -- Poland (Nukat)
                        ptbnp -- Portugal
                        nlb -- Singapore
                        bne -- Spain
                        selibr -- Sweden
                        swnl -- Swiss National Library
                        srp -- Syriac
                        rero -- Swiss RERO
                        rsl -- Russian
                        bav -- Vatican
                        wkp -- Wikipedia

        -help:  Returns usage information

The linked data option uses the following pattern: cmarcedit.exe -s [sourcefile] -d [destfile] -buildlinks -options [linkoptions]

As noted above in the list, -options is a comma delimited list that includes the values that the linking tool should query.  For example, for a user looking to generate workids and URIs on the 1xx and 7xx fields using id.loc.gov, the command would look like:

<< cmarcedit.exe -s [sourcefile] -d [destfile] -buildlinks -options oclcworkid,lcid

Users interested in building all available linkages (using viaf, autodetecting subjects, etc.) would use:

<< cmarcedit.exe -s [sourcefile] -d [destfile] -buildlinks -options oclcworkid,lcid,autodetect,viaf:lc

Notice the last option – viaf. This tells the tool to utilize viaf as a linking option in the 1xx and the 7xx – the data after the colon identifies the index to utilize when building links.  The indexes are found in the help (see above).
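For anyone scripting this, the call can be wrapped in a few lines of Python.  This is a sketch under assumptions: build_link_command is a hypothetical helper, the file names are placeholders, and it expects a MARCEDIT_PATH environment variable like the one described earlier (falling back to a bare cmarcedit.exe on the PATH).  Only the flags shown in the help output above are used.

```python
import os
import subprocess
from typing import List


def build_link_command(source: str, dest: str, options: List[str]) -> List[str]:
    """Assemble the argument list for a -buildlinks run of cmarcedit."""
    # MARCEDIT_PATH mirrors the environment variable described earlier;
    # falling back to a bare executable name is an assumption of this sketch.
    app_dir = os.environ.get("MARCEDIT_PATH", "")
    exe = os.path.join(app_dir, "cmarcedit.exe") if app_dir else "cmarcedit.exe"
    return [exe, "-s", source, "-d", dest, "-buildlinks",
            "-options", ",".join(options)]


if __name__ == "__main__":
    # Placeholder file names; substitute your own source and destination.
    cmd = build_link_command("records.mrc", "linked.mrc",
                             ["oclcworkid", "lcid", "autodetect", "viaf:lc"])
    print(" ".join(cmd))
    # Uncomment to actually run the tool (requires MarcEdit to be installed):
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids quoting problems when file paths contain spaces.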

Download information:

The update can be found on the downloads page: http://marcedit.reeset.net/downloads or using the automated update tool within MarcEdit.  Direct links:

Mac Port Update:

Part of the reason I hadn’t planned on doing a Windows update of MarcEdit this week is that I’ve been heads down making changes to the Mac Port.  I’ve gotten good feedback from folks letting me know that so far, so good.  Over the past few weeks, I’ve been integrating missing features from the MarcEditor into the Port, as well as working on the Delimited Text Translation.  I’ll now have to go back and make a couple of changes to support some of the update work in the Linked Data tool – but I’m hoping that by Aug. 2nd, I’ll have a new Mac Port Preview that will be pretty close to completing (and expanding) the initial port sprint. 

Questions, let me know.

–tr

 Posted by at 9:39 pm
Jul 21 2015
 

With the last update, I made a few significant modifications to the Merge Records tool, and I wanted to provide a bit more information around how these changes may or may not affect users.  The changes can be broken down into two groups:

  1. User Defined Merge Field Support
  2. Multiple Record merge support

Prior to MarcEdit 6.1, the merge records tool utilized 4 different algorithms for doing record merges.  These were broken down by field class, and as such, had specific functionality built around them, since the limited scope of the data being evaluated made it possible.  Two of these specific functions were the ability for users to change the value in a field group class (say, change control numbers from 001 to 907$b) and the ability for the tool to merge multiple records in a merge file into the source.

When I made the update to 6.1, I tossed out the 3 field specific algorithms, and standardized on a single processing algorithm – what I call the MARC21 option.  This is an algorithm that processes data from a wide range of fields, and provides a high level of data evaluation – but in doing this, I set the fields that could be evaluated, and the function dropped the ability to merge multiple records into a single source file.  The effect of this was that:

  • Users could no longer change the fields/subfields used to evaluate data for merge outside of those fields set as part of the MARC21 option.
  • if a user had a file that looked like the following —
    sourcefile1 – record 1
    mergefile – record1 (matches source1)
    mergefile – record2
    mergefile – record3 (matches source1)

    Only data from the mergefile – record 1 would be merged.  The tool didn’t see the secondary data that might be in the merge file.  This has always been the case when working with the MARC21 merge option, but by making this the only option, I removed this functionality from the program (as the 3 custom field algorithms did make accommodations for merging data from multiple records into a single source).

With the last update, I’ve brought both of these elements back to the tool.  When a user utilizes the Merge Records tool, they can change the textbox with the field data – and enter a new field/subfield combination for matching (at this point, it must be a field/subfield combination).  Secondly, the tool now handles the merging of multiple records if those data elements are matched via a title or control number.  Since MarcEdit will treat user defined fields as the same class as a standard number (ISBN technically) for matching – users will now see that the tool can merge duplicate data into a single source file.
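To illustrate the difference between merging only the first match and merging every match, here is a simplified Python sketch.  It is not MarcEdit’s merge algorithm: records are modeled as plain dictionaries of field tag to values, merge_all_matches is a hypothetical helper, and the sample data is invented.

```python
from collections import defaultdict
from typing import Dict, List

Record = Dict[str, List[str]]  # field tag -> list of field values


def merge_all_matches(sources: List[Record], merges: List[Record],
                      key_field: str, copy_field: str) -> List[Record]:
    """Copy copy_field values from every matching merge record into its source."""
    by_key = defaultdict(list)
    for rec in merges:
        for key in rec.get(key_field, []):
            by_key[key].append(rec)

    merged = []
    for src in sources:
        out = {tag: list(values) for tag, values in src.items()}
        for key in src.get(key_field, []):
            for match in by_key.get(key, []):  # all matches, not just the first
                for value in match.get(copy_field, []):
                    if value not in out.setdefault(copy_field, []):
                        out[copy_field].append(value)
        merged.append(out)
    return merged


# Invented sample data: merge records 1 and 3 both match the source record.
source = [{"001": ["rec1"], "650": ["Opera"]}]
merge = [
    {"001": ["rec1"], "650": ["Opera", "Music"]},
    {"001": ["rec2"], "650": ["Dance"]},
    {"001": ["rec1"], "650": ["Theater"]},
]
result = merge_all_matches(source, merge, key_field="001", copy_field="650")
print(result[0]["650"])
```

Changing key_field to another tag is the rough equivalent of entering a user-defined match field; real subfield handling (such as 907$b) is left out of this sketch.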

Questions about this – just let me know.

–tr

 Posted by at 9:06 am
Jul 20 2015
 

This update will have four significant changes to three specific algorithms that are high use — so I wanted to give folks a heads up.

1) Merge Records — I’ve updated the process in two ways.  

   a) Users can now change the data in the dropdown box to a user-defined field/subfield combination.  At present, you have defined options: 001, 020, 022, 035, marc21.  You will now be able to specify another field/subfield combination (must be the combination) for matching.  So say you exported your data from your ILS, and your bibliographic number is in a 907$b — you could change the textbox from 001 to 907$b and the tool will now utilize that data, in a control number context — to facilitate matching.  

   b) This meant making a secondary change.  When I shifted to using the MARC21 method, I removed the ability for the algorithm to collapse multiple records of the same type with the merge file into the source.  For example, after the change to the marc21 algorithm, in the following scenario, the following would be true:

source 1 – record 1
merge 1 – matches record 1
merge 2 – matches record 2
merge 3 – matches record 1

 

The data moved into source 1 would be the data from merge 1 – merge 3 wouldn’t be seen.  In the previous version, prior to utilizing just the MARC21 option, users could collapse records when using the control number index match.  I’ve updated the merge algorithm so that the default is now to assume that all source data could have multiple merge matches.  This has the practical effect of essentially allowing users to take a merge file with multiple duplicates, and merge all data into a single corresponding source file.  But this does represent a significant behavior change – so users need to be aware.

 

2) RDA Helper — 

   a) I’ve updated the error processing to ensure that the tool can fail a bit more gracefully

   b) I’ve updated the abbreviation expansion because the expression I was using could miss values on occasion.  This will catch more content – it should also be a bit faster.

 

3) Linked Data tools – I included the ability to link to OCLC work ids, but there were problems when the JSON output was too deeply nested.  This has been corrected.

 

4) Bibframe tool — I’ve updated the mapping used to the current LC flavor.

 

Updates can be found on the downloads page (Windows/Linux) or via the automated update tool.

Direct Links:

 

 Posted by at 11:51 pm

MarcEdit OSX Public Preview 1

Jul 05 2015
 

It’s with a little trepidation that I’m formally making the first Public Preview of the MarcEdit OSX version available for download and use.  In fact, as of today, this version is now *the* OSX download available on the downloads page.  I will no longer be building the old code-base for use on OSX.

When I first started this project around Mid-April, I began knowing that this process would take some time.  I’ve been working on MarcEdit continuously for a little over 16 years.  It’s gone through one significant rewrite (when the program moved from Assembly to C#) and has had way too many revisions to count.  In agreeing to take on the porting work — I’d hoped that I could port a significant portion of the program over the course of about 8 months and that by the end of August, I could produce a version of MarcEdit that would cover the 80% or so of the commonly used application toolset.  To do this, it meant porting the MARC Tools portion of the application and the MarcEditor.

Well, I’m ahead of schedule.  Since about 2014, I’ve been reworking a good deal of the application to support a smoother porting process sometime in the future – though, honestly, I wasn’t sure that I’d ever actually do the porting work.  Pleasantly, this early work has made a good deal of the porting work easier, allowing me to move faster than I’d anticipated.  As of this posting, a significant portion of that 80% has been converted, and I think that for many people – most of what they probably use daily – has been implemented.  And while I’m ahead of schedule and have been happy with how the porting process has gone, make no mistake – it’s been a lot of work, and a lot of code.  Even though this work has primarily been centered around rewriting just the UI portions of MarcEdit, you are still talking, as of today, close to 200,000 lines of code.  This doesn’t include the significant amount of work I’ve done around the general assemblies that have provided improvements to all MarcEdit users.  Because of that – I need to start getting feedback from users.  While the general assemblies go through an automated testing process – I haven’t, as of yet, come up with an automated testing process for the OSX build.  This means that I’m testing things manually, and simply cannot go through the same level of testing that I do each time I build the Windows version.  Most folks may not realize it, but it takes about a day to build the Windows version – as the program goes through various unit tests processing close to 25 million records.  I simply don’t have an equivalent of that process yet, so I’m hoping that everyone interested in this work will give it a spin, use it for real work, and let me know if/when things fall down.

In creating the Preview, I’ve tried to make the process for users as easy as possible.  Users interested in running the program simply need to be running at least OSX 10.8 and download the dmg found here: http://marcedit.reeset.net/downloads.  Once downloaded, run the dmg and a new disk called MarcEdit OSX will mount.  Run this file, and you’ll see the following installer:

MarcEdit OSX installer


Drag the MarcEdit icon into the Applications folder and the application will either install, or overwrite an existing version.  That’s it.  No other downloads are necessary.  On first run, the program will generate a marcedit folder under /users/[yourid]/marcedit.  I realize that this isn’t completely normal — but I need the data accessible outside of the normal app sandbox to easily support updates.  I’d also considered the User Documents folder, but the configuration data probably shouldn’t live there either.  So, this is where I ended up putting it.

So what’s been completed?  Essentially, all the MARC Tools functions and a significant amount of the MarcEditor have been completed.  Some functions are conspicuously absent at this point, though: the Call Number and FAST Heading generation, the Delimited Text Translator and Exporter, the Select and Delete Selected Records tools, everything Z39.50 related, as well as the Linked Data tools and the integration work with OCLC and Koha.  None of these are currently available, but they will be worked on.  At this point, what users can do is start letting me know which absent components are impacting them the most, and I’ll see how they fit into the current development roadmap.

Anyway, that’s it.  I’m excited to let you all give this a try, and a little nervous as well.  This has been a significant undertaking which has definitely pushed me a bit, requiring me to learn Objective-C in a short period of time, as well as quickly assimilate a significant portion of Apple’s SDK documents relating to UI design.  I’m sure I’ve missed things, but it’s time to let other folks start working with it.

If you have been interested in this work — download the installer, kick the tires, and give feedback.  Just remember to be gentle.  :)

–TR

Download URL: http://marcedit.reeset.net/downloads

 

 Posted by at 8:40 pm

MarcEdit 6.1 Update

Jul 05, 2015
 

This was something I’d hoped to get into the last update, but I didn’t have time to test it, so I’ve finished it now.  At the first MarcEdit User Group meeting at ALA, there was a question about supporting 880 fields when exporting data via tab delimited format.  When you use the tool right now, the program will export all the 880 fields, not a specific 880 field.  This update changes that.  After the update, when you select the 880 field in the Export tab delimited tool, the program will ask you for the linking field.  The program will then match the 880$6[linkingfield], and pull the selected subfield from the matching 880 fields only.  I’m not sure how often this comes up, but it certainly made a lot of sense when the problem was described to me.
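For readers who want to see the matching logic spelled out, here is a minimal sketch in Python.  It is not MarcEdit’s code: the record layout (a list of tag/subfield tuples), the function name `linked_880_values`, and the sample values are all assumptions made up for illustration.  The only thing taken from the MARC 21 format is that an 880 field’s $6 begins with the tag of the field it links to (for example `245-01/$1`).

```python
def linked_880_values(fields, linking_tag, subfield_code):
    """Collect one subfield from the 880 fields linked to linking_tag.

    `fields` is a hypothetical in-memory record: a list of
    (tag, [(subfield_code, value), ...]) tuples.
    """
    values = []
    for tag, subfields in fields:
        if tag != "880":
            continue
        # $6 in an 880 looks like "245-01/$1": the first three characters
        # are the tag of the field this 880 is paired with.
        link = next((value for code, value in subfields if code == "6"), "")
        if link[:3] != linking_tag:
            continue
        values.extend(value for code, value in subfields if code == subfield_code)
    return values


# Made-up sample record: one 880 linked to the 245, one linked to the 260.
record = [
    ("245", [("6", "880-01"), ("a", "Transliterated title")]),
    ("880", [("6", "245-01/$1"), ("a", "Original-script title")]),
    ("880", [("6", "260-02/$1"), ("a", "Original-script place")]),
]

print(linked_880_values(record, "245", "a"))  # ['Original-script title']
```

Asking for the 245 as the linking field returns only the value from the 880 paired with the 245, rather than every 880 in the record, which is the behavior change described above.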

You can pick up the download at: http://marcedit.reeset.net/downloads

–tr

 Posted by at 8:33 pm
Jun 19, 2015
 

Logistics

Time: 6:00 – 7:30 pm, Friday, June 26, 2015
Place: Marriott Marquis (map)
Room: Pacific H, capacity: 30

Description:

The MarcEdit user community is large and diverse and honestly, I get to meet far too few community members.  This meeting has been put together to give members of the community a chance to come together and talk about the development road map, hear about the work to port MarcEdit to the Mac, and give me an opportunity to hear from the community.  I’ll talk about future work and areas of potential partnership, and hear from you what you’d like to see in the program to make your metadata lives a little easier.  If this sounds interesting to you, I really hope to see you there.

Acknowledgements:

A *big* thank you to John Chapman and OCLC for allowing this to happen.  As folks might guess, finding space at ALA can be a challenging and expensive endeavor, so when I originally broached the idea with OCLC, I had pretty low expectations.  But they truly went above and beyond any reasonable expectation, working with the hotel and ALA so this meeting could take place.  And while they didn’t ask for it, they have my personal thanks and gratitude.  If you can attend the event, or heck, wish you could have but your schedule made it impossible, make sure you let OCLC know that this was appreciated.

 Posted by at 1:24 pm

MarcEdit Mac Port Update

Jun 06, 2015
 

Having made one preview available for evaluation and feedback, I’ve been diligently working on updating the tool and working on new functionality.  This includes developing a new notification service, so that preview users know when new builds are available, and providing a new preferences window to support changing the preferences currently exposed within the application.  From the perspective of new work, I’ve begun working on the MarcEditor.  At this point, I’ve mocked out the window and am starting to create the global editing toolsets and connect actions to the various UI elements.  This is going to be a bit of a time-consuming process, but thankfully, it’s been made somewhat easier by the fact that much of the code within the MarcEditor that’s not platform specific has been moved outside the application to re-usable assemblies.  There’s a bit more refactoring that needs to be done, and I’ll need to re-think how the program streams data into the MarcEditor edit window since the OSX APIs make this a bit difficult, but it’s getting there.  Below, you will find screenshots of some of the new work.

–tr

Mac Port Notification Example

MarcEdit Mac Port Notifications Window Example

MarcEdit Mac Preferences: MARCEngine

MarcEdit Mac Preferences window: Current preferences exposed for update are for the MARCEngine and the Automatic Update notification.

This is the initial MarcEditor Wireframe for the Mac.  This feels pretty solid at this point.


Current options in the MarcEditor scheduled for the first release


MarcEdit Mac MarcEditor Edit Menu wireframe -- options targeted for the first release


MarcEdit Mac MarcEditor Reports Menu wireframe showing functions targeted for the first release


MarcEdit Mac MarcEditor Tools Menu wireframe showing functions targeted for the first release
