I’ve been continuing to think about new workflows that could be added to the Watcher service. One of the most requested is the ability for MarcEdit’s tool to watch not only local files, but also remote files hosted on FTP/SFTP servers. So, over the past couple of weeks, I’ve been fleshing out what this might look like, and writing the components necessary to bring FTP/SFTP functionality into the application.
At this point, I have a working model that I’m connecting into the program — and will be making available with the next update. The process, as I envision it now, will look something like this:
1. Users now choose either a Local or a Remote folder to watch.
2. If FTP or SFTP is selected, you will be prompted for login information.
3. After the configuration information is provided, the tool logs into the server at the specified host and opens a folder browser, where you can view folders and browse subfolders.
4. Click Select folder (when you are in, or have selected, the folder to process) and it will be pulled into the watcher profile.
Once the folder has been specified, the tool will watch this FTP/SFTP resource. This also means thinking about adding more granularity to the watch scheduling. Currently, schedules are set per 24-hour period. I’m thinking that for most vendor FTP/SFTP resources, this may be too often. To work around that right now, the tool creates a hash file of all data downloaded, and checks with the server to determine whether the hash has changed before downloading the data. Ideally, that means the tool will only download records for processing when there is a change at the server. That will probably work for now, but I am aware that it would likely be better to have schedules that can be planned to run at a specific point during a month or week.
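To make the hash check a little more concrete, here is a minimal Python sketch of the general idea. It is not MarcEdit's actual code. It assumes the remote folder can be summarized as a list of (name, size, modified) entries, and the function names and the hash file path are hypothetical.

```python
import hashlib
from pathlib import Path


def listing_fingerprint(entries):
    """Return a SHA-256 hex digest for (name, size, modified) entries.

    Entries are sorted first, so the order in which the server returns
    them does not affect the result.
    """
    digest = hashlib.sha256()
    for name, size, modified in sorted(entries):
        digest.update(f"{name}\t{size}\t{modified}\n".encode("utf-8"))
    return digest.hexdigest()


def has_changed(entries, hash_file):
    """Compare the current fingerprint with the one saved in hash_file.

    Returns False when they match (nothing to download). Otherwise the
    new fingerprint is saved and True is returned.
    """
    current = listing_fingerprint(entries)
    path = Path(hash_file)
    previous = path.read_text(encoding="utf-8").strip() if path.exists() else None
    if current == previous:
        return False
    path.write_text(current, encoding="utf-8")
    return True
```

With something like this, a scheduled run would only go on to download and process files when `has_changed` returns True, which is why a frequent schedule is tolerable in the short term.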
What is also exciting for me is that this work allows me to think about other potential workflows and how people bring data into the MarcEditor. So, if you currently have to go somewhere to download MARC data from an FTP/SFTP server, I’d be interested in hearing how I could make this easier for you.
In addition to these changes, I’m looking to add some “global” watcher elements: things like MARCXML to MARC conversion, joining the resulting files into a single data set, and character conversion options. I’m still thinking through how these would work and their order of operations, since some of these (character conversion) would be run on an individual file (like a task), while others (Join) would need to run after all files in a watched folder were processed. So how order is determined and processed would be important.
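One way to picture the ordering question is as two phases: steps that run on each file, followed by steps that run once on the whole set. The Python sketch below is only an illustration under that assumption. The names `run_pipeline`, `to_utf8`, and `join_files` are hypothetical, and the Latin-1 to UTF-8 conversion is a stand-in for real MARC character conversion.

```python
def run_pipeline(files, per_file_steps, after_all_steps):
    """Apply per-file steps to every file, then the whole-set steps.

    files: list of bytes objects, one per file in the watched folder.
    per_file_steps: functions that take and return one file's bytes.
    after_all_steps: functions applied in order to the processed set.
    """
    processed = []
    for data in files:
        for step in per_file_steps:
            data = step(data)
        processed.append(data)

    result = processed
    for step in after_all_steps:
        result = step(result)
    return result


def to_utf8(data):
    # Stand-in for character conversion: treat the input as Latin-1.
    return data.decode("latin-1").encode("utf-8")


def join_files(items):
    # Stand-in for Join: concatenate all processed files into one.
    return b"".join(items)


combined = run_pipeline([b"caf\xe9", b"abc"], [to_utf8], [join_files])
```

Keeping the two lists separate makes the order explicit: Join can never run before every file has been converted.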
If you have questions, please feel free to let me know.
–tr