Last week, I discussed some of the work I was doing to evaluate how Task processing will work in MarcEdit 7. For this work, I’ve been using a set of outlier data whose performance in MarcEdit 6.3 left much to be desired. You can read about the testing and the file set here: MarcEdit 7: Continued Task Refinements
Over the week, I’ve continued to work on how this data is processed, hoping to bring the processing time down from almost 7 hours in MarcEdit 6.3 to around 1 1/2 hours, and I’ve been able to do that and more. My guess was that by adding targeted preprocessing statements to the task processing queue, I could improve performance by running only the task actions that absolutely had to be run. In this case, I had 962 task actions, but on any given record, maybe 20-30 needed to be run. By adding a preprocessing step, I was able to move the processing time from 2+ hours to 25 minutes. My guess is that I’ve reached the ceiling in terms of optimizations, but I can live with this. Of course, over the next few days, I’ll need to validate that these new changes don’t cause the program to skip a step that should be run. Generally, I’ve set up the preprocessing steps so that they fall back to running the task when in doubt.
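To make the idea concrete, here is a minimal sketch in Python of a preprocessing check with a "run when in doubt" fallback. This is not MarcEdit's code: the `TaskAction` class, the `target_tag` hint, and the tuple-based record layout are all hypothetical names made up for illustration. The only point is that a cheap test can rule an action out before the expensive work starts, and that an action with no usable hint is always run.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-in for a MARC record: a list of (field tag, field data) pairs.
Record = List[Tuple[str, str]]


@dataclass
class TaskAction:
    """Hypothetical task action; not MarcEdit's actual data structure."""
    name: str
    apply: Callable[[Record], Record]
    # Field tag the action touches, or None when that cannot be determined cheaply.
    target_tag: Optional[str] = None


def might_apply(action: TaskAction, tags: set) -> bool:
    """Cheap pre-check. Returns True whenever there is any doubt."""
    if action.target_tag is None:
        return True  # no hint available, so fall back to running the action
    return action.target_tag in tags


def run_task(record: Record, actions: List[TaskAction]):
    """Run only the actions that pass the pre-check; report which ones ran."""
    executed = []
    for action in actions:
        # Recomputed each time because an earlier action may add or remove fields.
        tags = {tag for tag, _ in record}
        if might_apply(action, tags):
            record = action.apply(record)
            executed.append(action.name)
    return record, executed


def replace_in_field(tag: str, old: str, new: str) -> Callable[[Record], Record]:
    """Build an action body that replaces text in every field with the given tag."""
    def apply(record: Record) -> Record:
        return [(t, v.replace(old, new)) if t == tag else (t, v) for t, v in record]
    return apply


record = [("245", "Test title /"), ("650", "Cats")]
actions = [
    TaskAction("fix-245", replace_in_field("245", "Test", "Sample"), "245"),
    TaskAction("fix-856", replace_in_field("856", "http:", "https:"), "856"),
    TaskAction("no-hint", lambda r: r, None),  # always runs: nothing to pre-check
]
result, executed = run_task(record, actions)
print(executed)
```

In this sketch the record has no 856 field, so `fix-856` is skipped without being run, while `no-hint` still runs because there is nothing to test it against. With hundreds of actions and only a handful relevant to any one record, most of the work is skipped the same way.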