Making metasearch work for you

By reeset / In Digital Libraries, Programming

Maybe I’ll write about this in more detail, maybe not; I haven’t decided yet.  But one of the benefits of rolling our own metasearch tool is that we get to try lots of new things.  Case in point: last night (well, this morning), I set up a multi-tier caching system in our metasearch tool.  Here’s how it works:

  1. level 1: in-memory cache.  This covers a user’s current session data.  We use this information to make decisions about filtering and deduping, and eventually it will allow users to “save” a session, no matter how many searches or results that might entail.
  2. level 2: daily cache.  This is a semi-permanent cache that exists for 2 days.  Basically, every user query is cached along with its results.  This way, any user running a like query gets the same results (without having to re-run the metaquery).  It gives the impression that the tool is faster than it really is.  Plus, it allows patrons to essentially (though unknowingly) collaborate with each other, since two users querying like topics are presented with the same results.
  3. level 3: permanent caching.  Since we are caching queries, we keep them for 2 days and, during that period, analyze the log for the top 15% of queries and cache those results nightly.  As topics change, so does this cache, but ideally it will represent the current research interests of our students and faculty, allowing for quick retrieval of heavily used resources.
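
For anyone who thinks better in code, here is a rough Python sketch of the three levels. It is not our actual implementation. The class name `TieredCache`, the `run_metasearch` callback, and the injectable `clock` are all made up for illustration, and I am assuming "top 15%" means the most frequent 15% of distinct queries in the two-day log.

```python
import time
from collections import Counter

TWO_DAYS = 2 * 24 * 60 * 60  # lifetime of the level 2 cache, in seconds


class TieredCache:
    """Illustrative three-level cache; all names here are hypothetical."""

    def __init__(self, run_metasearch, clock=time.time):
        self.run_metasearch = run_metasearch  # function: query -> results
        self.clock = clock                    # injectable so it can be tested
        self.sessions = {}  # level 1: session id -> {query: results}
        self.recent = {}    # level 2: query -> (stored_at, results)
        self.popular = {}   # level 3: query -> results, rebuilt nightly
        self.log = []       # (timestamp, query) for every search

    def search(self, session_id, query):
        now = self.clock()
        self.log.append((now, query))
        session = self.sessions.setdefault(session_id, {})
        if query in session:                  # level 1 hit
            return session[query]
        if query in self.popular:             # level 3 hit
            results = self.popular[query]
        else:
            entry = self.recent.get(query)
            if entry is not None and now - entry[0] < TWO_DAYS:
                results = entry[1]            # level 2 hit
            else:
                results = self.run_metasearch(query)  # miss: run the metaquery
                self.recent[query] = (now, results)
        session[query] = results
        return results

    def nightly_refresh(self, top_fraction=0.15):
        """Expire old entries, then re-cache the most frequent queries."""
        now = self.clock()
        self.log = [(t, q) for t, q in self.log if now - t < TWO_DAYS]
        self.recent = {q: e for q, e in self.recent.items()
                       if now - e[0] < TWO_DAYS}
        counts = Counter(q for _, q in self.log)
        # Assumption: "top 15%" is taken over distinct queries, at least one.
        keep = max(1, int(len(counts) * top_fraction)) if counts else 0
        self.popular = {q: self.run_metasearch(q)
                        for q, _ in counts.most_common(keep)}
```

The lookup order (session, then popular, then the two-day cache) is just one reasonable choice. The point is that the expensive metaquery only runs when all three levels miss.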

So what needs to be added?  Well, natural language recognition, so that queries can be analyzed in terms of concepts rather than text matching.  That would be cool: two users could search on the same concept using different search terms, yet the system would still know to pull the same cached results for them.  We have some CS folks here doing some great work in this area…I think it’s about time to pay them a visit. 🙂
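
To make the idea concrete, here is a toy Python sketch of a concept-based cache key. This is emphatically not natural language recognition: the `CONCEPTS` table, the `STOPWORDS` set, and the `concept_key` function are hypothetical stand-ins, and a hand-written dictionary is only a placeholder for whatever real analysis would produce the mapping.

```python
# Hypothetical term-to-concept table; a real system would derive this from
# language analysis rather than a hand-written dictionary.
CONCEPTS = {
    "car": "automobile",
    "cars": "automobile",
    "auto": "automobile",
    "automobiles": "automobile",
}
STOPWORDS = {"a", "an", "and", "in", "of", "on", "the"}


def concept_key(query):
    """Reduce a query to a sorted set of concepts, usable as a cache key."""
    terms = (t.strip(".,;:!?\"'()") for t in query.lower().split())
    concepts = {CONCEPTS.get(t, t) for t in terms if t and t not in STOPWORDS}
    return " ".join(sorted(concepts))


# Two differently worded queries collapse to the same key.
assert concept_key("Car safety") == concept_key("the safety of automobiles")
```

If the cache were keyed on something like `concept_key(query)` instead of the raw query text, both of those searches would land on the same cached results.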

–Terry