Posted by: char booth | 6 November 2007

third-generation search engines

It’s been out for a week now, but I just got around to reading Newsweek’s “Searching for the Best Engine” article, which compares efforts by various companies to shave Google’s market share by improving on its “second-generation” search model (first generation = query matching, second generation = weighted link-frequency ranking). Although much of it reads like old news, one thing that stood out to me was the revelation that Google does not dominate searching internationally the way it does in North America and western Europe. Many nations are pouring development funds into local search efforts that try to integrate the following “third generation” characteristics:

1. Word Smarts – more topically intuitive engines able to “go beyond matching your exact query”

2. Editing – aka indexing, involving humans to some extent in determining relevancy.

3. Focus – users are no longer impressed by 300,000,000 results, so why display them?

4. Guided Queries – better term suggestion mechanisms than simple spell-check.

5. Community – integrating user ranking and content into search results.

I’m interested to watch Google’s search shelf life over the next five years or so. Despite diversifying to an astonishing degree and incorporating some innovations in the way search results are sorted, they have left their one-box usability model essentially the same for close to a decade. Although any number of competing technologies will doubtless give them a run for their money, I assume Google will retain its position of primacy for quite some time. I’m most curious to see if they will eventually adapt their simplicity-trumps-all model to incorporate more features on the user side.

Earlier this year I joined what I’m thinking was a rather large group of librarian beta testers for Yahoo’s recently debuted “point and click query refinement capability,” Search Assist. The testing consisted of completing daily search tasks and reporting on one’s experience with the technology, and was the type of thing where you receive prizes in the mail for actually doing what they ask you to do. It was a fairly interesting process and somewhat validating from the someone-corporate-knows-what-librarians-are-good-for angle, and I must say that they cleaned up the functionality of the search quite a bit.

One irritation I have is that Search Assist only begins to suggest terms after the third letter of a query, which makes sense from an information-processing standpoint but is frustrating when the query is a two-letter acronym:

[Image: it.png]

vs.

[Image: its.png]
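The threshold described above can be pictured with a minimal sketch. This is not Yahoo’s implementation: the `suggest` function, the `MIN_CHARS` value, and the tiny vocabulary are all hypothetical, chosen only to show why a two-letter query gets nothing while a three-letter one does.

```python
from typing import List

# Assumed threshold, matching the "after the third letter" behavior described above.
MIN_CHARS = 3

# Hypothetical vocabulary; a real engine would draw on query logs or an index.
VOCABULARY = ["it jobs", "it support", "italy travel", "its meaning", "itunes"]


def suggest(prefix: str,
            vocabulary: List[str] = VOCABULARY,
            min_chars: int = MIN_CHARS) -> List[str]:
    """Return vocabulary entries that start with prefix.

    Prefixes shorter than min_chars get no suggestions at all, which is
    what makes two-letter acronyms awkward.
    """
    prefix = prefix.strip().lower()
    if len(prefix) < min_chars:
        return []
    return [term for term in vocabulary if term.startswith(prefix)]


print(suggest("it"))   # too short: no suggestions
print(suggest("its"))  # long enough: prefix matches are returned
```

With a rule like this, lowering `min_chars` to 2 would bring the acronym suggestions back, at the cost of many more candidates to sort through on every keystroke.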

Search Assist also works by default only on web searches, not for images, videos, etc. I’ve never really been a fan of Yahoo overall, but the insight I gained into how these tools are tested and marketed was priceless.

On a slightly related tangent, PHP/MySQL-driven products such as MediaWiki and a variety of dynamic content sites and web forums suffer from a worse version of the “two-letter searches don’t count” malady: try searching for ‘ALA’ or any other three-letter acronym in ReadWriteConnect, the main ALA wiki, and you’ll see what I mean. The OU Libraries Systems department was able to create a three-letter search fix for our Knowledgebase Publisher-powered Library FAQs site, which vexed us for a while by returning no results for “DVD”.
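One likely cause, assuming these sites rely on MySQL’s MyISAM full-text indexing, is the `ft_min_word_len` setting, which defaults to 4 and leaves shorter words out of the index. The sketch below imitates that behavior in plain Python. It is not MySQL code and not necessarily the fix our Systems department applied; the function names and sample FAQ entries are hypothetical.

```python
import re
from typing import Dict, List, Set


def tokenize(text: str, min_len: int) -> Set[str]:
    """Lowercase the text and keep only words of at least min_len characters."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) >= min_len}


def build_index(docs: Dict[str, str], min_len: int) -> Dict[str, Set[str]]:
    """Map each sufficiently long word to the ids of the documents containing it."""
    index: Dict[str, Set[str]] = {}
    for doc_id, text in docs.items():
        for word in tokenize(text, min_len):
            index.setdefault(word, set()).add(doc_id)
    return index


def search(index: Dict[str, Set[str]], query: str, min_len: int) -> List[str]:
    """Return ids of documents containing every indexable word in the query."""
    words = tokenize(query, min_len)
    if not words:
        # Every query word was too short to be indexed, so nothing can match.
        return []
    return sorted(set.intersection(*(index.get(w, set()) for w in words)))


# Hypothetical FAQ entries, for illustration only.
DOCS = {
    "faq-1": "How long can I borrow a DVD?",
    "faq-2": "Where are the ALA conference notes?",
}

strict = build_index(DOCS, min_len=4)   # mimics a minimum word length of 4
relaxed = build_index(DOCS, min_len=3)  # mimics lowering the minimum to 3

print(search(strict, "DVD", min_len=4))   # three-letter word is dropped
print(search(relaxed, "DVD", min_len=3))  # three-letter word is indexed
```

In MySQL itself, lowering the minimum also means rebuilding the full-text indexes before short words become searchable, which is why the change tends to need a systems administrator rather than a setting in the application.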

