Trip report from Recent Innovations in Search BayCHI event

The BayCHI panel on search innovation was a huge success. I’ve never seen the main auditorium at PARC so full…the aisles were packed with people sitting on the floor (don’t tell the fire marshal), and 30 or so people had to watch remotely on a television in the lobby.
I’ll start by summarizing the key themes that came out in the panel discussion / question-and-answer session. I’ll follow up with a blow-by-blow that captures some of the specifics of the show-and-tell that each company was allowed to do.

1) Search is the command line of the internet. And it’s not documented.
The fact that it’s not documented (people have to guess at how to search) was presented as a negative by most panelists. It is certainly what makes the job of writing a search engine hard: you have to guess at what your users want. However, I think that search is the one area where we have software that tries to interpret what the user wants, rather than insisting that a user follow the software’s protocols. So I see this as being a feature rather than a bug, and I wish more software tried this hard to figure out what users were really trying to do.
2) Search has humans at both ends: people write content, and people read content. The search engine helps them connect. This is very different from most software, which has a person at one end and a database or chunk of software code at the other end.
3) Personalization is problematic. In particular, guesses about what results people will want based on theories about who they are will give unhelpful results. As Udi Manber said, “Just because someone’s a physician doesn’t mean they’re always searching for medical stuff”.
4) Implicit data is key to improving search. The big deal about tags is that they are an additional source of implicit data that can be mined by search engines, similar in many ways to links (which contain a “tag” in the form of the words used in the link). Making a tag is easier than making a link, so this is a further democratization of the web. Systems that mine reading behavior (e.g. track which articles you read in an RSS feed and use that to adjust the order of presented links) will provide ways of mining the most democratic form of implicit data, which is clicking on links.
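To make the idea concrete, here is a toy sketch of mining click data to re-rank results. Everything here (the function name, the data shapes) is my own illustration, not anything a panelist showed: the engine produces a baseline ordering, and the click log nudges frequently-read pages upward.

```python
from collections import Counter

def rerank(results, click_log):
    """Re-rank a baseline result list by observed click counts.

    results: list of URLs in the engine's original order.
    click_log: list of URLs users actually clicked (the implicit data).
    Python's sort is stable, so unclicked pages keep their baseline
    ranking relative to one another.
    """
    clicks = Counter(click_log)
    # Sort by descending click count; ties fall back to original order.
    return sorted(results, key=lambda url: -clicks[url])

# Hypothetical example: "b" gets read far more often than its rank suggests.
baseline = ["a", "b", "c", "d"]
log = ["b", "b", "c", "b"]
print(rerank(baseline, log))  # ['b', 'c', 'a', 'd']
```

A real engine would blend this signal with many others (and guard against click spam), but even this tiny version shows why reading behavior is such a cheap, democratic ranking signal: users vote just by clicking.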
5) The integration of search-engine collected photo data into search technology is slightly creepy. It reminds me of the software “Earth” that was described in Neal Stephenson’s seminal cyberpunk novel “Snow Crash”, which displayed real-time satellite data in a virtual projection of the globe.

[page 100 – upon entering his Metaverse office, Hiro discovers a tool]
There is something new: A globe about the size of a grapefruit, a perfectly detailed rendition of Planet Earth, hanging in space at arm’s length in front of his eyes. Hiro has heard about this but never seen it. It is a piece of CIC software called, simply, Earth. It is the user interface that CIC uses to keep track of every bit of spatial information that it owns: all the maps, weather data, architectural plans, and satellite surveillance stuff.

Is it a good thing that A9 and Google are collecting all this image data of my community and putting it on the web for people to search? What happens when Google gets its own unmanned aerial vehicles?

6) Embedding the desired content in the search page itself (rather than simply providing a link) is a new design trend, and a real win for searchers.
7) Making the text box bigger would be one way to encourage longer searches. The vertical list of search results will probably remain the dominant UI for search for the immediate future, since it has such good information density (rank is implicit in the order of the results, so most of the “ink” is information rather than decoration) and handles novice users (who click randomly) very well (most links “above the fold” are good results).
And now the blow-by-blow…
Peter Norvig started off with a whirlwind tour of the recent product releases at Google (Google Suggest, Google on cell phones, Google Maps, desktop search, etc.). One that was new for me: Google “question and answer” is an innovative tweak to their existing engine. If your query is a natural language question (e.g. “What is the population of India?”), the first query result will be the answer to your question. This is the beginning of a trend of putting the actual information in search results, rather than links to destination pages.
Next up was Ken Norton from Yahoo!. He announced that the goal of Yahoo! was to “Enable people to find, use, share, and expand all human knowledge”. He then previewed MyYahooSearch, which allows users to save and share searches (and block certain results from searches). This basically integrates search with the Yahoo360 world while adding personalization. It seems very cool indeed! Y!Q was another feature that Ken demoed: basically the idea is embedding search for related items into the browser/page itself (by links and a toolbar).
All in all, there were more new features and products in the Yahoo! presentation than in the Google presentation, and I left feeling that Yahoo! is definitely a peer of Google in the frenzy of innovation that is occurring around search.
Mark Fletcher, the founder of BlogLines, was up next. He demoed the “search the future” feature of bloglines, where you can subscribe to search results and view new hits on those queries (over time) in your RSS aggregator. What a great tool for competitive analysis, (for corporate types) and what a good way to keep up on the latest writing on topics you care about (for everybody else). Mark also specifically mentioned using implicit data (what people read) to filter results (an oblique reference to attention.xml, I think).
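The core of “search the future” is simple enough to sketch. This is my own minimal illustration of the idea (the class and data shapes are hypothetical, not Bloglines’ actual implementation): a saved query remembers what it has already returned, and each new poll surfaces only the fresh hits, which is exactly what you would push into a subscriber’s RSS aggregator.

```python
class SavedSearch:
    """Minimal sketch of a persistent search subscription: each poll
    of the same query returns only results not seen on earlier polls."""

    def __init__(self, query):
        self.query = query
        self.seen = set()  # ids of results already delivered

    def poll(self, results):
        # results: iterable of (id, title) tuples from the latest run
        fresh = [(rid, title) for rid, title in results if rid not in self.seen]
        self.seen.update(rid for rid, _ in fresh)
        return fresh

s = SavedSearch("attention.xml")
print(s.poll([(1, "first post"), (2, "second post")]))  # both are new
print(s.poll([(2, "second post"), (3, "third post")]))  # only the third post is new
```

The hard part in practice is choosing a stable id for each hit (RSS items have a guid for exactly this deduplication problem); the subscription logic itself is just a set difference.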
Udi Manber, CEO of A9, was up next. He showed the A9 local search that everyone is so crazy about. He was the first presenter to really deal with UI issues. The responsiveness of the UI really blew me away: over a live internet collection it was serving up images as fast as he could grab them. There is some serious AJAX work happening at A9 in order to make web applications dance like that: hats off to the anonymous JavaScript hackers who are making this all work!
Udi also told war stories about gathering the A9 local search data (homeland security can be so suspicious of unmarked vans with digital cameras strapped to the roof!). The rig A9 uses for data collection looked pretty sweet though. GPS, laptop, and digital video camera, all automatically collecting and annotating the video data as it is collected. As I said before, I’m no privacy nut, but this stuff is starting to make me a little nervous.
Rock-star usability consultant Jakob Nielsen was up last, and he didn’t disappoint. He had hard data on user search behavior: the average query length (over the last 10 years) has increased from 1.3 to 2.2 words (“an improvement of 100%!”). He sees users reaching 3 words within the next several years.
Jakob presented evidence that experience plays a big role in successful searching, and made the point that a search-based computing paradigm leaves behind the 40% of low-literate users (people who can read and write, but perceive both these things as being work rather than fun). This is similar in my mind to the tyranny of the command-line, which is unusable by non-experts.
Jakob also showed damning statistics on individual site search. Only 33% of searches on an individual site (Amazon, etc.) are successful (compared to 56% of search engine searches).
Jakob wound up with a video of a novice user trying to use AOL search to look for information on sinus headaches. The video was quite funny: the woman obviously had no idea of what a browser was or how it worked. A good reality check, though one wonders how that woman fared the next time she tried to search. Anyone learning something new (like what a web page is) is going to look a bit stupid if you videotape them, so this felt like a bit of a cheap shot to me.
Jakob also showed data that the largest single cause of usability problems (11%) is search. The second-largest cause was IA (information architecture, something that my company does a lot of work in), followed by readability and content.
Search is definitely hot again, and the competition between Google and Yahoo! is leading to a tremendous number of new search products. It will be interesting to see how this plays out in the next few years. Everyone on the panel seemed very confident that search has gotten a lot better, and will continue to improve in the years to come. But usability problems remain, and the challenge of “reading the user’s mind” to guess what they are looking for will always be a work in progress.