How I'd Build a Search Engine

Over the past year, we’ve seen no fewer than three major attempts to compete with Google come to market, with mixed results. Cuil’s disastrous launch last summer all but killed its chances of success. Wolfram is in murky waters trying to remain distinctly not a search engine. And Microsoft is presently trying to buy its way onto the scene with Bing. I think there’s little to no debate that we should not let our entire worldview and information dissemination be in the hands of a single company (a recent study showed over 70% US market share – that’s just plain dangerous).

But how does one out-Google Google? I’ll start with the “don’t”s, the things that should be avoided.

Bigger ain’t better. Nobody, and I mean nobody, cares that your search index is 17.37835% deeper than Google’s. Search is at worst a semi-solved problem, which means the average person looking for something online probably assumes Google will have it.
Don’t confuse simple with simplistic. If you didn’t know, Google’s design is as simplistic as it is because they didn’t really know HTML when they started! For some reason, this has caused legions of lookalikes, despite plenty of better examples of usability design.
Overhype can be bad. It’s a tricky challenge, but to even play in this space it is essential to set, then meet or exceed expectations. Wolfram is doing this fairly well, Cuil did it badly.
Don’t enable spam. If there’s one evil Google is blatantly doing, it’s enabling spam. Spam blogs. Spam news sites. Spam landing pages. The sheer ability to create and profit from a hosted domain with nothing but uselessness is a horror Google has unleashed upon us, and is unfortunately financially motivated to continue to enable. I believe this is the biggest chink in Google’s armor, and there’s at least some opportunity for someone promising “spam-free search results.” Be the ball.

Now for the “what would I do” part…

Launch with a great vanity search. When I read people’s experiences with Cuil or Bing, they invariably mention the egosearch they did. It’s a fairly safe bet that most people will search for their own names within their first few searches, and this experience should rock. When I google myself, I find my blog, my LinkedIn profile, my Flickr page, and other things I’d expect to see. The first day I Cuil’ed myself, I saw nothing, due to a technical error. A very, very costly technical error. When I Bing myself, I see many of the “correct” results, but also 2 entries for FriendFeed, 1 for Ether, and some other seeming obscurities – and I don’t want obscure first. I genuinely believe a good vanity search experience is a make-or-break issue for a new search player.

Do more than search. Part of what makes Google such a great product is the way it encompasses everything from a calculator to a flight tracker to movie listings and more. These extra features make it stickier, and make me as a user less likely to leave. So you have to play a similar game and offer services well beyond search. Personally, I’d do this through some form of open API that enables third parties to do the work, which both reduces your need to innovate across the board and builds a richer community around your product.
Enable opt-out of purchasing (and more). Ever try searching for a product to find information, yet find nothing but sales links? Not only the Google product listings, but the first few pages tend to get dominated by companies who have mastered SEO to the point where they own virtually any search for a product. How about enabling the user to disable showing anything related to sales, giving them the benefit of the doubt that they just might want to do something other than a transaction? Now let’s take this further, and create a rich search experience that truly lets the searcher weed out the wrong kinds of results for the search they are conducting.
Understand context. Sure, Google’s great at surface-level context filtering. If you type in a stock ticker symbol, you get finance information. Search for a movie, you’ll see local playing times. But that’s about the end of the depth. Integrate services like Yahoo! Answers, Wikipedia, Wolfram, and more, and enable much deeper context about search. The biggest challenge people have with finding information is sifting through the data, so the more they are provided with usable results, the more they’ll like, trust, and rely on your service.
Deeply engage with the community. As far as I’ve seen, the forces behind other search engines have built their products, then invited people to come use them, and sometimes provide some feedback. I’d take a different approach and get the outside world involved in my product at an early stage, and keep it up well after the launch. Don’t confuse community engagement with a lack of product vision/direction, but take feedback from developers, advertisers, content creators, and other key sectors to help make sure you are building the right services all along the way. It’s important that any potential stakeholder (i.e., anyone whose efforts can directly contribute to the product’s success) has a mechanism for involvement.
Cross-program. I talk about this concept a lot, but you don’t counter CSI with another crime drama, you counter with a comedy or a medical romance or a quirky soap opera. Rather than be a different Google, be a different search concept. Go entirely visual. Go radically deep into data. Find information and surface it in early results. Use more smarts. Provide more customization. Etc.
Oh, and build a good spidering and indexing system, blah blah blah. It’s probably important too.
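
For the “opt-out of purchasing” idea above, here’s a rough sketch of what the user-facing filter could look like. Everything in it is made up for illustration: the `Result` type, the `tags` labels (which assume some classifier has already marked pages as “sales”, “reference”, and so on at index time), and the example URLs. The hard part, deciding what counts as a sales page, is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    """A single search hit. The fields are hypothetical, for illustration only."""
    url: str
    title: str
    # Assumed labels attached at index time, e.g. "sales", "reference".
    tags: set = field(default_factory=set)


def filter_results(results, excluded_tags):
    """Drop every result carrying an excluded tag, preserving ranking order."""
    excluded = set(excluded_tags)
    return [r for r in results if not (r.tags & excluded)]


# Made-up ranked results for a product query.
results = [
    Result("https://shop.example/widget", "Buy Widgets Cheap", {"sales"}),
    Result("https://wiki.example/widget", "Widget (history and design)", {"reference"}),
    Result("https://forum.example/t/42", "Widget teardown thread", {"discussion"}),
]

# The searcher ticks "hide anything related to sales".
informational = filter_results(results, {"sales"})
for r in informational:
    print(r.title)
```

The same function covers the “and more” part: pass a different set of excluded tags and the searcher weeds out whatever kind of result is wrong for that particular search.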

So there are the first few steps to get going, mystery super-stealth search company 3.0. Please get the ball rolling, we need you soon. No offense to Google or anything, I just feel the world should not be so utterly dependent on a single company for any one thing. Just pretend it’s a semi-decent Bond flick (100% Lazenby-free), and the bad guy (let’s give him a scar, but no weird accent) really controls the links to all the information we consume. Okay, that metaphor starts falling apart right about now, but I trust you to see it through to its natural end.

6 thoughts on “How I'd Build a Search Engine”

Sally says:

June 11, 2009 at 9:31 am

Cool! Thanks Jeremy

Jane says:

June 12, 2009 at 3:30 pm

right you are, there should be competition!

calli says:

June 23, 2009 at 10:24 am

But I like using different search engines….

Although admittedly I do tend to fall back on Google for the vast majority of superficial searches.

So no, I don’t want any one application to answer all my online wants. That’s why we have personal portals (federated searches) so that we can personalise and tailor our requirements. And because some aspects of certain engines/apps answer different queries….

http://searchwiki.wikispaces.com/

Kennboy1 says:

June 27, 2009 at 1:02 am

Right, there should be competition, because we can’t trust Google WITH ALL the info and the links we need to get there.

Cameron says:

June 30, 2009 at 7:44 pm

Interesting article. I have no doubt that any new search engine that does all these things would become a serious competitor to Google.

SpectateSwamp says:

December 10, 2009 at 2:41 pm

My custom desktop search does a lot more than just search. It can pick random videos and then a random start point and play for a preset number of seconds. I can have it show me the first 5 seconds of every video file I have (thousands).
It does pictures, music, and text, and saves my favorite URLs and my passwords; the results are put in the clipboard. It does not index, so there is no conflict with other searches. To speed the search for massive numbers of files, I first merge them into 1 huge file and search that, displaying the name of the originating file in the border when a match is found. I can be in and see my last notes and out of the program in under 3 seconds. It’s the only program I need.

LIVEdigitally
