
LIVEdigitally

How I'd Build a Search Engine

Posted on June 11, 2009 by Jeremy Toeman

Over the past year, we’ve seen no fewer than three major attempts to compete with Google come to market, with mixed results.  Cuil’s disastrous launch last summer all but killed its chances of success.  Wolfram is in murky waters trying to remain distinctly not a search engine. And Microsoft is presently trying to buy its way onto the scene with Bing. I think there’s little-to-no debate that we should not let our entire worldview and information dissemination be in the hands of a single company (a recent study showed over 70% US market share – that’s just plain dangerous).

But how does one out-Google Google?  I’ll start with the “don’t”s, the things that should be avoided.

  • Bigger ain’t better. Nobody, and I mean nobody, cares that your search index is 17.37835% deeper than Google’s.  Search is at worst a semi-solved problem, which means the average person looking for something online probably perceives Google will have it.
  • Don’t confuse simple with simplistic. If you didn’t know, Google’s design is as simplistic as it is because they didn’t really know HTML when they started!  For some reason, this has caused legions of lookalikes, despite plenty of better examples of usability design.
  • Overhype can be bad. It’s a tricky challenge, but to even play in this space it is essential to set, then meet or exceed expectations.  Wolfram is doing this fairly well, Cuil did it badly.
  • Don’t enable spam. If there’s one evil Google is blatantly doing, it’s enabling spam.  Spam blogs. Spam news sites. Spam landing pages. The sheer ability to create and profit from a hosted domain with nothing but uselessness is a horror Google has unleashed upon us, and is unfortunately financially motivated to continue to enable.  I believe this is the weakest chink in Google’s armor, and there’s at least some opportunity for someone promising “spam-free search results.”  Be the ball.

Now for the “what would I do” part…

  • Launch with a great vanity search. When I read people’s experiences with Cuil or Bing, they invariably mention the egosearch they did.  It’s a fairly safe bet that most people will search for their own names within their first few searches, and this experience should rock.  When I google myself, I find my blog, my LinkedIn profile, my Flickr page, and other things I’d expect to see.  The first day I Cuil’ed myself, I saw nothing, due to a technical error.  A very very costly technical error.  When I Bing myself, I see many of the “correct” results, but 2 entries for FriendFeed, 1 for Ether, and some other seeming obscurities – and I don’t want obscure first.  I genuinely believe a good vanity search experience is a make/break issue for a new search player.
  • Do more than search. Part of what makes Google such a great product is the way it encompasses everything from a calculator to a flight tracker to movie listings and more.  These extra features make it stickier, and make me as a user less likely to leave.  So you have to play a similar game and offer services well beyond search.  Personally, I’d do this through some form of open API that enables third parties to do the work, thus reducing your need to innovate across the board while building a richer community around your product.
  • Enable opt-out of purchasing (and more). Ever try searching for a product to find information, yet find nothing but sales links?  Not only the Google product listings, but the first few pages tend to get dominated by companies who have mastered SEO to the point where they own virtually any search for a product.  How about enabling the user to disable showing anything related to sales, giving them the benefit of the doubt that they just might want to do something other than a transaction?  Now let’s take this further, and create a rich search experience that truly lets the searcher weed out the wrong kinds of results for the search they are conducting.
  • Understand context. Sure, Google’s great at surface-level context filtering. If you type in a stock ticker symbol, you get finance information. Search for a movie, you’ll see local playing times.  But that’s about the end of the depth.  Integrate services like Yahoo! Answers, Wikipedia, Wolfram, and more, and enable much deeper context about search.  The biggest challenge people have with finding information is sifting through the data, so the more they are provided with usable results, the more they’ll like, trust, and rely on your service.
  • Deeply engage with the community. As far as I’ve seen, the forces behind other search engines have built their products, then invited people to come use them, and sometimes provide some feedback.  I’d take a different approach and get the outside world involved in my product at an early stage, and keep it up well after the launch.  Don’t confuse community engagement with a lack of product vision/direction, but take feedback from developers, advertisers, content creators, and other key sectors to help make sure you are building the right services all along the way.  It’s important that any potential stakeholder (i.e. those whose efforts can directly contribute to the product’s success) have a mechanism for involvement.
  • Cross-program. I talk about this concept a lot, but you don’t counter CSI with another crime drama, you counter with a comedy or a medical romance or a quirky soap opera.  Rather than be a different Google, be a different search concept.  Go entirely visual.  Go radically deep into data.  Find information and surface it in early results.  Use more smarts.  Provide more customization.  Etc.
  • Oh, and build a good spidering and indexing system, blah blah blah.  It’s probably important too.
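
To make the “opt-out of purchasing” idea a little more concrete, here’s a minimal sketch of what the filtering step might look like. Everything in it is an assumption for illustration: the `Result` record, the `kind` labels, and the `filter_results` helper are all hypothetical, and the hard part (deciding that a page is a store rather than a review) is assumed to be handled by some upstream classifier that isn’t shown.

```python
from dataclasses import dataclass


# Hypothetical result record. The "kind" label is assumed to come from an
# upstream classifier that is out of scope for this sketch.
@dataclass(frozen=True)
class Result:
    title: str
    url: str
    kind: str  # e.g. "store", "review", "forum", "reference"


def filter_results(results, excluded_kinds):
    """Return results in their original ranking order, minus excluded kinds."""
    excluded = set(excluded_kinds)
    return [r for r in results if r.kind not in excluded]


# Made-up sample data for illustration only.
results = [
    Result("Buy the Widget 3000 now", "https://shop.example/widget", "store"),
    Result("Widget 3000 hands-on review", "https://blog.example/widget", "review"),
    Result("Widget 3000 owners' forum", "https://forum.example/widget", "forum"),
]

# The searcher opts out of anything sales-related.
for r in filter_results(results, {"store"}):
    print(r.kind, r.title)
```

The same set of excluded kinds could just as easily hold “forum” or “news”, which is the “and more” part: the searcher, not the engine, decides which kinds of results are wrong for this particular search.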

So there are the first few steps to get going, mystery super-stealth search company 3.0.  Please get the ball rolling, we need you soon.  No offense to Google or anything, I just feel the world should not be so utterly dependent on a single company for any one thing.  Just pretend it’s a semi-decent Bond flick (100% Lazenby-free), and the bad guy (let’s give him a scar, but no weird accent) really controls the links to all the information we consume.  Okay, that metaphor starts falling apart right about now, but I trust you to see it through to its natural end.


Posted in Web/Internet | Tags: google, search | 6 Comments

6 thoughts on “How I'd Build a Search Engine”

  1. Sally says:
    June 11, 2009 at 9:31 am

    Cool! Thanks Jeremy

  2. Jane says:
    June 12, 2009 at 3:30 pm

    right you are, there should be competition!

  3. calli says:
    June 23, 2009 at 10:24 am

    But I like using different search engines….

    Although admittedly I do tend to fall back on Google for the vast majority of superficial searches.

    So no, I don’t want any one application to answer all my online wants. That’s why we have personal portals (federated searches), so that we can personalise and tailor our requirements. And because some aspects of certain engines/apps answer different queries….

    http://searchwiki.wikispaces.com/

  4. Kennboy1 says:
    June 27, 2009 at 1:02 am

    Right, there should be competition because we can’t trust Google WITH ALL the info and the links we need to get there.

  5. Cameron says:
    June 30, 2009 at 7:44 pm

    Interesting article. I have no doubt that any new search engine that does all these things would become a serious competitor to Google.

  6. SpectateSwamp says:
    December 10, 2009 at 2:41 pm

    My custom desktop search does a lot more than just search. It can pick random videos and then a random start point and play for a preset number of seconds. I can have it show me the first 5 seconds of every video file I have (thousands).
    It does pictures, music, and text, and saves my favorite URLs and my passwords; the results are put in the clipboard. It does not index, so there is no conflict with other searches. To speed up the search across massive numbers of files, I first merge them into 1 huge file and search that, displaying the name of the originating file in the border when a match is found. I can be in, see my last notes, and be out of the program in under 3 seconds. It’s the only program I need.



About

Jeremy Toeman is a seasoned Product leader with over 20 years experience in the convergence of digital media, mobile entertainment, social entertainment, smart TV and consumer technology. Prior ventures and projects include CNET, Viggle/Dijit/Nextguide, Sling Media, VUDU, Clicker, DivX, Rovi, Mediabolic, Boxee, and many other consumer technology companies. This blog represents his personal opinion and outlook on things.
