Bing vs Google, the quiet semantic war

On Wednesday night I had dinner at a burger joint with four old friends; two work in the intelligence community today on top-secret programs, and two others are technologists in the private sector who have done IC work for years. The five of us share a particular interest besides good burgers: semantic technology.

Oh, we talked about mobile phones (iPhones were whipped out as was my Windows Phone, and apps debated) and cloud storage (they were stunned that Microsoft gives 25 gigabytes of free cloud storage with free SkyDrive accounts, compared to the puny 2 gigs they’d been using on Dropbox).

But we kept returning to semantic web discussions, semantic approaches, semantic software. One of these guys goes back to the DAML days of DARPA fame, the guys on the government side are using semantic software operationally, and we all are firm believers in Our Glorious Semantic Future.

So today, while CNBC, the blogosphere, and Twitter were all abuzz with the Apple press conference where Steve Jobs apologized for iPhone 4 glitches and tried to mollify the fanboi mob, I was paying attention to the BIG tech news of the day flying under the radar. It was happening elsewhere on the San Francisco peninsula, just a couple miles away from Cupertino at the Googleplex, where the search giant announced its acquisition of the semantic startup Metaweb.

Here’s a quick and friendly video by Metaweb explaining their approach to representing semantic meaning:

Let’s think about what’s going on here, a little more deeply.  Metaweb is based on the open-data Freebase, a partially crowd-sourced repository of entities, with disambiguation and relationships. The Google explanation published today of why they bought Metaweb, and what they plan to do with it, has a particular ring to it:

…we’re just beginning to apply our understanding of the web to make search better. Type [barack obama birthday] in the search box and see the answer right at the top of the page. Or search for [events in San Jose] and see a list of specific events and dates. We can offer this kind of experience because we understand facts about real people and real events out in the world. But what about [colleges on the west coast with tuition under $30,000] or [actors over 40 who have won at least one oscar]? These are hard questions, and we’ve acquired Metaweb because we believe working together we’ll be able to provide better answers. – The Official Google Blog, 7/16/2010

Now let’s look back at what Microsoft’s Bing team wrote when that “decision-engine” launched in June 2009, where we find this approach:

…we shifted our thinking a bit and moved from single queries to complex sessions, task accomplishment and decision making as organizing principles…. In Bing we took a novel approach for organizing our search results. Instead of applying simple classification techniques, we constructed user query and click graphs and used them to build true interaction models that can represent complex user tasks. This has allowed us to adapt from general to intent-specific ranking and to organize results into sets of topics that can be used to help find information, make decisions and complete tasks. We also invested in technologies and algorithms for extracting structure from unstructured data and applying organizational taxonomies…. We have also enriched our index by developing technologies in HTML parsing, core Natural Language Processing, entity extraction, and document classification. – The Bing Search Blog, 6/01/2009

So Google is following in good footsteps with the “complex decision” approach. Both Bing and Google are building on three decades (or more!) of work, of course, in semantic software and algorithms.  Nothing’s all that revolutionary; the “barcode” analogy which Metaweb uses in that video for delineating individual entities and their disambiguated meaning was one which I and others innovated a decade ago, with the San Francisco startup H5 Technologies.
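To make the “barcode” idea a bit more concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the entity names, the `ent:` identifiers, the property names, and the helper functions are all made up, and none of it reflects how Freebase, Google, or Bing actually store or query data. It only shows why giving each entity its own identifier and typed properties turns a “hard question” like [actors over 40 who have won at least one oscar] into a simple filter.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """One disambiguated 'thing'; all values below are invented examples."""
    id: str                      # the "barcode": a unique, opaque identifier
    name: str                    # human-readable label, which may be ambiguous
    types: set                   # what kind of thing this is
    props: dict = field(default_factory=dict)


# A tiny hypothetical repository. Two entities share the same name on purpose.
ENTITIES = [
    Entity("ent:0001", "Jordan Rivers", {"person", "actor"}, {"age": 52, "oscars_won": 2}),
    Entity("ent:0002", "Jordan Rivers", {"place", "river"}, {"length_km": 340}),
    Entity("ent:0003", "Casey Lane", {"person", "actor"}, {"age": 35, "oscars_won": 1}),
    Entity("ent:0004", "Morgan Hale", {"person", "actor"}, {"age": 61, "oscars_won": 0}),
]


def lookup(name):
    """Return every entity carrying this label; the ids tell them apart."""
    return [e for e in ENTITIES if e.name == name]


def actors_over_40_with_oscar(entities):
    """Answer the structured question by filtering on type and properties."""
    return [
        e.name
        for e in entities
        if "actor" in e.types
        and e.props.get("age", 0) > 40
        and e.props.get("oscars_won", 0) >= 1
    ]


if __name__ == "__main__":
    for e in lookup("Jordan Rivers"):
        print(e.id, sorted(e.types))
    print(actors_over_40_with_oscar(ENTITIES))
```

A plain keyword match on “Jordan Rivers” cannot tell the actor from the river; the identifier and type information can, and the same structure is what makes the filtered query possible.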

What is new is the global web-scale war now gearing up between major computing powers Google and Microsoft, in a race for performance and innovation in semantically-enabled decision support.

Others are still involved in this battle (WolframAlpha for example), but I would contend that these two giants now represent the central front. Google’s approach until now had been unclear, given their traditional reliance on keyword Boolean search, despite several minor feints in a semantic direction. But now there’s more clarity to the vision. The Metaweb acquisition is likely to have been expensive (the startup had raised at least $57 million in venture funding in the past three years according to Mashable.com, giving them a significant valuation), so the Goog is making a definitive bet.

Meanwhile, Microsoft has been busy taking advantage of our existing strong software stack and our shift to cloud services, into which we can embed semantic techniques and power. We continue to build upon Powerset technology for example (I first wrote about that back in 2008). And it was just six months ago – but seems like years – that I wrote about our work with our Microsoft Semantic Engine, its REST API layer, and the pluggable services including inference engines, ontology and taxonomy managers, and image and text processors. The vision isn’t fully fleshed out yet – perhaps it needn’t be – yet there’s a lot of exciting work going on.

With the Metaweb acquisition, I see the battlefront more clearly, and I gotta say it’s fun to watch. The real winners: software users, “the rest of us.” As these platforms develop we’ll benefit from increasingly powerful, increasingly knowledge-based, computationally-powered systems/ phones/ cars/ offices/ houses/ walls/ toasters… with seemingly human ability to understand and anticipate our intent. Turns out, there’s money in that… and thus the semantic war quietly advances.

17 Responses

  1. I liked my short stay in Reston while I was there for CACI training. Lots of great places to eat.

    I am even more interested in this acquisition. I have also been wondering what the major giants of search were going to do with the “semantic web”. Very good competition it seems of which all are winners in their own ways, and all for our benefit.

    I also think perhaps that their vision is somewhat cloudy and unformed due to the complexity of the subject, and the uncertainty on its future. It’s a very elusive tech to try and capture and put to good use, but I do agree it’s movement towards aiding,

    Furthermore, it will help other software developers, like me, put the various APIs being developed to their own use. Who knows what will spin off of these things.

    Thanks for sharing and writing about this topic. It’s always nice to hear from someone in the center of this world who understands the topic and can dictate clearly.

    • Thanks very much, Ross. I agree with your stress on the API and platformish aspects, as important pieces of all this. I’m pretty sure we can’t even envision the uses that will be made of embedded semantic techniques and technologies – if the mashups and coordinated web-services of the Web 2.0 world are any guide, it’s going to be wild and wonderful.
      thanks again – lewis

  2. Nice post, Lewis. Rumsfeld’s world (“There are known knowns. There are things we know we know. We also know there are known unknowns. That is to say we know there are some things we do not know. But there are also unknown unknowns, the ones we don’t know we don’t know.”) comes to mind. Some of the known unknowns and even the unknown unknowns have to do with context. More and more of that context is becoming explicit via graphs. With more of that context, we will have access to better filtering. Without it, the amount of noise often overwhelms the amount of signal.

    So owning the graph that becomes explicit through human interaction will loom as large as owning repositories and the graphs we can deduce or infer from them.

  3. Fascinating article, Lewis. Would you say that Microsoft’s acquisition of FAST is comparable to Google’s acquisition of MetaWeb?

    • Thanks Jeff – we aim to please. I’d actually say the more direct comparison is to Microsoft’s 2008 purchase of Powerset. That’s already been proving quite useful to Bing….

  4. […] This post was Twitted by TechnologyTeam […]

  5. […] check out this opinion from a Microsofter with ties to the IC, Lewis Shepherd.  He calls it the “semantic […]

  6. […] This post was Twitted by chickfoxgrover […]

  7. Hey Lew: Interesting stuff…..the video was good, but your positive outlook on Bing’s competition puts my mind at rest….also your reference to the work you did at H5 was interesting…..you’ve had an interesting background and good training for what you are able to accomplish for Microsoft…..keep it up, the folks seem to like what you write! All the best!

  8. Honing in on two words, you place Bing’s self-description as a “decision engine” in quotation marks like I am here. Why do you put it in quotes? Because Bing says it that way?

  9. Hi Ari – yes, I put it in quotation marks just because the phrase is still finding its way as a separate category from search engines. In the vernacular, most people still think of and label Bing as “search,” even though almost no one would label WolframAlpha that. So, while the hybrid category develops, I thought I’d reference the nascency. Thanks for the question -lewis

  10. Nice post Lewis. Sorry that I am slow to it – I was on vacation. 😉

    My first thought when I heard about this acquisition was that Google Squared could actually become useful.

    I’m looking forward to both Bing and Google showing more of what they can do in this space.

  11. good post

  12. Blog post subbed, interesting blog post.

  13. “Smart stuff, I look forward to reading more.”

  14. […] MetaWeb. A multi-million-dollar deal! Industry insiders are already talking about a “semantic war” with Microsoft. Exaggerated? It is certainly a tough contest. The same applies in […]
