On Wednesday night I had dinner at a burger joint with four old friends; two work in the intelligence community today on top-secret programs, and two others are technologists in the private sector who have done IC work for years. The five of us share a particular interest besides good burgers: semantic technology.
Oh, we talked about mobile phones (iPhones were whipped out, as was my Windows Phone, and apps were debated) and cloud storage (they were stunned that Microsoft gives 25 gigabytes of free cloud storage with free SkyDrive accounts, compared to the puny 2 gigs they’d been using on Dropbox).
But we kept returning to semantic web discussions, semantic approaches, semantic software. One of these guys goes back to the DAML days of DARPA fame, the guys on the government side are using semantic software operationally, and we all are firm believers in Our Glorious Semantic Future.
So today, while CNBC, the blogosphere, and Twitter were all abuzz with the Apple press conference where Steve Jobs apologized for iPhone 4 glitches and tried to mollify the fanboi mob, I was paying attention to the BIG tech news of the day flying under the radar. It was happening elsewhere on the San Francisco peninsula, just a couple miles away from Cupertino at the Googleplex, where the search giant announced its acquisition of the semantic startup Metaweb.
Here’s a quick and friendly video by Metaweb explaining their approach to representing semantic meaning:
Let’s think about what’s going on here, a little more deeply. Metaweb is based on the open-data Freebase, a partially crowd-sourced repository of entities, with disambiguation and relationships. The Google explanation published today of why they bought Metaweb, and what they plan to do with it, has a particular ring to it:
…we’re just beginning to apply our understanding of the web to make search better. Type [barack obama birthday] in the search box and see the answer right at the top of the page. Or search for [events in San Jose] and see a list of specific events and dates. We can offer this kind of experience because we understand facts about real people and real events out in the world. But what about [colleges on the west coast with tuition under $30,000] or [actors over 40 who have won at least one oscar]? These are hard questions, and we’ve acquired Metaweb because we believe working together we’ll be able to provide better answers. - The Official Google Blog, 7/16/2010
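To make the difference between keyword matching and entity-based answers concrete, here is a toy sketch in Python. It is not Metaweb's or Google's actual API or data model; the `Entity` and `EntityStore` classes, the property names, and the people in the demo data are all hypothetical, invented purely for illustration. The idea it shows: once each thing has its own unique ID, a type, and structured properties, two different people with the same name stop colliding, and a question like "actors over 40 who have won at least one Oscar" becomes a filter over facts rather than a hunt for matching words.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """One real-world thing, identified by an opaque unique ID."""
    id: str                      # unique identifier, independent of the name
    name: str                    # human-readable label (may be ambiguous)
    types: set                   # e.g. {"person", "actor"}
    props: dict = field(default_factory=dict)  # structured facts


class EntityStore:
    """A minimal in-memory entity repository (hypothetical, for illustration)."""

    def __init__(self):
        self._by_id = {}

    def add(self, entity):
        self._by_id[entity.id] = entity

    def lookup(self, name, type_=None):
        """Return every entity with this name, optionally narrowed by type."""
        return [
            e for e in self._by_id.values()
            if e.name == name and (type_ is None or type_ in e.types)
        ]

    def query(self, type_, predicate):
        """Return entities of a given type whose facts satisfy the predicate."""
        return [
            e for e in self._by_id.values()
            if type_ in e.types and predicate(e)
        ]


def build_demo_store():
    """Populate a store with made-up entities (none of these are real people)."""
    store = EntityStore()
    store.add(Entity("e1", "Alex Rivers", {"person", "actor"}, {"age": 52, "oscars": 1}))
    store.add(Entity("e2", "Alex Rivers", {"person", "musician"}, {"age": 29}))
    store.add(Entity("e3", "Sam Chen", {"person", "actor"}, {"age": 35, "oscars": 2}))
    store.add(Entity("e4", "Pat Doyle", {"person", "actor"}, {"age": 61, "oscars": 0}))
    return store


if __name__ == "__main__":
    store = build_demo_store()

    # The same name maps to two distinct entities; the type disambiguates.
    for e in store.lookup("Alex Rivers"):
        print(e.id, sorted(e.types))

    # "Actors over 40 who have won at least one Oscar" as a structured query.
    hits = store.query(
        "actor",
        lambda e: e.props.get("age", 0) > 40 and e.props.get("oscars", 0) >= 1,
    )
    print([e.name for e in hits])
```

A keyword index would see the string "Alex Rivers" and nothing more; the store above knows there are two of them and which one is the actor. A production system would obviously need far more (relationships between entities, provenance, scale), which this sketch leaves out.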
Now let’s look back at what Microsoft’s Bing team wrote when that “decision-engine” launched in June 2009, where we find this approach:
…we shifted our thinking a bit and moved from single queries to complex sessions, task accomplishment and decision making as organizing principles…. In Bing we took a novel approach for organizing our search results. Instead of applying simple classification techniques, we constructed user query and click graphs and used them to build true interaction models that can represent complex user tasks. This has allowed us to adapt from general to intent-specific ranking and to organize results into sets of topics that can be used to help find information, make decisions and complete tasks. We also invested in technologies and algorithms for extracting structure from unstructured data and applying organizational taxonomies…. We have also enriched our index by developing technologies in HTML parsing, core Natural Language Processing, entity extraction, and document classification. – The Bing Search Blog, 6/01/2009
So Google is following in good footsteps with the “complex decision” approach. Both Bing and Google are building on three decades (or more!) of work, of course, in semantic software and algorithms. Nothing’s all that revolutionary; the “barcode” analogy which Metaweb uses in that video for delineating individual entities and their disambiguated meaning was one which others and I pioneered a decade ago at the San Francisco startup H5 Technologies.
What is new is the global web-scale war now gearing up between the major computing powers Google and Microsoft: a race for performance and innovation in semantically-enabled decision support.
Others are still involved in this battle (WolframAlpha for example), but I would contend that these two giants now represent the central front. Google’s approach until now had been unclear, given their traditional reliance on Boolean keyword search, despite several minor feints in a semantic direction. But now there’s more clarity to the vision. The Metaweb acquisition is likely to have been expensive (the startup had raised at least $57 million in venture funding in the past three years according to Mashable.com, giving them a significant valuation), so the Goog is making a definitive bet.
Meanwhile, Microsoft has been busy putting our strong existing software stack and our shift to cloud services to good use, since we can embed semantic techniques and power into both. We continue to build upon Powerset technology for example (I first wrote about that back in 2008). And it was just six months ago – but seems like years – that I wrote about our work with our Microsoft Semantic Engine, its REST API layer, and its pluggable services, including inference engines, ontology and taxonomy managers, and image and text processors. The vision isn’t fully fleshed out yet – perhaps it needn’t be – yet there’s a lot of exciting work going on.
With the Metaweb acquisition, I see the battlefront more clearly, and I gotta say it’s fun to watch. The real winners: software users, “the rest of us.” As these platforms develop we’ll benefit from increasingly powerful, increasingly knowledge-based, computationally-powered systems/ phones/ cars/ offices/ houses/ walls/ toasters… with seemingly human ability to understand and anticipate our intent. Turns out, there’s money in that… and thus the semantic war quietly advances.