On Wednesday night I had dinner at a burger joint with four old friends; two work in the intelligence community today on top-secret programs, and two others are technologists in the private sector who have done IC work for years. The five of us share a particular interest besides good burgers: semantic technology.
Oh, we talked about mobile phones (iPhones were whipped out as was my Windows Phone, and apps debated) and cloud storage (they were stunned that Microsoft gives 25 gigabytes of free cloud storage with free Skydrive accounts, compared to the puny 2 gig they’d been using on DropBox).
But we kept returning to semantic web discussions, semantic approaches, semantic software. One of these guys goes back to the DAML days of DARPA fame, the guys on the government side are using semantic software operationally, and we all are firm believers in Our Glorious Semantic Future.
So today, while CNBC, the blogosphere, and Twitter were all abuzz with the Apple press conference where Steve Jobs apologized for iPhone 4 glitches and tried to mollify the fanboi mob, I was paying attention to the BIG tech news of the day flying under the radar. It was happening elsewhere on the San Francisco peninsula, just a couple miles away from Cupertino at the Googleplex, where the search giant announced their acquisition of the semantic startup Metaweb.
Here’s a quick and friendly video by Metaweb explaining their approach to representing semantic meaning:
Let’s think about what’s going on here, a little more deeply. Metaweb is based on the open-data Freebase, a partially crowd-sourced repository of entities, with disambiguation and relationships. The Google explanation published today of why they bought Metaweb, and what they plan to do with it, has a particular ring to it:
…we’re just beginning to apply our understanding of the web to make search better. Type [barack obama birthday] in the search box and see the answer right at the top of the page. Or search for [events in San Jose] and see a list of specific events and dates. We can offer this kind of experience because we understand facts about real people and real events out in the world. But what about [colleges on the west coast with tuition under $30,000] or [actors over 40 who have won at least one oscar]? These are hard questions, and we’ve acquired Metaweb because we believe working together we’ll be able to provide better answers. – The Official Google Blog, 7/16/2010
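To make the "entities, with disambiguation and relationships" idea concrete, here is a minimal sketch in Python. It is not Metaweb's or Google's implementation: the `Entity` class, the `/e/...` identifiers, the `lookup` and `query` helpers, and all of the sample data (including the placeholder actors and their ages) are invented for illustration. It only shows why a store of typed entities with properties can answer a question like [actors over 40 who have won at least one oscar], where keyword matching cannot.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    id: str      # unique identifier (hypothetical, not a real Freebase ID)
    name: str    # human-readable name; may be shared by several entities
    types: set   # e.g. {"city"} or {"actor"}
    props: dict = field(default_factory=dict)


# Invented sample data, for illustration only.
ENTITIES = [
    Entity("/e/1", "Boston", {"city"}, {"state": "Massachusetts"}),
    Entity("/e/2", "Boston", {"band"}, {"genre": "rock"}),
    Entity("/e/3", "Actor A", {"actor"}, {"age": 52, "oscars": 2}),
    Entity("/e/4", "Actor B", {"actor"}, {"age": 35, "oscars": 1}),
    Entity("/e/5", "Actor C", {"actor"}, {"age": 61, "oscars": 0}),
]


def lookup(name):
    """Return every entity sharing a name; the id or type disambiguates them."""
    return [e for e in ENTITIES if e.name == name]


def query(type_, **conditions):
    """Return entities of the given type whose properties satisfy every predicate."""
    results = []
    for e in ENTITIES:
        if type_ not in e.types:
            continue
        if all(key in e.props and pred(e.props[key])
               for key, pred in conditions.items()):
            results.append(e)
    return results


# "Boston" is ambiguous as a string but not as an entity id.
bostons = lookup("Boston")

# A structured version of [actors over 40 who have won at least one oscar].
winners_over_40 = query("actor", age=lambda a: a > 40, oscars=lambda n: n >= 1)
```

The point of the sketch is the design choice, not the code: once "Boston the city" and "Boston the band" are separate records with their own properties, a query becomes a filter over facts rather than a match over words.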
Now let’s look back at what Microsoft’s Bing team wrote when that “decision-engine” launched in June 2009, where we find this approach:
…we shifted our thinking a bit and moved from single queries to complex sessions, task accomplishment and decision making as organizing principles…. In Bing we took a novel approach for organizing our search results. Instead of applying simple classification techniques, we constructed user query and click graphs and used them to build true interaction models that can represent complex user tasks. This has allowed us to adapt from general to intent-specific ranking and to organize results into sets of topics that can be used to help find information, make decisions and complete tasks. We also invested in technologies and algorithms for extracting structure from unstructured data and applying organizational taxonomies…. We have also enriched our index by developing technologies in HTML parsing, core Natural Language Processing, entity extraction, and document classification. – The Bing Search Blog, 6/01/2009
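The "user query and click graphs" mentioned in that passage can be pictured with a small Python sketch. This is only a guess at the general shape of the idea, not Bing's actual method: the click log, the `example.com` URLs, and the `build_click_graph` and `related_queries` helpers are all hypothetical. It builds a two-sided graph linking queries to the pages users clicked, then treats two queries as related when they led to the same page.

```python
from collections import defaultdict

# Invented click log of (query, clicked URL) pairs, for illustration only.
CLICK_LOG = [
    ("cheap flights boston", "https://example.com/flights"),
    ("boston airfare deals", "https://example.com/flights"),
    ("boston airfare deals", "https://example.com/hotels"),
    ("boston band albums", "https://example.com/music"),
]


def build_click_graph(log):
    """Build both sides of a bipartite query/URL graph from a click log."""
    query_to_urls = defaultdict(set)
    url_to_queries = defaultdict(set)
    for query, url in log:
        query_to_urls[query].add(url)
        url_to_queries[url].add(query)
    return query_to_urls, url_to_queries


def related_queries(query, query_to_urls, url_to_queries):
    """Return other queries that share at least one clicked URL with `query`."""
    related = set()
    for url in query_to_urls.get(query, ()):
        related |= url_to_queries[url]
    related.discard(query)
    return related


q2u, u2q = build_click_graph(CLICK_LOG)
flight_neighbors = related_queries("cheap flights boston", q2u, u2q)
band_neighbors = related_queries("boston band albums", q2u, u2q)
```

Even in this toy form, the two travel queries end up connected while the query about the band stays apart, which is a crude version of grouping queries by intent rather than by shared keywords.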
So Google is following in good footsteps with the “complex decision” approach. Both Bing and Google are building on three decades (or more!) of work, of course, in semantic software and algorithms. Nothing’s all that revolutionary; the “barcode” analogy which Metaweb uses in that video for delineating individual entities and their disambiguated meaning was one which I and others innovated a decade ago, with the San Francisco startup H5 Technologies.
What is new is the global, web-scale war now gearing up between major computing powers Google and Microsoft, in a race for performance and innovation in semantically-enabled decision support.
Others are still involved in this battle (WolframAlpha for example), but I would contend that these two giants now represent the central front. Google’s approach until now had been unclear, given their traditional reliance on keyword boolean search, despite several minor feints in a semantic direction. But now there’s more clarity to the vision. The Metaweb acquisition is likely to have been expensive (the startup had raised at least $57 million in venture funding in the past three years according to Mashable.com, giving them a significant valuation), so the Goog is making a definitive bet.
Meanwhile, Microsoft has been busy taking advantage of our existing strong software stack and our shift to cloud services, into which we can embed semantic techniques and power. We continue to build upon Powerset technology, for example (I first wrote about that back in 2008). And it was just six months ago – but seems like years – that I wrote about our work with our Microsoft Semantic Engine: its REST API layer and its pluggable services, including inference engines, ontology and taxonomy managers, and image and text processors. The vision isn’t fully fleshed out yet – perhaps it needn’t be – yet there’s a lot of exciting work going on.
With the Metaweb acquisition, I see the battlefront more clearly, and I gotta say it’s fun to watch. The real winners: software users, “the rest of us.” As these platforms develop we’ll benefit from increasingly powerful, increasingly knowledge-based, computationally-powered systems/ phones/ cars/ offices/ houses/ walls/ toasters… with seemingly human ability to understand and anticipate our intent. Turns out, there’s money in that… and thus the semantic war quietly advances.
Filed under: Technology | Tagged: acquisition, ai, Apple, Bing, cloud, cloud computing, CNBC, computer, DAML, DARPA, decision engine, DropBox, entity-extraction, Freebase, Google, Googleplex, H5, IC, Intelligence, Intelligence Community, iPhone, M&A, Mashable, Metaweb, Microsoft Semantic Engine, MSE, REST, Reston, San Francisco, San Jose, search, search engine, semantic, semantic web, Skydrive, SQL, SQL Server, Steve Jobs, Twitter, VC, web, Web30, web40, Windows Phone |
I liked my short stay in Reston while I was there for CACI training. Lots of great places to eat.
I am even more interested in this acquisition. I have also been wondering what the major giants of search were going to do with the “semantic web”. It seems like very good competition, in which all are winners in their own ways, and all for our benefit.
I also think perhaps that their vision is somewhat cloudy and unformed due to the complexity of the subject and the uncertainty about its future. It’s a very elusive tech to try to capture and put to good use, but I do agree it’s movement towards aiding,
Furthermore, it will help other software developers, like me, put the various APIs being developed to their own use. Who knows what will spin off from these things.
Thanks for sharing and writing about this topic. It’s always nice to hear from someone in the center of this world who understands the topic and can explain it clearly.
Thanks very much, Ross. I agree with your stress on the API and platformish aspects, as important pieces of all this. I’m pretty sure we can’t even envision the uses that will be made of embedded semantic techniques and technologies – if the mashups and coordinated web-services of the Web 2.0 world are any guide, it’s going to be wild and wonderful.
thanks again – lewis
Nice post, Lewis. Rumsfeld’s world (“There are known knowns. There are things we know we know. We also know there are known unknowns. That is to say, we know there are some things we do not know. But there are also unknown unknowns, the ones we don’t know we don’t know.”) comes to mind. Some of the known unknowns and even the unknown unknowns have to do with context. More and more of that context is becoming explicit via graphs. With more of that context, we will have access to better filtering. Without it, the amount of noise often overwhelms the amount of signal.
So owning the graph that becomes explicit through human interaction will loom as large as owning repositories and the graphs we can deduce or infer from them.
I believe you’re correct, Alan, about the importance of the social/commercial graphs to be derived. Build the platforms and the analytics will come…
Fascinating article, Lewis. Would you say that Microsoft’s acquisition of FAST is comparable to Google’s acquisition of MetaWeb?
Thanks Jeff – we aim to please. I’d actually say the more direct comparison is to Microsoft’s 2008 purchase of Powerset. That’s already been proving quite useful to Bing….
[…] check out this opinion from a Microsofter with ties to the IC, Lewis Shepherd. He calls it the “semantic […]
Hey Lew: Interesting stuff…..the video was good, but your positive outlook on Bing’s competition puts my mind at rest….also your reference to the work you did at H5 was interesting…..you’ve had an interesting background and good training for what you are able to accomplish for Microsoft…..keep it up, the folks seem to like what you write! All the best!
Homing in on two words: you place Bing’s self-description as a “decision engine” in quotation marks, as I am doing here. Why do you put it in quotes? Because Bing says it that way?
Hi Ari – yes, I put it in quotation marks just because the phrase is still finding its way as a separate category from search engines. In the vernacular, most people still think of and label Bing as “search,” even though almost no one would label WolframAlpha that. So, while the hybrid category develops, I thought I’d reference the nascency. Thanks for the question -lewis
Nice post Lewis. Sorry that I am slow to it – I was on vacation. 😉
My first thought when I heard about this acquisition was that Google Squared could actually become useful.
I’m looking forward to both Bing and Google showing more of what they can do in this space.
good post
Blog post subbed, interesting blog post.
“Smart stuff, I look forward to reading more.”
[…] MetaWeb. A multi-million-dollar deal! Industry insiders are already talking about a “semantic war” with Microsoft. Exaggerated? It is certainly a tough contest. The same applies in […]