Bullshit Detector Prototype Goes Live

I like writing about cool applications of technology that are so pregnant with the promise of the future that they have to be seen to be believed, and here’s another one that’s almost ready for prime time.

The Washington Post today launched an exciting new prototype that brings powerful new technologies to bear on journalism and democratic accountability in politics and government. As you can see from the screenshot (left), it runs an automated fact-checking algorithm against streaming video of politicians or other talking heads and displays a “True” or “False” label in real time as they’re speaking.

Called “Truth Teller,” the system uses technologies from Microsoft Research and Windows Azure cloud-computing services (I have included some of the technical details below).

But first, a digression on motivation. Back in the late 1970s I was living in Europe and was very taken with punk rock. Among my favorite bands was the UK’s anarcho-punk collective Crass, and in 1980 I bought their compilation LP “Bullshit Detector,” whose title certainly appealed to me because of my equally avid interest in politics 🙂

Today, my driving interests are in the use of novel or increasingly powerful technologies for the public good, by government agencies or in the effort to improve the performance of government functions. Because of my Jeffersonian tendencies (I did after all take a degree in Government at Mr. Jefferson’s University of Virginia), I am even more interested in improving government accountability and popular control over the political process itself, and I’ve written or spoken often about the “Government 2.0” movement.

In an interview with GovFresh several years ago, I was asked: “What’s the killer app that will make Gov 2.0 the norm instead of the exception?”

My answer then looked to systems that might “maintain the representative aspect (the elected official, exercising his or her judgment) while incorporating real-time, structured, unfiltered but managed visualizations of popular opinion and advice… I’m also a big proponent of semantic computing – called Web 3.0 by some – and that should lead the worlds of crowdsourcing, prediction markets, and open government data movements to unfold in dramatic, previously unexpected ways. We’re working on cool stuff like that.”

The Truth Teller prototype is an attempt to construct a rudimentary automated Political Bullshit Detector, and addresses each of those factors I mentioned in GovFresh – recognizing the importance of political leadership and its public communication, incorporating iterative aspects of public opinion and crowd wisdom, all while imbuing automated systems with semantic sense-making technology to operate at the speed of today’s real world.

Real-time politics? Real-time truth detection.  Or at least that’s the goal; this is just a budding prototype, built in three months.

Cory Haik, who is the Post’s Executive Producer for Digital News, says it “aims to fact-check speeches in as close to real time as possible” in speeches, TV ads, or interviews. Here’s how it works:

The Truth Teller prototype was built and runs with a combination of several technologies — some new, some very familiar. We’ve combined video and audio extraction with a speech-to-text technology to search a database of facts and fact checks. We are effectively taking in video, converting the audio to text (the rough transcript below the video), matching that text to our database, and then displaying, in real time, what’s true and what’s false.

We are transcribing videos using Microsoft Audio Video indexing service (MAVIS) technology. MAVIS is a Windows Azure application which uses State of the Art of Deep Neural Net (DNN) based speech recognition technology to convert audio signals into words. Using this service, we are extracting audio from videos and saving the information in our Lucene search index as a transcript. We are then looking for the facts in the transcription. Finding distinct phrases to match is difficult. That’s why we are focusing on patterns instead.

We are using approximate string matching or a fuzzy string searching algorithm. We are implementing a modified version Rabin-Karp using Levenshtein distance algorithm as our first implementation. This will be modified to recognize paraphrasing, negative connotations in the future.

What you see in the prototype is actual live fact checking — each time the video is played the fact checking starts anew.

 – Washington Post, “Debuting Truth Teller”
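
For the curious, here is a rough sketch of what the fuzzy-matching step described above might look like. To be clear, this is my own illustration and not the Post’s code: the function names, the word-window approach, and the `max_ratio` threshold are all assumptions. It borrows only the sliding-window idea from Rabin-Karp (without the rolling hash) and scores each window with Levenshtein edit distance, so a transcript with a small speech-recognition error can still match a known claim.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def find_fuzzy_matches(transcript: str, claim: str, max_ratio: float = 0.1):
    """Slide a claim-sized window of words over the transcript.

    Returns (start_word_index, window_text, distance) for every window whose
    edit distance to the claim is at most max_ratio * len(claim).
    The threshold is a hypothetical tuning knob, not a published value.
    """
    words = transcript.lower().split()
    target = " ".join(claim.lower().split())
    width = len(target.split())
    matches = []
    for start in range(len(words) - width + 1):
        window = " ".join(words[start:start + width])
        distance = levenshtein(window, target)
        if distance <= max_ratio * len(target):
            matches.append((start, window, distance))
    return matches


if __name__ == "__main__":
    # Hypothetical transcript with a recognition error ("for" vs. "four").
    transcript = "we created five million new jobs in the last for years"
    claim = "created five million new jobs in the last four years"
    for start, window, distance in find_fuzzy_matches(transcript, claim):
        print(start, distance, window)
```

A real system would then look up each matched claim in the fact-check database to fetch its verdict; recognizing paraphrase or negation, which the Post lists as future work, would need more than edit distance.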

The prototype was built with funding from a Knight Foundation Prototype Fund grant. You can read more about the motivation and future plans over on the Knight Blog, and TechCrunch discusses some of the political ramifications of the prototype in light of the fact-checking movement in recent campaigns.

Even better, you can actually give Truth Teller a try here, in its infancy.

What other uses could be made of semantic “truth detection” or fact-checking, in other aspects of the relationship between the government and the governed?

Could the justice system use something like Truth Teller, or will human judges and  juries always have a preeminent role in determining the veracity of testimony? Will police officers and detectives be able to use cloud-based mobile services like Truth Teller in real time during criminal investigations as they’re evaluating witness accounts? Should the Intelligence Community be running intercepts of foreign terrorist suspects’ communications through a massive look-up system like Truth Teller?

Perhaps, and time will tell how valuable – or error-prone – these systems can be. But in the next couple of years we will be developing (and be able to assess the adoption of) increasingly powerful semantic systems against big-data collections, using faster and faster cloud-based computing architectures.

In the meantime, watch for further refinements and innovation from The Washington Post’s prototyping efforts; after all, we just had a big national U.S. election, but congressional elections in 2014 and the presidential race in 2016 are just around the corner. Like my fellow citizens, I will be grateful for any help in keeping candidates accountable to something resembling “the truth.”

To fix intelligence analysis you have to decide what’s broken

“More and more, Xmas Day failure looks to be wheat v. chaff issue, not info sharing issue.” – Marc Ambinder, politics editor for The Atlantic, on Twitter last night.

Marc Ambinder, a casual friend and solid reporter, has boiled down two likely avenues of intelligence “failure” relevant to the case of Umar Farouk Abdulmutallab and his attempted Christmas Day bombing on Northwest Airlines Flight 253.  In his telling, they’re apparently binary – one is true, not the other, at least for this case.

The two areas were originally signalled by President Obama in his remarks on Tuesday, when he discussed the preliminary findings of “a review of our terrorist watch list system …  so we can find out what went wrong, fix it and prevent future attacks.” 

Let’s examine these two areas of failure briefly – and what can and should be done to address them.

Continue reading

Inside Cyber Warfare

One year ago, the buzz across the government/technology nexus was focused on a pair of political guessing games. Neophytes mostly debated whom the newly-elected President would name to be the nation’s first Chief Technology Officer. Grizzled Pentagon veterans and the more sober Silicon Valley types wondered instead who would get the nod as President Obama’s “Cyber Czar.”

Continue reading

Stop by Tuesday for dinner or a drink with a great guy (and me)

I’m taking up my duties as a public-spirited citizen next week on Tuesday evening, by hosting a fun little fundraiser for my local Congressman – and if you’re going to be in Washington DC the evening of 10/27 I hope you’ll join me (click here for the invitation and details). It’ll be a fun evening; we’ll be at the ultra-cool Johnny’s Half-Shell on Capitol Hill after all.

Continue reading

How the Crowd Reads Crowd-Sourced News

It turns out that we have lessons to learn from Uganda – more specifically, from web coverage of events in Uganda this week.

I’m constantly trying to improve my own ability to follow real-time world events, whether through social media, advanced search technologies, or aggregation of multiple old/new information technologies. About this time last year, as the Georgian-Russian skirmishes were just kicking off, I wrote about keeping up with information on international events (“Using Web 2.0 to Track a Political Crisis”).

In the intervening year, development of real-time tools and techniques has really blossomed. This past week, the onset of violent political unrest in Uganda has served as yet another crucible in which new techniques and web-based technologies can be tested and tweaked.

Continue reading

Way Ahead and Far Behind

Today’s Washington Post has a story on its front page: “Staff Finds White House in the Technological Dark Ages.”

“Two years after launching the most technologically savvy presidential campaign in history, Obama officials ran smack into the constraints of the federal bureaucracy yesterday, encountering a jumble of disconnected phone lines, old computer software, and security regulations forbidding outside e-mail accounts.”

“What does that mean in 21st-century terms? No Facebook to communicate with supporters. No outside e-mail log-ins. No instant messaging. Hard adjustments for a staff that helped sweep Obama to power through, among other things, relentless online social networking.”  -Washington Post

Some say that whoever has been responsible for information technology in the White House itself should be fired — but then perhaps the change of Administration just took care of that  🙂 

Overall, this situation is familiar to anyone who has worked in what I call “Big-G IT,” or the information technology of a federal government agency. I’ve argued about its challenges and sub-optimality before: see my previous pieces on “Roadmap for Innovation: From the Center to the Edge,” and more specifically “Puncturing Circles of Bureaucracy.” In that latter piece back in March of 2008, I wrote about “the defensive perimeters of overwhelming bureaucratic torpor,” and the frustrating reality within much of Big Government: “Federal employees have an entire complex of bizarrely-incented practices and career motivations, which make progress on technology innovation very difficult, not to mention general business-practice transformation as a whole.”

Here’s the truly frustrating, mind-bending part: it isn’t always true!  Other elements of the White House have cutting-edge, world-class technologies operating day in, day out.

Continue reading

2.0 View of President Obama’s Inaugural Speech


A word-cloud produced (quickly) by the Los Angeles Times. Befitting the social-media aspect, the paper published it on Twitter immediately; I don’t know if it will even be published as a graphic in the day-old “newspaper” printed and distributed tomorrow. The New York Times, meanwhile, has the same for every previous presidential inaugural address as well – interesting to scroll back and forth to notice trends in presidential intentions.

Which lines was I most struck by? Because of my national-security interests, I was taken by the strong, even muscular statement to terrorist foes: “You cannot outlast us and we will defeat you.” That followed on his opening with a declarative statement that “Our nation is at war against a far-reaching network of violence and hatred.”

Information Week has already this afternoon called it the “First Web 2.0 Inauguration,” arguing that “Web 2.0 technologies offered plenty of new experiences and communications tools for those witnessing the historic event.”

Some of the best mashups using cutting-edge technology, to my mind, are the photographs from media and members of the crowd on the Mall, being synthed into 3D Photosynth virtual models. Really cool!

 

Twitter and other social-media services and channels appeared to hold up well under the crush of traffic. I was pleasantly surprised with the performance of Microsoft’s official streaming of the entire ceremony for the Presidential Inaugural Committee, using Silverlight (same technology was used really nicely for global streaming of the Summer Olympics last year).  In fact, the online streaming was markedly smoother than the ability of the TV networks to speak to reporters reliably down on the Mall – it appeared that network and cellular traffic was constantly cutting out on remote video and microphones.

A moving day, brought to more people than ever before through technology.


Some say Obama has already chosen Cyber Czar

I’ll wade into the breach again by analyzing (and trying to anticipate) some national-security appointments for the new Obama Administration. Today I must admit that I’m taken with the latest reportage from the U.K. Spectator – a quite conservative publication not usually known for its closeness to the Obama inner circle.

Continue reading

Several new Microsoft advanced technologies

Fact: As reported in TechCrunch and other sites today, “Microsoft’s Live Labs has just released Thumbtack, a web clipping service that allows users to compile links, media, and text snippets into online storage bins for future reference. Users can also share their Thumbtack collections with their peers, allowing them to collaborate by adding new clips and notations… The service works fine on IE7 and Firefox, and isn’t OS dependent. Each of these clippings can be sorted into folders called ‘Collections’, which can be published to the web via RSS, embedded in blogs, opened to friends for collaboration, or kept private for safe keeping.”  [There’s also a good Ars Technica review of Thumbtack here.]

Continue reading

Bob Gates and the future of defense thinking

Now that Bob Gates is officially going to stay on as Secretary of Defense in the Obama Administration, it’s worthwhile to refresh our understanding of his thinking. Continue reading
