Like many people, I was very impressed by a video over the weekend of the Word Lens real-time translation app for iPhone. It struck with a viral bang, and within a few days racked up over 2 million YouTube views. What particularly made me smile was digging backwards through the Twitter stream of a key Word Lens developer whom I follow, John DeWeese, and finding this pearl of a tweet (right) from several months ago, as he was banging out the app in my old stomping grounds of the San Francisco Bay Area. That’s a hacker mentality for you :)
But one thought I had in watching the video was: why do I need to be holding the little device in front of me to get the benefit of its computational resources and display? I’ve seen the studies and predictions that “everything’s going mobile,” but I believe that reading takes the device itself, the form-factor of a little handheld box of magic, too literally.
I actually see a slightly different future path, one in which we’ll take advantage of computing resources and digital services all around us, in a supercharged immersive environment of virtual computation. I don’t mean Second Life, where you have to log in to a virtual world divorced from your real environs; and I don’t mean a world like Tron (I haven’t seen the sequel yet, but will; the first one came out when I was an excitable college senior looking to the future). Instead, we are rapidly integrating immersive computation all around us, in our everyday world, and fairly soon we won’t need to pull out a smartphone to see it.
Here’s what I mean: I’m pretty sure this is exactly what everyone has always thought “playing air guitar” should be like:
This work, by London-based developer Chris O’Shea, represents yet another private-hack application using Microsoft Kinect as the platform for sensing, ranging, and interacting with 3D virtual environments.
Communication and Collaboration
Did I say interacting “with” virtual environments? How about interacting “in” them, as in this stunning example which highlights the potential for uses in professional collaboration in immersive spaces, along with virtual-object-manipulation. Unlike Second Life, you don’t control an avatar with a keyboard or mouse – you essentially are the avatar (hmm, there’s a movie-idea in that). Also note at the beginning the funny hat-tip to the classic movie Office Space – apparently even Martian cubicle-workers have to deal with the iconic TPS reports:
There has been an explosion in Kinect-hack activity in the past month, an “OpenKinect” brushfire, with individual developers doing work on Microsoft platforms and on a host of others, like libfreenect/Python, OpenCV, openFrameworks, Apple’s OS X and even iPad.
One of my favorite examples so far is work at the Georgia Tech College of Computing, where researchers have built the Kinect American Sign Language Recognizer, with impressive lab results so far using Kinect gesture-recognition driving Hidden Markov Models (HMMs). One huge human-level advance: in earlier automated systems built on pre-Kinect technology, the deaf children in the study had to wear cumbersome, unnatural headgear and wrist-mounted 3D accelerometers. That requirement is gone now, as you can see in the research video here.
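For the curious, the core of HMM-based sign recognition is scoring an observed gesture sequence against a separate model per sign and picking the best-scoring one. The Georgia Tech code isn’t public, so this is only a toy sketch of the standard Viterbi scoring step, with made-up two-state models and discretized hand positions standing in for real Kinect skeleton features:

```python
# Toy sketch of HMM gesture scoring (not the Georgia Tech system):
# score an observation sequence against a small hand-built HMM using
# Viterbi decoding in log-space.
import math

def viterbi_score(obs, states, start_p, trans_p, emit_p):
    """Log-probability of the single best state path explaining `obs`."""
    # Initialize with start probabilities times first emission.
    V = {s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}
    for symbol in obs[1:]:
        V = {
            s: max(V[prev] + math.log(trans_p[prev][s]) for prev in states)
               + math.log(emit_p[s][symbol])
            for s in states
        }
    return max(V.values())

# Hypothetical 2-state model over coarsely discretized hand positions.
states = ("start", "end")
start_p = {"start": 0.9, "end": 0.1}
trans_p = {"start": {"start": 0.6, "end": 0.4},
           "end":   {"start": 0.1, "end": 0.9}}
emit_p  = {"start": {"up": 0.8, "down": 0.2},
           "end":   {"up": 0.3, "down": 0.7}}

score = viterbi_score(("up", "up", "down"), states, start_p, trans_p, emit_p)
```

In a full recognizer you’d train one HMM per sign from example gesture sequences and classify a new sequence by whichever model gives it the highest score.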
Other inventive examples include controlling a hands-free web browser, an artful “invisibility cloak” use, and nifty ideas highlighted in Kinect Hack Prize Competitions and this combo-video of “12 Best Kinect Hacks.”
The company’s position on these hacks has been “cautiously supportive,” one might say:
“Kinect was not actually hacked,” said Microsoft program manager Alex Kipman, speaking on NPR’s Science Friday with Ira Flatow last week. “Hacking would mean that someone got to our algorithms, that sit inside of the Xbox, and was able to actually use them, which hasn’t happened. Or it means that you put a device between the sensor and the Xbox for means of cheating, which also has not happened. That’s what we call hacking, and that’s why we’ve put a ton of effort to make sure it doesn’t actually occur.”
“What has happened,” continued Kipman, “is someone wrote an open source driver for PCs that essentially opens the USB connection, which we didn’t protect by design, and reads the inputs from the sensor. The sensor, again as I talked earlier, has eyes and ears, and that’s a whole bunch of noise that someone needs to take and turn into signal.”
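Kipman’s “noise into signal” line is worth unpacking: the open-source driver hands developers raw 11-bit disparity values per pixel, and community code turns those into metric depth. As a sketch, here is that conversion using an approximation widely circulated in the OpenKinect community; the constants are that community’s empirical fit, not official Microsoft calibration:

```python
# Sketch of the "noise into signal" step: converting a raw 11-bit
# Kinect disparity reading into depth in meters. Constants are the
# empirical approximation shared by OpenKinect hackers, not official
# calibration data.
def raw_disparity_to_meters(raw):
    """Approximate metric depth for one raw Kinect disparity value."""
    if raw >= 2047:  # the raw stream uses 2047 to flag "no reading"
        return float("nan")
    return 1.0 / (raw * -0.0030711016 + 3.3309495161)

# A mid-range raw value of 800 works out to roughly 1.1 meters:
d = raw_disparity_to_meters(800)
```

Everything beyond this, skeleton tracking, gesture recognition, 3D reconstruction, is built by each hacker on top of depth maps like these.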
Microsoft Game Studios manager Shannon Loftis weighed in as well, noting that “as an experienced creator, I’m very excited to see that people are so inspired that it was less than a week after the Kinect came out before they had started creating and thinking about what they could do.”
“So no one’s going to get in trouble?” asked Flatow.
“Nope, absolutely not,” replied Kipman.
– PC World, 11/22/2010
Uses in and for Government
These hacks demonstrate that every so often computing takes a “fun” turn again, delighting and enticing a new generation of whiz-kid programmers and developers. As you might expect, there’s also been activity among developers in the government space, and I’m fascinated by the possibilities already emerging on white-boards or dev-laptops in areas like education and government training, health-care applications, and easier and more user-friendly government/citizen interactivity in general. My friend Chris Niehaus, Director of Microsoft’s U.S. Public Sector Innovation, has explored many of the state-of-the-art possibilities and social implications in health and medical care, and O’Reilly Radar’s Alex Howard has written about other non-gaming areas where Kinect immersion could take hold.
National-security areas of interest are already being explored in robotics, ISR (intelligence, surveillance, reconnaissance), and military command-and-control apps. Here’s a public example I can share: folks at UC-Berkeley’s EE/CS Department, in the Hybrid Systems Lab STARMAC Project, have experimentally hacked up a Quadrotor UAV with an onboard Kinect Sensor, to demonstrate the off-the-shelf quality of environment-sensing and remote control:
The Road Ahead
The future will definitely feature incredibly powerful government uses (alongside commercial uses) of innovative human-computer-interaction (HCI) and natural user interfaces (NUI). And the v. 1.0 Kinect will eventually be surpassed by improved Microsoft iterations, and likely compete in a healthy market of alternative hardware enabling depth-sensing and touch-free interactivity.
As that happens, we’ll see and use Air Everything – or almost everything – and we’ll like it, and then in many circumstances we’ll forget we’re even using room-based or location-embedded computing resources.
Here’s an analogy from a century ago: In the early days of home-delivered electricity, there was enormous awareness of – and fear of – the power coursing through wires behind floorboards and across walls. The early electrical plug-in outlets were frightening objects and the source of great parental anxiety and dangerous childhood experimentation. But eventually, with the new technology integrated into home design, with buried power lines and power-grids incorporated into urban architecture, we lost sight (literally) of the electricity being transmitted all around us. Now, things just “turn on.” NUI will become that carefree.
To keep up with the pace of activity around the Kinect platform, check these sources periodically: KinectHacks.net and its frequent updates; and the YouTube channel of UC-Davis computer science professor and Kinect sensei Oliver Kreylos, whose videos are a mix of eye-popping functionality and behind-the-scenes programming explanations.