http://www.nytimes.com/2016/09/20/science/computer-vision-tesla-driverless-cars.html 2016-09-20 10:03:54

A Lesson of Tesla Crashes? Computer Vision Can’t Do It All Yet

Self-driving cars come with a caveat now: Keep your hands on the wheel. While computers can recognize images, understanding actions and behaviors is still the next frontier.

===

Jitendra Malik, a researcher in computer vision for three decades, doesn’t own a Tesla. “Knowing what I know about computer vision, I wouldn’t take my hands off the steering wheel,” he said.

Dr. Malik, a professor at the University of California, Berkeley, was referring to a fatal accident in May involving a Tesla driving in Autopilot mode. Federal regulators are still investigating the accident, but it appears likely that the man placed too much confidence in Tesla’s self-driving system. The same may be true of a fatal Tesla accident in China that was reported this month.

Tesla has said that Autopilot is not meant to take over completely for a human driver. And earlier this month, the company implicitly acknowledged that its owners should heed Dr. Malik’s advice, announcing changes to Autopilot that will warn drivers to keep their hands on the wheel and disable the system if those warnings are repeatedly ignored.

The Tesla accident in May, researchers say, was not a failure of computer vision. But it underscored the limitations of the science in applications like driverless cars, despite remarkable progress in recent years fueled by digital data, computer firepower and software inspired by the human brain.

Today, computerized sight can quickly and accurately recognize millions of individual faces, identify the makes and models of thousands of cars, and distinguish cats and dogs of every breed in a way no human being could. Yet the recent advances, while impressive, have been mainly in image recognition. The next frontier, researchers agree, is general visual knowledge: the development of algorithms that can understand not just objects, but also actions and behaviors.

Computing intelligence often seems to mimic human intelligence, so computer science understandably invites analogy.
In computer vision, researchers offer two analogies to describe the promising paths ahead: a child and the brain.

The model borrowed from childhood, many researchers say, involves developing algorithms that learn as a child does, with some supervision but mostly on their own, without relying on the vast amounts of hand-labeled training data that the current approach requires. “It’s early days,” Dr. Malik said, “but it’s how we get to the next level.”

In computing, the brain has served mainly as an inspirational metaphor rather than an actual road map. Airplanes don’t flap their wings, artificial intelligence experts often say; machines do things differently than biological systems. But Tomaso Poggio, a scientist at the Massachusetts Institute of Technology, where he directs the Center for Brains, Minds and Machines, is pursuing research that treats neuroscience as a genuine guide for machine learning rather than a loose metaphor. If successful, the outcome could be a breakthrough in computer vision and machine learning in general, Dr. Poggio said. “To do that,” he added, “you need neuroscience not just as an inspiration, but as a strong light.”

The big gains in computer vision owe much to the web’s raw material: countless millions of online photos used to train the software algorithms to identify images. But collecting and tagging that training data has been a formidable undertaking. ImageNet, a collaborative effort led by researchers at Stanford and Princeton, is one of the most ambitious projects. Initially, nearly one billion images were downloaded. Those were sorted, labeled and winnowed to more than 14 million images in 22,000 categories. The database, for example, includes more than 62,000 images of cats.

For a computer-age creation, ImageNet has been strikingly labor intensive. At one point, the sorting and labeling involved nearly 49,000 workers on Amazon Mechanical Turk, the online crowdsourcing service.

Vast image databases like ImageNet have been employed to train software that uses neuron-like nodes, known as neural networks. The concept of computing neural networks stretches back more than three decades, but it has become a powerful tool only in recent years.
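To see why the hand-labeling that ImageNet’s crowd workers performed is the bottleneck, consider the simplest possible supervised classifier: a nearest-neighbor lookup. The sketch below is illustrative only; the two-number feature vectors and the labels are hypothetical stand-ins for the high-dimensional features a real vision system would extract from photos. The point is that the classifier can answer “cat or dog?” only because a human attached a label to every training example first.

```python
import math

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def predict(labeled_examples, features):
    """Return the label of the closest hand-labeled training example."""
    nearest = min(labeled_examples, key=lambda ex: distance(ex[0], features))
    return nearest[1]

# Each training example is (feature_vector, human_supplied_label) --
# the labeling step is what ImageNet's workers performed at enormous scale.
training_set = [
    ([0.9, 0.1], "cat"),
    ([0.8, 0.2], "cat"),
    ([0.2, 0.9], "dog"),
    ([0.1, 0.8], "dog"),
]

print(predict(training_set, [0.85, 0.15]))  # prints "cat"
```

The “learn like a child” research direction described above aims to reduce exactly this dependence: extracting useful structure from unlabeled images so that far fewer human-supplied labels are needed.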
“The available data and computational capability finally caught up to these ideas of the past,” said Trevor Darrell, a computer vision expert at the University of California, Berkeley. If data is the fuel, then neural networks constitute the engine of a branch of machine learning called deep learning.

Just how far neural networks can advance computer vision is uncertain. They emulate the brain only in general terms: the software nodes receive digital input and send output to other nodes. Layers upon layers of these nodes make up so-called convolutional neural networks, which, with sufficient training data, have become better and better at identifying images.

Yet recognizing the objects in an image is not the same as understanding what the image means, notes Fei-Fei Li, the director of the Stanford Artificial Intelligence Laboratory, who helped lead the creation of ImageNet.

Facebook recently encountered that contextual gap. Its algorithm took down an image, posted by a Norwegian author, of a naked, 9-year-old girl fleeing a napalm attack during the Vietnam War, apparently flagging the Pulitzer Prize-winning news photograph as prohibited nudity; after a public outcry, the image was restored.

Or take a fluid scene like a dinner party. A person carrying a platter will serve food. A woman raising a fork will stab the lettuce on her plate and put it in her mouth. A water glass teetering on the edge of the table is about to fall, spilling its contents. Predicting what happens next and understanding the physics of everyday life are inherent in human visual intelligence, but beyond the reach of current deep learning technology.

At the major annual computer vision conference this year, a team including Ali Farhadi, a computer scientist at the University of Washington, presented ImSitu, a project aimed at recognizing the situations depicted in images, not just the objects. Recognizing situations enriches computer vision, but the ImSitu project still depends on human-labeled data to train its machine learning algorithms. “And we’re still very, very far from visual intelligence, understanding scenes and actions the way humans do,” Dr. Farhadi said.

But for cars that drive themselves safely, several years of continuous improvement, rather than an A.I. breakthrough, may well be enough, scientists say. It will take not just steady advances in computer vision, they say, but also more high-definition digital mapping and gains in radar and other sensors. Millions of miles of test driving in varied road and weather conditions, scientists say, should be done before self-driving cars are sold.
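The core operation behind those “layers upon layers of nodes” can be shown in a few lines. This is a minimal sketch, not any production system: a single convolutional filter slides over a tiny image, so each output node responds to a small local patch of input nodes. The toy 4x4 image and the edge-detecting filter are invented for illustration; real networks learn thousands of such filters from training data and stack many layers of them.

```python
def conv2d(image, kernel):
    """Valid-mode 2D convolution (strictly, cross-correlation, as in most
    deep learning libraries) over a list-of-lists image."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            # Weighted sum of the local patch under the filter.
            s = 0.0
            for di in range(kh):
                for dj in range(kw):
                    s += image[i + di][j + dj] * kernel[di][dj]
            row.append(s)
        out.append(row)
    return out

def relu(feature_map):
    """The nonlinearity applied between layers; stacking conv + relu
    repeatedly is what makes a network 'deep'."""
    return [[max(0.0, v) for v in row] for row in feature_map]

# Toy 4x4 "image": dark left half, bright right half.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
# A hand-written vertical-edge filter; a trained network learns its own.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]
feature_map = relu(conv2d(image, vertical_edge))
# The filter responds strongly only where the image changes from dark
# to bright, i.e. in the middle column of the output.
```

In a trained convolutional network, early filters like this one detect edges and textures, while deeper layers combine them into responses to whole objects, which is why such networks excel at recognition yet still say nothing about what happens next in a scene.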
Google has been testing its vehicles for years, and Uber is beginning a pilot program with self-driving cars in Pittsburgh. Carmakers around the world are developing self-driving cars, and 2021 seems to be the consensus year for commercial introduction. The German auto company BMW, for example, has joined with the chip maker Intel and Mobileye, a computer vision company, in an alliance aimed at putting self-driving cars on the road by 2021.

“We’re not there yet, but the pace of improvement is getting us there,” said Gary Bradski, a computer vision scientist who has worked on self-driving vehicles. “We don’t have to wait years and years until some semblance of intelligence arrives, before we have self-driving cars that are safer than human drivers and save thousands of lives.”