Monthly Archives: February 2017

Making speech recognition ubiquitous in electronics

The butt of jokes as recently as 10 years ago, automatic speech recognition is now on the verge of becoming people’s chief means of interacting with their principal computing devices.

In anticipation of the age of voice-controlled electronics, MIT researchers have built a low-power chip specialized for automatic speech recognition. Whereas a cellphone running speech-recognition software might require about 1 watt of power, the new chip requires between 0.2 and 10 milliwatts, depending on the number of words it has to recognize.

In a real-world application, that probably translates to a power savings of 90 to 99 percent, which could make voice control practical for relatively simple electronic devices. That includes power-constrained devices that have to harvest energy from their environments or go months between battery charges. Such devices form the technological backbone of what’s called the “internet of things,” or IoT, which refers to the idea that vehicles, appliances, civil-engineering structures, manufacturing equipment, and even livestock will soon have sensors that report information directly to networked servers, aiding with maintenance and the coordination of tasks.

“Speech input will become a natural interface for many wearable applications and intelligent devices,” says Anantha Chandrakasan, the Vannevar Bush Professor of Electrical Engineering and Computer Science at MIT, whose group developed the new chip. “The miniaturization of these devices will require a different interface than touch or keyboard. It will be critical to embed the speech functionality locally to save system energy consumption compared to performing this operation in the cloud.”

“I don’t think that we really developed this technology for a particular application,” adds Michael Price, who led the design of the chip as an MIT graduate student in electrical engineering and computer science and now works for chipmaker Analog Devices. “We have tried to put the infrastructure in place to provide better trade-offs to a system designer than they would have had with previous technology, whether it was software or hardware acceleration.”

Price, Chandrakasan, and Jim Glass, a senior research scientist at MIT’s Computer Science and Artificial Intelligence Laboratory, described the new chip in a paper Price presented last week at the International Solid-State Circuits Conference.

The sleeper wakes

Today, the best-performing speech recognizers are, like many other state-of-the-art artificial-intelligence systems, based on neural networks, virtual networks of simple information processors roughly modeled on the human brain. Much of the new chip’s circuitry is concerned with implementing speech-recognition networks as efficiently as possible.

But even the most power-efficient speech recognition system would quickly drain a device’s battery if it ran without interruption. So the chip also includes a simpler “voice activity detection” circuit that monitors ambient noise to determine whether it might be speech. If the answer is yes, the chip fires up the larger, more complex speech-recognition circuit.
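As a rough software analogy of that gating idea (not the chip’s actual circuitry), a cheap, always-on energy check can decide when to invoke a costly recognizer. In the sketch below, the threshold value and the `run_recognizer` callable are assumptions for illustration only:

```python
import numpy as np

ENERGY_THRESHOLD = 1e-3   # assumed tuning value; the real detection circuits are more sophisticated

def looks_like_speech(frame: np.ndarray) -> bool:
    """Crude energy check on one audio frame (the always-on, low-power path)."""
    return float(np.mean(frame ** 2)) > ENERGY_THRESHOLD

def process_stream(frames, run_recognizer):
    """Keep the expensive recognizer idle until a frame looks like speech."""
    results = []
    for frame in frames:
        if looks_like_speech(frame):               # cheap detector runs on every frame
            results.append(run_recognizer(frame))  # costly path, woken only on demand
    return results
```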

In fact, for experimental purposes, the researchers’ chip had three different voice-activity-detection circuits, with different degrees of complexity and, consequently, different power demands. Which circuit is most power efficient depends on context, but in tests simulating a wide range of conditions, the most complex of the three circuits led to the greatest power savings for the system as a whole. Even though it consumed almost three times as much power as the simplest circuit, it generated far fewer false positives; the simpler circuits often chewed through their energy savings by spuriously activating the rest of the chip.

A typical neural network consists of thousands of processing “nodes” capable of only simple computations but densely connected to each other. In the type of network commonly used for voice recognition, the nodes are arranged into layers. Voice data are fed into the bottom layer of the network, whose nodes process them and pass them to the nodes of the next layer, which in turn process and pass them upward, and so on. The output of the top layer indicates the probability that the voice data represent a particular speech sound.
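In code, a toy version of such a layered network might look like the following minimal sketch; the layer sizes and random weights are made up and are not the architecture described in the paper:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def forward(frame_features, layers):
    """Pass acoustic features through each layer in turn.

    `layers` is a list of (weight_matrix, bias_vector) pairs; the final
    output is a probability distribution over candidate speech sounds.
    """
    activation = frame_features
    for W, b in layers[:-1]:
        activation = np.maximum(0.0, W @ activation + b)   # hidden layers (ReLU)
    W_out, b_out = layers[-1]
    return softmax(W_out @ activation + b_out)             # probability per speech sound

# Toy example with made-up sizes: 40 input features, 64 hidden units, 10 output sounds
rng = np.random.default_rng(0)
layers = [(rng.standard_normal((64, 40)), np.zeros(64)),
          (rng.standard_normal((10, 64)), np.zeros(10))]
probs = forward(rng.standard_normal(40), layers)
```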

A voice-recognition network is too big to fit in a chip’s onboard memory, which is a problem because going off-chip for data is much more energy intensive than retrieving it from local stores. So the MIT researchers’ design concentrates on minimizing the amount of data that the chip has to retrieve from off-chip memory.
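One generic way to cut off-chip traffic, shown purely for illustration and not necessarily the researchers’ technique, is to reuse each batch of fetched weights across several input frames rather than re-reading the weights for every frame:

```python
import numpy as np

def batched_forward(frames, layers, batch=8):
    """Run the network on `batch` frames per pass so each layer's weights
    are pulled from (slow, energy-hungry) external memory once per chunk
    rather than once per frame. A generic traffic-reduction trick; the
    chip's actual memory optimizations may differ."""
    outputs = []
    for start in range(0, len(frames), batch):
        chunk = np.stack(frames[start:start + batch])   # (batch, n_features)
        act = chunk
        for W, b in layers:                             # weights reused across the whole chunk
            act = np.maximum(0.0, act @ W.T + b)
        outputs.extend(act)
    return outputs

# Toy usage with made-up sizes
rng = np.random.default_rng(1)
layers = [(rng.standard_normal((64, 40)), np.zeros(64)),
          (rng.standard_normal((10, 64)), np.zeros(10))]
frames = [rng.standard_normal(40) for _ in range(20)]
scores = batched_forward(frames, layers)
```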

Adding human intuition to planning algorithms

Every other year, the International Conference on Automated Planning and Scheduling hosts a competition in which computer systems designed by conference participants try to find the best solution to a planning problem, such as scheduling flights or coordinating tasks for teams of autonomous satellites.

On all but the most straightforward problems, however, even the best planning algorithms still aren’t as effective as human beings with a particular aptitude for problem-solving — such as MIT students.

Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory are trying to improve automated planners by giving them the benefit of human intuition. By encoding the strategies of high-performing human planners in a machine-readable form, they were able to improve the performance of competition-winning planning algorithms by 10 to 15 percent on a challenging set of problems.

The researchers are presenting their results this week at the Association for the Advancement of Artificial Intelligence’s annual conference.

“In the lab, in other investigations, we’ve seen that for things like planning and scheduling and optimization, there’s usually a small set of people who are truly outstanding at it,” says Julie Shah, an assistant professor of aeronautics and astronautics at MIT. “Can we take the insights and the high-level strategies from the few people who are truly excellent at it and allow a machine to make use of that to be better at problem-solving than the vast majority of the population?”

The first author on the conference paper is Joseph Kim, a graduate student in aeronautics and astronautics. He’s joined by Shah and Christopher Banks, an undergraduate at Norfolk State University who was a research intern in Shah’s lab in the summer of 2016.

The human factor

Algorithms entered in the automated-planning competition — called the International Planning Competition, or IPC — are given related problems with different degrees of difficulty. The easiest problems require satisfaction of a few rigid constraints: For instance, given a certain number of airports, a certain number of planes, and a certain number of people at each airport with particular destinations, is it possible to plan planes’ flight routes such that all passengers reach their destinations but no plane ever flies empty?

A more complex class of problems — numerical problems — adds some flexible numerical parameters: Can you find a set of flight plans that meets the constraints of the original problem but also minimizes planes’ flight time and fuel consumption?

Finally, the most complex problems — temporal problems — add temporal constraints to the numerical problems: Can you minimize flight time and fuel consumption while also ensuring that planes arrive and depart at specific times?
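As a concrete illustration of the easiest, rigid-constraint class, a minimal feasibility check could look like the sketch below. The data layout and field names are invented for readability; the competition itself encodes problems in the PDDL planning language:

```python
def satisfies_basic_constraints(flights, passengers):
    """Check the rigid constraints of the basic problem: no plane ever
    flies empty, and every passenger ends up at their destination.

    `flights`: chronologically ordered legs, each a dict with hypothetical
    'dest' and 'passengers' fields; `passengers`: id -> (origin, destination).
    """
    if any(len(leg["passengers"]) == 0 for leg in flights):
        return False                      # a plane flew empty
    last_stop = {}
    for leg in flights:
        for p in leg["passengers"]:
            last_stop[p] = leg["dest"]    # where each passenger was last dropped off
    return all(last_stop.get(p) == dest for p, (_, dest) in passengers.items())
```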

For each problem, an algorithm has a half-hour to generate a plan. The quality of the plans is measured according to some “cost function,” such as an equation that combines total flight time and total fuel consumption.
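A minimal sketch of such a cost function, with illustrative field names and weights rather than the competition’s actual per-instance metric, might look like this:

```python
def plan_cost(plan, time_weight=1.0, fuel_weight=0.01):
    """One possible cost function: a weighted sum of total flight time
    and total fuel burned (fields and weights are illustrative)."""
    total_time = sum(leg["flight_time"] for leg in plan)
    total_fuel = sum(leg["fuel"] for leg in plan)
    return time_weight * total_time + fuel_weight * total_fuel

# Whichever valid plan the algorithm finds within its half-hour is scored
# by this kind of metric -- lower cost is better.
plan_a = [{"flight_time": 2.0, "fuel": 900}, {"flight_time": 1.5, "fuel": 600}]
plan_b = [{"flight_time": 3.5, "fuel": 1200}]
better = min((plan_a, plan_b), key=plan_cost)
```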

Cancer treatment with machine learning

Regina Barzilay is working with MIT students and medical doctors in an ambitious bid to revolutionize cancer care. She is relying on a tool largely unrecognized in the oncology world but deeply familiar in her own: machine learning.

Barzilay, the Delta Electronics Professor of Electrical Engineering and Computer Science, was diagnosed with breast cancer in 2014. She soon learned that good data about the disease is hard to find. “You are desperate for information — for data,” she says now. “Should I use this drug or that? Is that treatment best? What are the odds of recurrence? Without reliable empirical evidence, your treatment choices become your own best guesses.”

Across different areas of cancer care — be it diagnosis, treatment, or prevention — the data protocol is similar. Doctors start the process by mapping patient information into structured data by hand, and then run basic statistical analyses to identify correlations. The approach is primitive compared with what is possible in computer science today, Barzilay says.

These kinds of delays and lapses (which are not limited to cancer treatment) can really hamper scientific advances, Barzilay says. For example, 1.7 million people are diagnosed with cancer in the U.S. every year, but only about 3 percent enroll in clinical trials, according to the American Society of Clinical Oncology. Current research practice relies exclusively on data drawn from this tiny fraction of patients. “We need treatment insights from the other 97 percent receiving cancer care,” she says.

To be clear: Barzilay isn’t looking to up-end the way current clinical research is conducted. She just believes that doctors and biologists — and patients — could benefit if she and other data scientists lent them a helping hand. Innovation is needed and the tools are there to be used.

Barzilay has struck up new research collaborations, drawn in MIT students, launched projects with doctors at Massachusetts General Hospital, and begun empowering cancer treatment with the machine learning insight that has already transformed so many areas of modern life.

Machine learning, real people

At the MIT Stata Center, Barzilay, a lively presence, interrupts herself mid-sentence, leaps up from her office couch, and runs off to check on her students.

She returns with a laugh. An undergraduate group is assisting Barzilay with a federal grant application, and they’re down to the wire on the submission deadline. The funds, she says, would enable her to pay the students for their time. Like Barzilay, they are doing much of this research for free, because they believe in its power to do good. “In all my years at MIT I have never seen students get so excited about the research and volunteer so much of their time,” Barzilay says.

At the center of Barzilay’s project is machine learning, or algorithms that learn from data and find insights without being explicitly programmed where to look for them. This tool, just like the ones Amazon, Netflix, and other sites use to track and predict your preferences as a consumer, can make short work of gaining insight into massive quantities of data.

Applying it to patient data can offer tremendous assistance to people who, as Barzilay knows well, really need the help. Today, she says, a woman cannot retrieve answers to simple questions such as: What was the disease progression for women in my age range with the same tumor characteristics?
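For a sense of what answering such a question could look like once patient records exist in structured form, here is a minimal, hypothetical sketch; the table, fields, and thresholds are invented, and recurrence is used as a crude stand-in for disease progression:

```python
import pandas as pd

# Hypothetical, de-identified records; real clinical data would be far
# larger and require careful privacy handling.
patients = pd.DataFrame({
    "age":           [44, 47, 52, 45, 61],
    "tumor_size_cm": [1.2, 2.5, 1.1, 1.4, 3.0],
    "er_positive":   [True, True, False, True, False],
    "recurrence":    [False, True, False, False, True],
})

# "Women in my age range with the same tumor characteristics"
cohort = patients[
    patients["age"].between(42, 48)
    & (patients["tumor_size_cm"] < 2.0)
    & patients["er_positive"]
]
recurrence_rate = cohort["recurrence"].mean()   # crude proxy for disease progression
```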

Dropbox founder shares his latest projects with student innovators

Much as Athena, MIT’s campus computing environment, served as a pre-cloud solution for enabling files and applications to follow the user, Dropbox founder Drew Houston ’05 brings his alma mater everywhere he goes.

After earning his bachelor’s degree in electrical engineering and computer science, Houston grew frustrated with the clunky need to carry portable USB drives, so he partnered with a fellow MIT student, Arash Ferdowsi, to develop an online solution: what would become Dropbox.

Dropbox, which now has over 500 million users, continues to adapt. The file-sharing company recently crossed the $1-billion threshold in annual subscription revenue. It’s expanding its business model by selling at the corporate level — employees at companies with Dropbox can use, essentially, one big box.

True to his company’s goal of using technology to bring people (and files) together, Houston is keen to share his own wisdom with others, especially those at MIT. Houston gave the 2013 Commencement address, saying “The hardest-working people don’t work hard because they’re disciplined. They work hard because working on an exciting problem is fun.”

He has also been a guest speaker in “The Founder’s Journey,” a course designed to demystify entrepreneurship, and at the MIT Enterprise Forum Cambridge; a frequent and active participant in StartMIT, a workshop on entrepreneurship held over Independent Activities Period (IAP); and a staple of the MIT Better World tour, an alumni engagement event held in cities all over the globe.

Houston stopped by MIT in February for the latest iteration of StartMIT to give a “fireside chat” about the early days of Dropbox, when he ran it with a few of his friends from Course 6, and to discuss the company’s current challenge: managing scale. His firm now employs over 1,000 people.

“With thousands of employees in the company, you need coordination, and it can become total chaos. Ultimately all the vectors need to point in the same direction,” he told students. It turns out that Dropbox’s new online collaboration suite, Paper, could play a key role in getting those vectors to line up, offering lessons to both those new to start-ups and more seasoned entrepreneurs.