## Monthly Archives: June 2017

## Every person of the world who wants to learn something

Raul Boquin, now an MIT senior, remembers the assignment from his freshman year as if it were yesterday. During a leadership workshop, he was asked to write a headline for a newspaper in his imagined future. The words that came to mind resonated so strongly that they now hang on the walls of his dorm room: “Equal opportunities in education for all.”

“I realized that I didn’t come to MIT because it was the best engineering school, but because it was the best place to discover what I was truly passionate about,” he says. “MIT pushed me to my limits and made me able to say ‘I don’t have to be the number one math person, or the number one computer science person, to make a difference’ with the passion I ended up having, which is education.”

Boquin, who is majoring in mathematics with computer science, predicts his life’s work will be to “find a way to adapt education to every person of the world who wants to learn something.”

More to education than teaching

Boquin’s first forays into education followed a relatively traditional path. As part of the undergraduate coursework he needed for his education concentration, he spent time observing teachers in local middle and high schools.

“But at the end of sophomore year, I realized that there was a lot more to education than just teaching.

The summer before his junior year, Boquin worked as a counselor and teaching assistant at Bridge to Enter Advanced Mathematics (BEAM). “It originally started as just a math camp for students in the summer, teaching them things like topology and number theory,” Boquin says. “These were seventh grade Hispanic and black children, and they loved it. And they were amazing at it.”

On a campus in upstate New York, Boquin taught classes by day and talked to students about his own work in mathematics by night. He also designed parts of the BEAM curriculum and came up with fun ways of teaching the lessons. “It was inspiring because it was like I wasn’t only a teacher, but I was a mentor and a friend,” he says.

Back at MIT, with the guidance of Eric Klopfer, professor and director of the Scheller Teacher Education Program and the Education Arcade, Boquin joined lead developer Paul Medlock-Walton to work on Gameblox, through MIT’s Undergraduate Research Opportunities Program (UROP).

Boquin describes Gameblox as a blocks programming language, in which users put blocks together to make something happen in the program. He worked on the user interface of the program, wrote tutorials for features, and built a framework for other researchers to test new code and features. His favorite part, though, was working on a Gameblox curriculum.

“I researched ways of finding out how teachers could use Gameblox to teach not only math and science, but also English, and history, and geography, and how to incorporate programming concepts in different levels of education,” Boquin says. “The features that I got to add to Gameblox as an engineer, I got to test, live, right afterward with teachers from Boston, or with students.”

## Security protocols to thwart hijacked servers

A reaction to the 2008 financial crisis, Bitcoin is a digital-currency scheme designed to wrest control of the monetary system from central banks. With Bitcoin, anyone can mint money, provided he or she can complete a complex computation quickly enough. Through a set of clever protocols, that computational hurdle prevents the system from being coopted by malicious hackers.

At the IEEE Symposium on Security and Privacy this week, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory are presenting a new system that uses Bitcoin’s security machinery to defend against online identity theft.

“Our paper is about using Bitcoin to prevent online services from getting away with lying,” says Alin Tomescu, a graduate student in electrical engineering and computer science and first author on the paper. “When you build systems that are distributed and send each other digital signatures, for instance, those systems can be compromised, and they can lie. They can say one thing to one person and one thing to another. And we want to prevent that.”

An attacker who hacked a public-key encryption system, for instance, might “certify” — or cryptographically assert the validity of — a false encryption key, to trick users into revealing secret information. But it couldn’t also decertify the true key without setting off alarms, so there would be two keys in circulation bearing certification from the same authority. The new system, which Tomescu developed together with his thesis advisor, Srini Devadas, the Edwin Sibley Webster Professor of Electrical Engineering and Computer Science at MIT, defends against such “equivocation.”

Because Bitcoin is completely decentralized, the only thing ensuring its reliability is a massive public log — referred to as the blockchain — of every Bitcoin transaction conducted since the system was first introduced in 2009. Earlier systems have used the Bitcoin machinery to guard against equivocation, but for verification, they required the download of the entire blockchain, which is 110 gigabytes and growing hourly. Tomescu and Devadas’ system, by contrast, requires the download of only about 40 megabytes of data, so it could run on a smartphone.

Striking paydirt

Extending the blockchain is integral to the process of minting — or in Bitcoin terminology, “mining” — new bitcoins. The mining process is built around a mathematical function, called a one-way hash function, that takes three inputs: the last log entry in the blockchain; a new blockchain entry, in which the miner awards him- or herself a fixed number of new bitcoins (currently 12.5); and an integer. The output of the function is a string of 1s and 0s.

## The inner workings of neural networks

Neural networks, which learn to perform computational tasks by analyzing large sets of training data, are responsible for today’s best-performing artificial intelligence systems, from speech recognition systems, to automatic translators, to self-driving cars.

But neural nets are black boxes. Once they’ve been trained, even their designers rarely have any idea what they’re doing — what data elements they’re processing and how.

Two years ago, a team of computer-vision researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) described a method for peering into the black box of a neural net trained to identify visual scenes. The method provided some interesting insights, but it required data to be sent to human reviewers recruited through Amazon’s Mechanical Turk crowdsourcing service.

At this year’s Computer Vision and Pattern Recognition conference, CSAIL researchers will present a fully automated version of the same system. Where the previous paper reported the analysis of one type of neural network trained to perform one task, the new paper reports the analysis of four types of neural networks trained to perform more than 20 tasks, including recognizing scenes and objects, colorizing grey images, and solving puzzles. Some of the new networks are so large that analyzing any one of them would have been cost-prohibitive under the old method.

The researchers also conducted several sets of experiments on their networks that not only shed light on the nature of several computer-vision and computational-photography algorithms, but could also provide some evidence about the organization of the human brain.

Neural networks are so called because they loosely resemble the human nervous system, with large numbers of fairly simple but densely connected information-processing “nodes.” Like neurons, a neural net’s nodes receive information signals from their neighbors and then either “fire” — emitting their own signals — or don’t. And as with neurons, the strength of a node’s firing response can vary.

In both the new paper and the earlier one, the MIT researchers doctored neural networks trained to perform computer vision tasks so that they disclosed the strength with which individual nodes fired in response to different input images. Then they selected the 10 input images that provoked the strongest response from each node.

In the earlier paper, the researchers sent the images to workers recruited through Mechanical Turk, who were asked to identify what the images had in common. In the new paper, they use a computer system instead.

## Paper folding patterns to produce any 3 D structure.

In a 1999 paper, Erik Demaine — now an MIT professor of electrical engineering and computer science, but then an 18-year-old PhD student at the University of Waterloo, in Canada — described an algorithm that could determine how to fold a piece of paper into any conceivable 3-D shape.

It was a milestone paper in the field of computational origami, but the algorithm didn’t yield very practical folding patterns. Essentially, it took a very long strip of paper and wound it into the desired shape. The resulting structures tended to have lots of seams where the strip doubled back on itself, so they weren’t very sturdy.

At the Symposium on Computational Geometry in July, Demaine and Tomohiro Tachi of the University of Tokyo will announce the completion of a quest that began with that 1999 paper: a universal algorithm for folding origami shapes that guarantees a minimum number of seams.

“In 1999, we proved that you could fold any polyhedron, but the way that we showed how to do it was very inefficient,” Demaine says. “It’s efficient if your initial piece of paper is super-long and skinny. But if you were going to start with a square piece of paper, then that old method would basically fold the square paper down to a thin strip, wasting almost all the material. The new result promises to be much more efficient. It’s a totally different strategy for thinking about how to make a polyhedron.”

Demaine and Tachi are also working to implement the algorithm in a new version of Origamizer, the free software for generating origami crease patterns whose first version Tachi released in 2008.

Maintaining boundaries

The researchers’ algorithm designs crease patterns for producing any polyhedron — that is, a 3-D surface made up of many flat facets. Computer graphics software, for instance, models 3-D objects as polyhedra consisting of many tiny triangles. “Any curved shape you could approximate with lots of little flat sides,” Demaine explains.

Technically speaking, the guarantee that the folding will involve the minimum number of seams means that it preserves the “boundaries” of the original piece of paper. Suppose, for instance, that you have a circular piece of paper and want to fold it into a cup. Leaving a smaller circle at the center of the piece of paper flat, you could bunch the sides together in a pleated pattern; in fact, some water-cooler cups are manufactured on this exact design.

In this case, the boundary of the cup — its rim — is the same as that of the unfolded circle — its outer edge. The same would not be true with the folding produced by Demaine and his colleagues’ earlier algorithm. There, the cup would consist of a thin strip of paper wrapped round and round in a coil — and it probably wouldn’t hold water.

“The new algorithm is supposed to give you much better, more practical foldings,” Demaine says. “We don’t know how to quantify that mathematically, exactly, other than it seems to work much better in practice. But we do have one mathematical property that nicely distinguishes the two methods. The new method keeps the boundary of the original piece of paper on the boundary of the surface you’re trying to make. We call this watertightness.”

A closed surface — such as a sphere — doesn’t have a boundary, so an origami approximation of it will require a seam where boundaries meet. But “the user gets to choose where to put that boundary,” Demaine says. “You can’t get an entire closed surface to be watertight, because the boundary has to be somewhere, but you get to choose where that is.”

## Decisions when outcomes are uncertain

Markov decision processes are mathematical models used to determine the best courses of action when both current circumstances and future consequences are uncertain. They’ve had a huge range of applications — in natural-resource management, manufacturing, operations management, robot control, finance, epidemiology, scientific-experiment design, and tennis strategy, just to name a few.

But analyses involving Markov decision processes (MDPs) usually make some simplifying assumptions. In an MDP, a given decision doesn’t always yield a predictable result; it could yield a range of possible results. And each of those results has a different “value,” meaning the chance that it will lead, ultimately, to a desirable outcome.

Characterizing the value of given decision requires collection of empirical data, which can be prohibitively time consuming, so analysts usually just make educated guesses. That means, however, that the MDP analysis doesn’t guarantee the best decision in all cases.

In the *Proceedings of the Conference on Neural Information Processing Systems*, published last month, researchers from MIT and Duke University took a step toward putting MDP analysis on more secure footing. They show that, by adopting a simple trick long known in statistics but little applied in machine learning, it’s possible to accurately characterize the value of a given decision while collecting much less empirical data than had previously seemed necessary.

In their paper, the researchers described a simple example in which the standard approach to characterizing probabilities would require the same decision to be performed almost 4 million times in order to yield a reliable value estimate.

With the researchers’ approach, it would need to be run 167,000 times. That’s still a big number — except, perhaps, in the context of a server farm processing millions of web clicks per second, where MDP analysis could help allocate computational resources. In other contexts, the work at least represents a big step in the right direction.

“People are not going to start using something that is so sample-intensive right now,” says Jason Pazis, a postdoc at the MIT Laboratory for Information and Decision Systems and first author on the new paper. “We’ve shown one way to bring the sample complexity down. And hopefully, it’s orthogonal to many other ways, so we can combine them.”

Unpredictable outcomes

In their paper, the researchers also report running simulations of a robot exploring its environment, in which their approach yielded consistently better results than the existing approach, even with more reasonable sample sizes — nine and 105. Pazis emphasizes, however, that the paper’s theoretical results bear only on the number of samples required to estimate values; they don’t prove anything about the relative performance of different algorithms at low sample sizes.

Pazis is joined on the paper by Jonathan How, the Richard Cockburn Maclaurin Professor of Aeronautics and Astronautics at MIT, and by Ronald Parr, a professor of computer science at Duke.