Featured Product
This Week in Quality Digest Live
Operations Features
Huw Thomas
A long-awaited expansion of workers’ rights
Tom Rish
Keep it organized and ready for inspection at any time
MSMEs are encouraged to uphold the highest standards
It’s actually the differences in a twin that are most useful
Del Williams
Preventing damage caused by large, suspended particles

More Features

Operations News
Maximum work envelope in a small footprint
On-demand pipe flow measurement, no process interruptions
Extends focus on data-driven explainability and adds customizability
New capability delivers deeper productivity insights to help manufacturers meet labor challenges
Day and a half workshop to learn, retain, and transfer GD&T knowledge across an organization
Making designs a physical reality with the know-how to make more
Major ERP projects take six months longer than companies were told

More News

Adam Zewe


Keeping Web-Browsing Data Safe From Hackers

Researchers find the root cause of side-channel attacks that are easy to implement but difficult to detect

Published: Tuesday, July 5, 2022 - 12:02

Malicious agents can use machine learning to launch powerful attacks that steal information in ways that are tough to prevent and often even more difficult to study.

Attackers can capture data that “leak” between software programs running on the same computer. They then use machine-learning algorithms to decode those signals, which enables them to obtain passwords or other private information. These are called “side-channel attacks” because information is acquired through a channel not meant for communication.

Researchers at MIT have shown that machine learning-assisted side-channel attacks are both extremely robust and poorly understood. The use of machine-learning algorithms, which are often impossible to fully comprehend due to their complexity, is a particular challenge. In a new paper, the team studied a documented attack that was thought to work by capturing signals leaked when a computer accesses memory. They found that the mechanisms behind this attack were misidentified, which would prevent researchers from crafting effective defenses.

To study the attack, they removed all memory accesses and noticed the attack became even more powerful. Then they searched for sources of information leaks and found that the attack actually monitors events that interrupt a computer’s other processes. They show that an adversary can use this machine learning-assisted attack to exploit a security flaw and determine the website a user is browsing with almost perfect accuracy.

With this knowledge in hand, the team developed two strategies that can thwart this attack.

“The focus of this work is really on the analysis to find the root cause of the problem,” says senior author Mengjia Yan, the Homer A. Burnell Career Development assistant professor of electrical engineering and computer science (EECS) and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT. “As researchers, we should really try to delve deeper and do more analysis work, rather than just blindly using black-box machine-learning tactics to demonstrate one attack after another. The lesson we learned is that these machine learning-assisted attacks can be extremely misleading.”

The lead author of the paper is Jack Cook, a recent graduate in computer science. Co-authors include CSAIL graduate student Jules Drean and Jonathan Behrens. The research will be presented at the International Symposium on Computer Architecture.

A side-channel surprise

Cook launched the project while taking Yan’s advanced seminar course. For a class assignment, he tried to replicate a machine learning-assisted side-channel attack from the literature. Past work had concluded that this attack counts how many times the computer accesses memory as it loads a website and then uses machine learning to identify the website. This is known as a website-fingerprinting attack.

He showed that prior work relied on a flawed machine learning-based analysis to incorrectly pinpoint the source of the attack. Machine learning can’t prove causality in these types of attacks, Cook says.

“All I did was remove the memory access, and the attack still worked just as well, or even better,” Cook says. “So, then I wondered, what actually opens up the side channel?”

This led to a research project in which Cook and his collaborators embarked on a careful analysis of the attack. They designed an almost identical attack, but without memory accesses, and studied it in detail.

They found that the attack actually records a computer’s timer values at fixed intervals and uses that information to infer what website is being accessed. Essentially, the attack measures how busy the computer is over time.

A fluctuation in the timer value means the computer is processing a different amount of information in that interval. This is due to system interrupts. A system interrupt occurs when the computer’s processes are interrupted by requests from hardware devices; the computer must pause what it’s doing to handle the new request.

When a website is loading, it sends instructions to a web browser to run scripts, render graphics, load videos, etc. Each of these can trigger many system interrupts.

An attacker monitoring the timer can use machine learning to infer high-level information from these system interrupts to determine what website a user is visiting. This is possible because interrupt activity generated by one website, like CNN.com, is very similar each time it loads, but very different from other websites, like Wikipedia.com, Cook explains.

“One of the really scary things about this attack is that we wrote it in JavaScript, so you don’t have to download or install any code,” he says. “All you have to do is open a website. Someone could embed this into a website and then theoretically be able to snoop on other activity on your computer.”

The attack is extremely successful. For instance, when a computer is running Chrome on the Mac operating system, the attack was able to identify websites with 94-percent accuracy. All commercial browsers and operating systems they tested resulted in an attack with more than 91-percent accuracy.

There are many factors that can affect a computer’s timer, so determining what led to an attack with such high accuracy was akin to finding a needle in a haystack, Cook says. The team ran many controlled experiments, removing one variable at a time, until they realized the signal must be coming for system interrupts, which often can’t be processed separately from the attacker’s code.

Fighting back

Once the researchers understood the attack, they crafted security strategies to prevent it.

First, they created a browser extension that generates frequent interrupts, like pinging random websites to create bursts of activity. The added noise makes it much more difficult for the attacker to decode signals. This dropped the attack’s accuracy from 96 percent to 62 percent, but it slowed the computer’s performance.

For their second countermeasure, they modified the timer to return values that are close to, but not the actual time. This makes it much harder for an attacker to measure the computer’s activity over an interval, Cook explains. This mitigation cut the attack’s accuracy from 96 percent down to just 1 percent.

“I was surprised by how such a small mitigation like adding randomness to the timer could be so effective,” he says. “This mitigation strategy could really be put in use today. It doesn’t affect how you use most websites.”

Building from this work, the researchers plan to develop a systematic analysis framework for machine learning-assisted side-channel attacks. This could help the researchers get to the root cause of more attacks, Yan says. They also want to see how they can use machine learning to discover other types of vulnerabilities.

“This paper presents a new interrupt-based side-channel attack and demonstrates that it can be effectively used for website fingerprinting attacks, while previously, such attacks were believed to be possible due to cache side channels,” says Yanjing Li, assistant professor in the Department of Computer Science at the University of Chicago, who was not involved with this research. “I liked this paper immediately after I first read it, not only because the new attack is interesting and successfully challenges existing notions, but also because it points out a key limitation of ML-assisted side-channel attacks: Blindly relying on machine-learning models without careful analysis cannot provide any understanding on the actual causes or sources of an attack, and can even be misleading. This is very insightful, and I believe will inspire many future works in this direction.”

This research was funded, in part, by the National Science Foundation, the Air Force Office of Scientific Research, and the MIT-IBM Watson AI Lab.

First published June 9, 2022, on MIT News.


About The Author

Adam Zewe’s picture

Adam Zewe

Adam Zewe is a writer for Massachusetts Institute of Technology, covering the electrical engineering and computer science beat in the MIT News Office.