MIT News

Faster Page Loads

System allocates data center bandwidth more fairly, so no part of a web page lags behind others

Published: Tuesday, April 25, 2017 - 11:00

(MIT News: Cambridge, MA) -- A web page today is often the sum of many different components. A user’s home page on a social-networking site, for instance, might display the latest posts from the user’s friends; the associated images, links, and comments; notifications of pending messages and comments on the user’s own posts; a list of events; a list of topics currently driving online discussions; a list of games, some of which are flagged to indicate that it’s the user’s turn; and of course the all-important ads, which the site depends on for revenue.

With increasing frequency, each of those components is handled by a different program running on a different server in the website’s data center. That reduces processing time, but it exacerbates another problem: the equitable allocation of network bandwidth among programs.

Many websites aggregate all of a page’s components before shipping them to the user. So if just one program has been allocated too little bandwidth on the data center network, the rest of the page—and the user—could be stuck waiting for its component.
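The effect of aggregation can be shown with a little arithmetic. The sketch below is illustrative only: the function name, component sizes, and bandwidth figures are made up for the example and do not come from the researchers' work. It assumes each component is sent at a fixed rate and that the page is ready only when its slowest component has arrived.

```python
def page_ready_time(sizes_megabits, rates_mbps):
    """Seconds until every component of the page has arrived.

    The page is assembled only once all components are in, so the
    slowest single transfer sets the total time.
    """
    return max(size / rate for size, rate in zip(sizes_megabits, rates_mbps))


# Three hypothetical components of 8 megabits each, sharing 12 Mbps in total.
sizes = [8, 8, 8]
even_split = [4, 4, 4]        # same total bandwidth, shared equally
uneven_split = [8, 3.5, 0.5]  # same total bandwidth, one program starved

print(page_ready_time(sizes, even_split))    # 2.0
print(page_ready_time(sizes, uneven_split))  # 16.0
```

Both splits use the same 12 Mbps, so overall throughput is identical, yet the starved component makes the page eight times slower to assemble.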

At the Usenix Symposium on Networked Systems Design and Implementation this week, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) are presenting a new system for allocating bandwidth in data center networks. In tests, the system maintained the same overall data transmission rate—or network “throughput”—as those currently in use, but it allocated bandwidth much more fairly, completing the download of all of a page’s components up to four times as quickly.

The new system is called Flowtune, and it is described in the paper “Flowtune: Flowlet Control for Datacenter Networks.”

“There are easy ways to maximize throughput in a way that divides up the resource very unevenly,” says Hari Balakrishnan, one of the senior authors of the paper. “What we have shown is a way to very quickly converge to a good allocation.”

Balakrishnan, the Fujitsu Professor in Electrical Engineering and Computer Science, is joined on the paper by first author Jonathan Perry, a graduate student in electrical engineering and computer science, and Devavrat Shah, a professor of electrical engineering and computer science.

Central authority

Most networks regulate data traffic using some version of the Transmission Control Protocol (TCP). When traffic gets too heavy, some packets of data don’t make it to their destinations. Under TCP, when a sender realizes its packets aren’t getting through, it halves its transmission rate, then slowly ratchets it back up. Given enough time, this procedure will reach an equilibrium point at which network bandwidth is optimally allocated among senders.
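The halve-then-ramp behavior described above can be sketched as a toy simulation. This is a simplified model, not real TCP: it assumes every sender on a single shared link detects loss in the same round, and the function name, capacity, and starting rates are invented for illustration.

```python
def aimd(capacity, rates, rounds, increase=1.0):
    """Toy model of senders sharing one link.

    Each round, if total demand exceeds the link capacity, every sender
    halves its rate (assumed synchronized loss); otherwise every sender
    raises its rate by a fixed step.
    Returns the list of rate tuples, one per round, starting with round 0.
    """
    rates = list(rates)
    history = [tuple(rates)]
    for _ in range(rounds):
        if sum(rates) > capacity:
            rates = [r / 2 for r in rates]         # back off sharply
        else:
            rates = [r + increase for r in rates]  # ramp up slowly
        history.append(tuple(rates))
    return history


# Two senders start far apart on a link of capacity 100.
history = aimd(capacity=100, rates=[1, 80], rounds=200)
for round_number in (0, 20, 100, 200):
    slow, fast = history[round_number]
    print(round_number, round(fast - slow, 2))
```

In this model the gap between the two senders shrinks only when a loss event halves both rates, and the additive steps in between leave it unchanged. The senders do drift toward equal shares, but it takes many rounds, which is the slowness Perry describes for a fast-changing data center.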

But in a big website’s data center, there’s often not enough time. “Things change in the network so quickly that this is inadequate,” Perry says. “Frequently it takes so long that [the transmission rates] never converge, and it’s a lost cause.”

Transmission control protocol gives all responsibility for traffic regulation to the end users because it was designed for the public internet, which links together thousands of smaller, independently owned and operated networks. Centralizing the control of such a sprawling network seemed infeasible, both politically and technically.

But a data center is controlled by a single operator, and with the increases in the speed of both data connections and computer processors over the last decade, centralized regulation has become practical. The CSAIL researchers’ system takes that centralized approach.

The Flowtune system essentially adopts a market-based solution to bandwidth allocation. Operators assign different values to increases in the transmission rates of data sent by different programs. For instance, doubling the transmission rate of the image at the center of a web page might be worth 50 points, while doubling the transmission rate of analytics data that’s reviewed only once or twice a day might be worth only five points.
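One way to read the point values is as weights on a logarithmic value function, since a logarithm is the shape for which every doubling of the rate is worth the same amount. The article does not say which function Flowtune uses, so the sketch below is an assumption, and the function names and link capacity are hypothetical. For a single shared link, maximizing the total of such values gives each sender a share proportional to its weight.

```python
import math


def utility(weight, rate):
    """Value of sending at `rate`; each doubling of the rate adds `weight` points."""
    return weight * math.log2(rate)


def split_single_link(capacity, weights):
    """Rates that maximize total utility on one link: shares proportional to weight."""
    total = sum(weights.values())
    return {name: capacity * w / total for name, w in weights.items()}


# Doubling the image's rate from 4 to 8 is worth 50 points.
print(utility(50, 8) - utility(50, 4))

# The article's 50-point image and 5-point analytics feed on one
# hypothetical link of capacity 11.
print(split_single_link(11, {"image": 50, "analytics": 5}))
```

With these numbers the image gets ten times the bandwidth of the analytics data, matching the ten-to-one ratio of their point values.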

Supply and demand

As in any good market, every link in the network sets a “price” according to “demand”—that is, according to the amount of data that senders collectively want to send over it. For every pair of sending and receiving computers, Flowtune then calculates the transmission rate that maximizes total “profit,” or the difference between the value of increased transmission rates—the 50 points for the picture vs. the five for the analytics data—and the price of the requisite bandwidth across all the intervening links.

The maximization of profit, however, changes demand across the links, so Flowtune continually recalculates prices, and on that basis, recalculates maximum profits, assigning the resulting transmission rates to the servers sending data across the network.
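The price-and-recalculate loop can be illustrated with a small sketch. This is not Flowtune's actual algorithm, which the article does not spell out. It is a generic price-adjustment iteration under stated assumptions: each sender's value is weight times the natural log of its rate, each link nudges its price up or down by a damped multiplicative step, and every link carries at least one flow. All names, weights, and capacities are hypothetical.

```python
def allocate(flows, capacity, rounds=200, damping=0.5):
    """Iterate prices and rates until they settle.

    flows:    name -> (weight, list of links on the flow's path)
    capacity: link -> capacity of that link
    Returns (rates, prices).
    """
    price = {link: 1.0 for link in capacity}
    rates = {}
    for _ in range(rounds):
        # Each flow picks the rate that maximizes its "profit":
        # weight * ln(rate) - rate * (sum of prices along its path).
        rates = {
            name: weight / sum(price[link] for link in path)
            for name, (weight, path) in flows.items()
        }
        # Each link adds up the demand placed on it.
        load = {link: 0.0 for link in capacity}
        for name, (_, path) in flows.items():
            for link in path:
                load[link] += rates[name]
        # Overloaded links raise their price; underused links lower it.
        for link in capacity:
            price[link] *= (load[link] / capacity[link]) ** damping
    return rates, price


# Hypothetical two-link network: the analytics flow crosses both links.
flows = {
    "image": (50, ["A"]),
    "analytics": (5, ["A", "B"]),
    "feed": (20, ["B"]),
}
capacity = {"A": 10.0, "B": 10.0}

rates, price = allocate(flows, capacity)
print({name: round(rate, 3) for name, rate in rates.items()})
print({link: round(p, 3) for link, p in price.items()})
```

In this example the prices settle where each link is exactly full. The low-value analytics flow pays the price of both links and ends up with a small rate, while the image and feed flows split what remains on their own links.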

The paper also describes a new procedure that the researchers developed for allocating Flowtune’s computations across cores in a multicore computer, to boost efficiency. In experiments, the researchers compared Flowtune to a widely used variation on transmission control protocol, using data from real data centers. Depending on the data set, Flowtune completed the slowest 1 percent of data requests nine to 11 times as rapidly as the existing system.

“Scheduling—and, ultimately, providing guarantees of network performance—in modern data centers is still an open question,” says Rodrigo Fonseca, an assistant professor of computer science at Brown University. “For example, while cloud providers offer guarantees of CPU, memory, and disk, you usually cannot get any guarantees of network performance.”

“Flowtune advances the state of the art in this area by using a central allocator with global knowledge,” Fonseca says. “Centralized solutions are potentially better because of the global view of the network, but it is very challenging to use them at scale, because of the sheer volume of traffic. [There is] too much information to aggregate, process, and distribute for each decision. This work pushes the boundary of what was thought possible with centralized solutions. There are still questions of how much further this can be scaled, but this solution is already usable by many data center operators.”

About The Author

MIT News

MIT News is the Massachusetts Institute of Technology’s (MIT) central hub for news about MIT research, initiatives, and events. It reports MIT news directly and works with journalists around the world to help showcase the achievements of its students, faculty, and staff.