Traffic management solutions for data center networks

The transmission control protocol, or TCP, which manages traffic on the internet, was first proposed in 1974. Some version of TCP still regulates data transfer in most major data centers, the huge warehouses of servers maintained by popular websites.

That’s not because TCP is perfect or because computer scientists have had trouble coming up with possible alternatives; it’s because those alternatives are too hard to test. The routers in data center networks have their traffic management protocols hardwired into them. Testing a new protocol means replacing the existing network hardware with either reconfigurable chips, which are labor-intensive to program, or software-controlled routers, which are so slow that they render large-scale testing impractical.

At the Usenix Symposium on Networked Systems Design and Implementation later this month, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory will present a system for testing new traffic management protocols that requires no alteration to network hardware but still works at realistic speeds — 20 times as fast as networks of software-controlled routers.

The system maintains a compact, efficient computational model of a network running the new protocol, with virtual data packets that bounce around among virtual routers. On the basis of the model, it schedules transmissions on the real network to produce the same traffic patterns. Researchers could thus run real web applications on the network servers and get an accurate sense of how the new protocol would affect their performance.

“The way it works is, when an endpoint wants to send a [data] packet, it first sends a request to this centralized emulator,” says Amy Ousterhout, a graduate student in electrical engineering and computer science (EECS) and first author on the new paper. “The emulator emulates in software the scheme that you want to experiment with in your network. Then it tells the endpoint when to send the packet so that it will arrive at its destination as though it had traversed a network running the programmed scheme.”

Ousterhout is joined on the paper by her advisor, Hari Balakrishnan, the Fujitsu Professor in Electrical Engineering and Computer Science; Jonathan Perry, a graduate student in EECS; and Petr Lapukhov of Facebook.

Traffic control

Each packet of data sent over a computer network has two parts: the header and the payload. The payload contains the data the recipient is interested in — image data, audio data, text data, and so on. The header contains the sender’s address, the recipient’s address, and other information that routers and end users can use to manage transmissions.

When multiple packets reach a router at the same time, they’re put into a queue and processed sequentially. With TCP, if the queue gets too long, subsequent packets are simply dropped; they never reach their recipients. When a sending computer realizes that its packets are being dropped, it cuts its transmission rate in half, then slowly ratchets it back up.