Reconfiguring Data Centers

Summary

The goal of this project is to design and implement reconfigurable distributed systems that cope with their inherent dynamism and elasticity requirements at large scale.

Supervisor(s)

Dr Vincent Gramoli

Research Location

Information Technologies

Program Type

PhD

Synopsis

Today's datacenters must provide elasticity (the ability of a distributed system to expand and shrink its resources according to user demand) and fault tolerance (the ability to hide the software or hardware failure of a node by replacing it with another). A common solution to both problems is reconfiguration: replacing the configuration currently used by clients of the distributed system with a new configuration of more or fewer nodes, all of which are non-faulty. The crux is to ensure that the service the distributed system provides to its clients is not disrupted during a reconfiguration.

The goal of this research project, in collaboration with NICTA, is to implement a reconfigurable distributed system that will be used at the core of a datacenter. It will exploit reconfigurable quorum systems, where quorums (mutually intersecting sets of servers) are replaced by new ones that exclude servers that are unreliable or under-utilized. A consensus protocol should be designed and implemented to totally order the sequence of configurations installed in the system. A well-known consensus algorithm, Paxos, is used by Microsoft and Google and relies on a leader election that terminates when messages get delivered. While using such a leader makes reconfiguration simple, it unfortunately limits availability, as noted in the open-source Yahoo! ZooKeeper system. To remedy this issue, reconfiguration requests should instead be routed directly to a distributed set of quorum nodes. ZooKeeper could serve as a representative testbed for applications running in Twitter, Facebook, Netflix and LinkedIn's datacenters to validate the solution.
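
As a rough illustration of the idea of reconfigurable quorums, the following Python sketch builds majority quorums over a configuration, checks that they mutually intersect, and installs new configurations that exclude chosen servers. It is only a sketch under stated assumptions: the function names (majority_quorums, mutually_intersecting, reconfigure) and server names are hypothetical, majority quorums are just one possible quorum system, and the ordered list of configurations stands in for the decisions a consensus protocol would make in a real deployment.

```python
from itertools import combinations


def majority_quorums(servers):
    """Return every minimal majority quorum of the given servers.

    Assumption: majority quorums are used here only as a simple example
    of a quorum system; the project does not prescribe this choice.
    """
    size = len(servers) // 2 + 1
    return [frozenset(q) for q in combinations(sorted(servers), size)]


def mutually_intersecting(quorums):
    """Check the defining property: any two quorums share a server."""
    return all(a & b for a, b in combinations(quorums, 2))


def reconfigure(history, excluded, added=()):
    """Append a new configuration that drops `excluded` and adds `added`.

    `history` is the totally ordered list of installed configurations.
    Assumption: a plain list append stands in for the consensus step
    that would agree on each new configuration in a real system.
    """
    new_config = (history[-1] - frozenset(excluded)) | frozenset(added)
    if not new_config:
        raise ValueError("a configuration needs at least one server")
    return history + [new_config]


# Hypothetical servers: replace an unreliable node, then shrink the system.
history = [frozenset({"s1", "s2", "s3", "s4", "s5"})]
history = reconfigure(history, excluded={"s4"}, added={"s6"})
history = reconfigure(history, excluded={"s5"})

old_quorums = majority_quorums(history[0])
new_quorums = majority_quorums(history[-1])
```

Any two majorities of the same configuration overlap in at least one server, which is why the intersection check holds both before and after the reconfigurations. The sketch does not address the hard part the project targets, namely keeping the service available to clients while the system moves from one configuration to the next.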

Keywords

Cloud, Data Center, Distributed System, Upgrade, Update, Elastic, Failure, Software Engineering

Opportunity ID

The opportunity ID for this research opportunity is: 1791
