Full Title: Autonomic Byzantine Fault-Tolerant Systems
This project aims to design, implement, and evaluate techniques for building autonomic Byzantine fault-tolerant (BFT) systems. The topic is highly relevant: society depends more and more on Internet-based services that must be protected from external attacks. There is anecdotal evidence that even non-critical systems, such as gaming platforms, may suffer severe attacks for almost frivolous reasons. For instance, over Christmas 2014, the PlayStation online service was unavailable for several days due to a denial-of-service attack that may have been associated with the release of a film (The Interview). Such attacks often have severe consequences, such as the leaking of confidential information (credit card details, clinical data, etc.). Byzantine fault tolerance can play a critical role in making these systems robust, and adaptation is key to maintaining performance, responding to changes in the threat level, and making attacks harder (moving target defence). This project aims to contribute in this direction by taking the following steps:
- One needs to study the techniques that can help define the best adaptation plan for each scenario. As listed before, different approaches have been tried in systems that do not tolerate Byzantine faults, such as machine learning, analytical modeling, runtime selection of adaptations, and static user-defined policies. The research group already has experience in using these techniques under crash failure models. Many of these techniques implement eager adaptation policies, i.e., they select the adaptations that provide the largest immediate benefit. Planning techniques, which may help avoid local minima in the dynamic adaptation of software systems, are another research line that we aim to explore in this project. Therefore, regarding the policies that drive the adaptation, we will study the pros and cons of each of these techniques in the context of BFT systems.
- We will design and implement a robust adaptation manager that will select when and how to adapt, based on information collected from the environment and on a set of policies that capture the high-level goals that the system must satisfy. In this context, one needs to study which information about the workloads and the operational envelope must be collected to guide the system adaptation. Typical parameters that have been used in other autonomic systems are also relevant in the BFT case. These include information about the resources of the computing nodes (CPU, memory, disk bandwidth, etc.) and of the network (latency, throughput, message losses, network partitions). In our previous work we designed adaptation managers that can be used as a starting point to achieve this goal. Furthermore, one also needs to take into consideration that some of these parameters may be intentionally disturbed by an adversary with the aim of inducing undesirable adaptations. Additionally, novel sensors that can detect threats or ongoing attacks should also be used to trigger dynamic changes that preserve the resilience of the system; these sensors can be built using existing Distributed Intrusion Detection Systems.
- One needs to design BFT protocols that are adaptable, i.e., that can change their configuration, parameters, and even algorithms at runtime in a secure manner. This involves using modular designs in the implementation of the protocols, so that it is possible to change only a subset of the protocol behaviors if needed. The group has strong expertise in the design of adaptable protocols in general (including the implementation of a framework for protocol adaptation) and also concrete expertise in the modular design of Byzantine protocols.
- Finally, we plan to build an integrated prototype of the complete system, including the autonomic manager and a distributed Byzantine storage system based on BFT protocols. The prototype will be used to evaluate the solution under both laboratory and realistic conditions. The clusters of both groups will be used to run micro-benchmarks under controlled conditions, but a deployment on cloud-based infrastructures is also planned. The prototype will also be integrated in a real use case, namely building an adaptive Byzantine fault-tolerant version of an authentication service for secure handheld devices. A version of this service, without support for adaptation or Byzantine fault tolerance, is being developed by a European consortium as part of an ecosystem that will support industrial applications in the areas of e-health and e-commerce.
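The contrast drawn in the first step between eager adaptation policies and planning can be illustrated with a toy example. The sketch below is not part of the project's design: the three configurations ("A", "B", "C"), their throughput figures, and the switching costs are all invented for illustration, and exhaustive search stands in for whatever planning technique the project would actually adopt. It only shows how a policy that maximizes the immediate benefit can get stuck in a locally optimal configuration that a planner with a longer horizon avoids.

```python
from itertools import product

# Hypothetical configurations and their per-step throughput (invented numbers).
THROUGHPUT = {"A": 10, "B": 12, "C": 20}
# Hypothetical one-off cost of reconfiguring between two configurations.
SWITCH_COST = {frozenset("AB"): 1, frozenset("AC"): 15, frozenset("BC"): 15}


def step_reward(src, dst):
    """Throughput obtained in one step, minus the cost of switching (if any)."""
    cost = 0 if src == dst else SWITCH_COST[frozenset((src, dst))]
    return THROUGHPUT[dst] - cost


def plan_value(start, plan):
    """Cumulative reward of following a sequence of configurations."""
    total, current = 0, start
    for nxt in plan:
        total += step_reward(current, nxt)
        current = nxt
    return total


def eager_plan(start, horizon):
    """Eager policy: at every step pick the largest immediate reward."""
    plan, current = [], start
    for _ in range(horizon):
        current = max(THROUGHPUT, key=lambda dst: step_reward(current, dst))
        plan.append(current)
    return plan


def lookahead_plan(start, horizon):
    """Planning policy: search all sequences and keep the best cumulative reward."""
    candidates = product(THROUGHPUT, repeat=horizon)
    return list(max(candidates, key=lambda plan: plan_value(start, plan)))


eager = eager_plan("A", 5)
planned = lookahead_plan("A", 5)
print(eager, plan_value("A", eager))
print(planned, plan_value("A", planned))
```

With these made-up numbers, the eager policy moves to "B" and stays there, because the expensive switch to "C" never looks attractive one step at a time, while the planner accepts the initial cost and ends with a higher cumulative reward. Exhaustive search is only viable for tiny examples like this one.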
In summary, the major research results of the project will be:
- The investigation and design of policies for autonomic BFT systems.
- The implementation of a robust autonomic manager for BFT systems.
- The investigation and design of adaptable BFT protocols.
- The implementation and evaluation of a prototype of the resulting system.
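As an illustration of why the adaptation manager must be robust to measurements that an adversary intentionally disturbs, the following minimal sketch aggregates latency readings from several sensors. It assumes at most f of the sensors are faulty and at least 2f + 1 readings are available. The function name `trimmed_estimate`, the sample readings, and the 100 ms threshold are hypothetical and not taken from the project. It uses a standard trimming technique: after discarding the f lowest and f highest readings, every remaining value lies within the range reported by correct sensors.

```python
def trimmed_estimate(readings, f):
    """Drop the f lowest and f highest readings, then average what is left.

    Assumes at most f readings come from faulty (possibly malicious) sensors.
    """
    if len(readings) < 2 * f + 1:
        raise ValueError("need at least 2f + 1 readings to tolerate f faulty sensors")
    kept = sorted(readings)[f:len(readings) - f]
    return sum(kept) / len(kept)


# Hypothetical latency readings in milliseconds; the last one is forged.
latencies = [20, 22, 21, 500]
ADAPT_THRESHOLD_MS = 100  # hypothetical policy threshold

naive = sum(latencies) / len(latencies)
robust = trimmed_estimate(latencies, f=1)

print(naive > ADAPT_THRESHOLD_MS)   # the plain average would trigger an adaptation
print(robust > ADAPT_THRESHOLD_MS)  # the trimmed estimate would not
```

A single forged reading is enough to push the plain average over the threshold and induce an undesirable adaptation, whereas the trimmed estimate stays within the values reported by the correct sensors.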