Cerberus (/ˈsɜːrbərəs/) is a deterministic, Byzantine Fault Tolerant (BFT) consensus protocol that uses state provisioning to enable global consensus between shard groups on the correct ordering of software commands.
The protocol was first described in a March 2020 paper by Florian Cäsar, Dan Hughes, Josh Primero and Stephen Thornton, and has been designed specifically for Radix’s multi-shard architecture.
An evaluation of Cerberus was published in the Journal of Systems Research under the title ‘Cerberus: Minimalistic Multi-shard Byzantine-resilient Transaction Processing’.
| DEVELOPMENT | |
| --- | --- |
| Whitepaper | |

| LEDGER | |
| --- | --- |
| Sybil Protection | Delegated Proof of Stake |
| State Model | Sharded (2^256 shards) |
| Fault Tolerance | Byzantine Fault Tolerant |
| Execution Environment | Radix Engine v2 |
| Validator Node Cap | n/a |
History
Cerberus is the sixth iteration of Radix’s core technology and addresses the Weak Atom problem that was present in Tempo.
Overview
Like all consensus protocols, Cerberus solves the state machine replication (SMR) problem: the fault-tolerant replication of state across many nodes. The protocol builds on Byzantine Fault Tolerance (BFT)-style consensus, maintaining safety under asynchrony and liveness under synchrony, and favoring consistency over availability.
The majority of smart contract platforms today, such as Ethereum, are ‘fully replicated’, which restricts the performance of the overall system to the speed of an individual node. To address this scalability challenge, a number of ‘multi-shard’ consensus protocols have emerged, such as AHL, ByShard, Caper, Chainspace, RingBFT, and SharPer. These process transactions across shards, providing the ability to scale distributed ledgers linearly with the number of shards.
Cerberus was specifically developed to address the limitations of global ordering, which is a typical requirement in distributed ledgers and smart contract virtual machines. Unlike traditional consensus protocols, Cerberus only enforces the ordering of commands for related events. By eliminating the need for global ordering, Cerberus allows for effective state sharding and the parallelization of consensus, leading to increased throughput.
The Cerberus consensus protocol was created to improve the efficiency and performance of multi-shard consensus protocols by minimizing coordination costs. It is unique in its ability to process many UTXO-like objects concurrently within the same shard without explicit locking or blocking of the shard. Cerberus achieves this with an efficient method for disjoint validator sets to communicate with each other.
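To make the partial-ordering idea concrete, the sketch below models each command by the set of shards it touches and requires relative ordering only when two commands’ shard sets overlap. The types and names are illustrative assumptions, not part of the Cerberus specification.

```rust
use std::collections::HashSet;

type ShardId = u64;

/// Two commands must be ordered relative to each other only if their
/// shard footprints overlap; otherwise they are unrelated and can be
/// processed by consensus in parallel.
fn must_order(a: &HashSet<ShardId>, b: &HashSet<ShardId>) -> bool {
    !a.is_disjoint(b)
}

fn main() {
    let transfer_1: HashSet<ShardId> = [1, 2].into_iter().collect();
    let transfer_2: HashSet<ShardId> = [3, 4].into_iter().collect();
    let transfer_3: HashSet<ShardId> = [2, 5].into_iter().collect();

    assert!(!must_order(&transfer_1, &transfer_2)); // unrelated: run in parallel
    assert!(must_order(&transfer_1, &transfer_3)); // share shard 2: must be ordered
}
```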
At the execution layer, Cerberus splits applications into components that are functionally independent. The Radix Engine is an example of an application layer that is compatible with Cerberus.
Process
Cerberus is an active multi-decree (3-phase commit) consensus mechanism, requiring a supermajority for state commitment and capable of processing transactions in parallel. In contrast, Bitcoin is a single-decree mechanism that trades deterministic finality for probabilistic finality in exchange for a simpler communication protocol, and determines block leadership by hash power rather than random delegation.
The protocol has three major steps:
- Partition the application layer by formalizing the partial-ordering requirements of each command.
- Combine multiple "Local Cerberus" BFT instances into an "Emergent Cerberus" instance across shards, with properties analogous to those of a single traditional BFT instance.
- Parallelize Emergent Cerberus instances across shards, enabling massive and safe parallelization of the consensus process across the network.
Application Layer
Cerberus includes an application layer that determines how functions within the network operate. Five functions govern the state of a command from inception to commitment (a sketched interface follows the list):

- init: defines the initial state of the application.
- partition: determines which shards a command must be synchronized with to maintain correct partial ordering.
- map: collates all intermediate results of a command into an executable.
- reduce: shares intermediate results with every shard identified by partition.
- apply: applies the new shard state.
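The following sketch shows one way these five functions could be expressed as an interface. All type names and signatures are assumptions made for exposition; this is not the Radix Engine API.

```rust
type ShardId = u64;

/// Illustrative interface for a Cerberus application layer. The five
/// methods mirror the functions described above; the signatures are
/// guesses for exposition only.
trait CerberusApplication {
    type Command;
    type State;
    type Intermediate;
    type Executable;

    /// Defines the initial state of the application.
    fn init() -> Self::State;

    /// Determines which shards a command must be synchronized with to
    /// maintain correct partial ordering.
    fn partition(cmd: &Self::Command) -> Vec<ShardId>;

    /// Collates all intermediate results of a command into an executable.
    fn map(results: &[Self::Intermediate]) -> Self::Executable;

    /// Shares an intermediate result with every shard identified by `partition`.
    fn reduce(result: Self::Intermediate, shards: &[ShardId]);

    /// Applies the new shard state.
    fn apply(state: &mut Self::State, executable: Self::Executable);
}
```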
Local Cerberus
Each Local Cerberus instance operates as a single shard, served by a distinct set of nodes called a shard group. Multiple instances may exist in parallel. Command ordering is split into three phases:
- Prepare
- Pre-Commit
- Commit
Each phase requires a ‘quorum certificate’ (QC): proof that a supermajority of the shard group has signed its agreement.
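A minimal sketch of the per-phase quorum check, assuming a shard group of n = 3f + 1 validators and hypothetical types; the threshold shown is the standard BFT supermajority, not a confirmed Radix constant.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Phase {
    Prepare,
    PreCommit,
    Commit,
}

type ValidatorId = u64;

/// A quorum certificate for one phase: the set of validators in the
/// shard group that signed their agreement.
struct QuorumCertificate {
    phase: Phase,
    signers: Vec<ValidatorId>,
}

/// With n = 3f + 1 validators, a certificate is valid once at least
/// 2f + 1 distinct validators have signed (the BFT supermajority).
fn has_quorum(qc: &QuorumCertificate, group_size: usize) -> bool {
    let f = (group_size - 1) / 3; // tolerated Byzantine validators
    qc.signers.len() >= 2 * f + 1
}
```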
Emergent Cerberus
For commands that span multiple shards, the relevant Local Cerberus instances are ‘braided’ together into an Emergent Cerberus instance using synchronization primitives. The braided instance then behaves as a single cross-shard BFT instance.
Emergent Cerberus also introduces the concept of a ‘leader set’, elected by the relevant shard groups. The leader set proposes ‘merged proposals’, which are voted on to produce QCs.
Emergent Cerberus is designed to handle leader failures, forks, and merges effectively, ensuring safety and liveness in the face of such challenges.
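One way to picture the braided commit rule: a cross-shard command commits only when every involved shard has produced a valid local QC for it. The sketch below is a hypothetical illustration; the field and function names are assumptions.

```rust
use std::collections::HashMap;

type ShardId = u64;
type CommandId = [u8; 32];

/// Hypothetical merged proposal: a cross-shard command together with
/// the signatures gathered from each involved Local Cerberus instance.
struct MergedProposal {
    command: CommandId,
    /// Number of validator signatures gathered per involved shard.
    signatures_per_shard: HashMap<ShardId, usize>,
}

/// The braided (Emergent) instance commits the command only if every
/// involved shard reached its local BFT supermajority (2f + 1 of 3f + 1).
fn emergent_commit(p: &MergedProposal, involved: &[ShardId], group_size: usize) -> bool {
    let quorum = 2 * ((group_size - 1) / 3) + 1;
    involved
        .iter()
        .all(|s| p.signatures_per_shard.get(s).copied().unwrap_or(0) >= quorum)
}
```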
Parallelization
The protocol further allows for the parallelization of emergent Cerberus instances across a fixed set of shards. This parallelization introduces a tradeoff between security and scalability, allowing the system designer to select a minimum threshold of per-shard node coverage that achieves high practical security, while maximizing parallelism and scale.
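As a back-of-the-envelope illustration of this tradeoff (the arithmetic is an assumption for exposition, not a formula from the Cerberus paper): with a fixed validator pool, raising the minimum per-shard coverage directly caps how many shard groups can run in parallel.

```rust
/// With `total_nodes` validators and a required minimum of
/// `min_nodes_per_shard` validators covering each shard group, at most
/// this many disjoint shard groups can operate in parallel.
fn max_parallel_shard_groups(total_nodes: usize, min_nodes_per_shard: usize) -> usize {
    total_nodes / min_nodes_per_shard
}

fn main() {
    // 10,000 validators, at least 100 per shard group -> up to 100 groups.
    assert_eq!(max_parallel_shard_groups(10_000, 100), 100);
}
```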
Research
Variants
Cerberus comes in three variants:
- Core: The base variant, ideal for general use.
- Optimistic: Optimized for scenarios with high trust.
- Resilient: Designed for use in less trustworthy environments.
Performance
Cerberus and its peer consensus protocols were subjected to a variety of rigorous tests, including scenarios with varying numbers of replicas per shard, objects per transaction, and shards. Tests also covered situations with malicious nodes, gauging how the protocols perform under adverse conditions.
In these tests, all three variants of Cerberus demonstrated superior efficiency, throughput, and latency compared to other state-of-the-art multi-shard consensus protocols. Notably, Cerberus exhibited linear scalability, achieving over one million transactions per second in multiple scenarios.
Analysis
Cerberus is designed to be highly efficient and secure when implemented in distributed ledger technologies (DLTs) such as Radix. With its focus on sharding and parallelism, Cerberus is intended to handle high transaction volumes without compromising the security of each transaction, as each shard manages only a subset of the overall state.
The Radix Engine application layer, which defines dependencies between commands, works in tandem with the Cerberus-based Radix ledger. The key units in the Radix network are "atoms" and "particles": atoms are transactions created by the Radix Engine, each consisting of a set of discrete updates to state machines referred to as "particles".
The Radix Engine's functionality is similar to the Unspent Transaction Output (UTXO) model used in some blockchains, but extended for greater flexibility. The state of every coin or asset is tracked from creation to current holder, making the complete history of asset ownership verifiable.
Particles in this system each map directly to a shard. When an atom (transaction) takes place, all particles within it (i.e., all state changes) must be either committed or rejected in an atomic way.
This mechanism uses the UTXO-based logic provided by the Radix Engine to determine which input/output pairs are valid, while Cerberus enforces the atomic commitment of state updates across multiple particles. This allows multiple state changes to be processed simultaneously, increasing throughput and reducing latency in the network.
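A minimal sketch of this all-or-nothing commit rule, with hypothetical particle and vote types rather than the actual Radix Engine structures:

```rust
/// A UTXO-like particle: instantiated ("up") exactly once, then
/// consumed ("down") exactly once.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ParticleSpin {
    Up,   // state created
    Down, // state consumed
}

/// Hypothetical particle: a discrete state-machine update within an atom.
struct Particle {
    id: [u8; 32],
    spin: ParticleSpin,
}

/// An atom commits only if every shard responsible for one of its
/// particles accepts the corresponding state change; otherwise the
/// whole atom is rejected.
fn atom_commits(per_particle_votes: &[bool]) -> bool {
    per_particle_votes.iter().all(|&accepted| accepted)
}
```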
Furthermore, Cerberus relies on a signature aggregation scheme, such as BLS signatures, to keep its message complexity linear, in line with other leader-based Byzantine Fault Tolerant (BFT) protocols. Messaging complexity thus grows linearly with the number of nodes, allowing the system to remain efficient as it scales.
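The linear-complexity point can be made concrete with a simple message count (an illustration, not a precise accounting of Cerberus's messages): with vote aggregation, a leader-based round costs on the order of n messages, versus roughly n² for all-to-all voting.

```rust
/// Leader-based round with signature aggregation: n votes sent to the
/// leader, then one aggregated certificate broadcast back to n replicas.
fn leader_based_messages(n: usize) -> usize {
    2 * n // O(n)
}

/// All-to-all voting: every replica sends its vote to every other replica.
fn all_to_all_messages(n: usize) -> usize {
    n * (n - 1) // O(n^2)
}

fn main() {
    assert_eq!(leader_based_messages(100), 200);
    assert_eq!(all_to_all_messages(100), 9_900);
}
```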
This all means that the Radix network, using the Cerberus protocol, could potentially handle very large transaction volumes while maintaining high security and low latency. This makes it suitable for a wide range of applications that require secure, efficient, and scalable transaction processing.
Radix Implementation
RDX Works, the core developer of the Radix network, has signaled that a sharded version of Cerberus will be implemented in the Xi’an mainnet upgrade.
Details
Particle Identifiers: Particles are unique units within the Radix Engine. They are identified through a combined hash of the containing atom's identifier and their index within that atom. This scheme creates chains of hashes that are easy to verify and difficult to forge.
Shard Mapping: There is a one-to-one mapping of particles to shards, based on particle identifiers. This method provides good performance and security; however, it can make transaction read queries, such as requesting the complete transaction history of a given account, more complex.
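Together, the identifier and shard-mapping schemes might look roughly like the sketch below, which uses the sha2 crate for hashing; the actual hash function, encoding, and shard derivation used by Radix may differ.

```rust
use sha2::{Digest, Sha256};

/// A particle identifier: the hash of the containing atom's identifier
/// combined with the particle's index within that atom.
fn particle_id(atom_id: &[u8; 32], index: u32) -> [u8; 32] {
    let mut hasher = Sha256::new();
    hasher.update(atom_id);
    hasher.update(index.to_be_bytes());
    hasher.finalize().into()
}

/// With a 2^256 shard space, the 256-bit particle identifier can serve
/// directly as the shard address, giving a one-to-one mapping.
fn shard_of(particle: [u8; 32]) -> [u8; 32] {
    particle
}
```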
Virtual Single-shard Nodes: To maximize resource usage, each machine can host multiple virtual nodes, with each node behaving as if it is the sole occupant.
Global Shard: There's a concept of a “global shard” where all nodes participate in consensus on special-purpose commands and store the global shard's state. This allows for network-wide functions to be performed, such as maintaining the current set of nodes and managing the state of staking.
Node Sets and Churn: The identity register, which is updated in the global shard, handles the dynamic entry and exit of nodes. The register uses epochs, globally known time periods during which the node set is fixed, to manage this.
Sybil Resistance: Delegated Proof of Stake (DPoS) will be employed to mitigate Sybil attacks. The voting power of each node will be relative to the amount of stake they have registered.
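A sketch of how epochs and stake-weighted voting power could fit together, using hypothetical types for illustration:

```rust
use std::collections::HashMap;

type ValidatorId = u64;

/// A globally known period during which the node set is fixed. The
/// identity register in the global shard produces one of these per epoch.
struct Epoch {
    number: u64,
    /// Registered stake per validator for this epoch.
    stakes: HashMap<ValidatorId, u128>,
}

impl Epoch {
    /// Under DPoS, a node's voting power is its share of the total
    /// registered stake for the epoch.
    fn voting_power(&self, v: ValidatorId) -> f64 {
        let total: u128 = self.stakes.values().sum();
        if total == 0 {
            return 0.0;
        }
        *self.stakes.get(&v).unwrap_or(&0) as f64 / total as f64
    }
}
```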
Leader DoS Resistance: Several mitigation strategies are proposed to prevent denial-of-service (DoS) attacks on future leaders, including sentry nodes, leaderless models, and random leader selection.
Client Command Requests: A reliable mechanism is needed to convey requests to leaders. One common choice is a broad sharing approach where a pool of pending requests is loosely synchronized.
Command Proposal Selection: Leaders randomly select client requests for proposal, with a bias toward the oldest unapproved requests. This helps ensure liveness (ongoing operation of the protocol) but doesn't affect safety (the correctness of the consensus protocol).
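One simple way to realize “random, biased toward the oldest” is to sample from the pending pool with weights that grow with request age, as in the sketch below using the rand crate; the specific weighting rule is an assumption.

```rust
use rand::distributions::WeightedIndex;
use rand::prelude::*;

/// A pending client request, tagged with the round in which it arrived.
struct PendingRequest {
    id: u64,
    arrived_round: u64,
}

/// Randomly pick a request to propose, weighting older requests more
/// heavily so that every request is eventually selected (liveness).
fn select_proposal(pool: &[PendingRequest], current_round: u64) -> &PendingRequest {
    let weights: Vec<u64> = pool
        .iter()
        .map(|r| 1 + current_round.saturating_sub(r.arrived_round)) // older => heavier
        .collect();
    let dist = WeightedIndex::new(&weights).expect("non-empty pool");
    let mut rng = rand::thread_rng();
    &pool[dist.sample(&mut rng)]
}
```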
These details present a comprehensive plan for the efficient and secure implementation of the Cerberus consensus protocol within the Radix network. By addressing issues such as Sybil attacks, DOS resistance, and maintaining a dynamic node set, the Radix team demonstrates a strong commitment to both performance and security.