Cerberus (Consensus Protocol)

Cerberus [ /’sɜrbʌrʌs/ ] is a family of deterministic, Byzantine Fault Tolerant (BFT) consensus protocols that use state provisioning to enable global consensus between shard groups on the correct ordering of software events.

History
Overview
Mechanism
1. Core Cerberus
2. Optimistic Cerberus
3. Pessimistic Cerberus
Performance
Other implementations of Cerberus
Resources

	DEVELOPMENT
Whitepapers	2020 Cerberus Consensus Protocol 2023 Cerberus Consensus Protocol
Peer Review	JSys
	LEDGER
Sybil Protection	Delegated Proof of Stake
State Model	Sharded
Fault Tolerance	Byzantine Fault Tolerant
Execution Environment	Radix Engine v2
Validator Node Cap	n/a

History

The Cerberus protocol was first described in a March 2020 paper by Florian Cäsar, Dan Hughes, Josh Primero and Stephen Thornton, and has been designed specifically for Radix’s multi-shard architecture. Cerberus represents the sixth iteration of the core technology underlying Radix. It addresses the "Weak Atom" problem that existed in the previous version of Radix's protocol, Tempo.

An evaluation of Cerberus was published in June 2023 in the Journal of Systems Research under the title ‘Cerberus: Minimalistic Multi-shard Byzantine-resilient Transaction Processing.’ The authors’ analysis highlighted the ability of Cerberus to minimize coordination costs for multi-shard consensus.

RDX Works - core developers of the Radix network - have signaled that a sharded version of Cerberus will be implemented in the Xi’an mainnet upgrade.

Overview

Cerberus is a consensus protocol optimized for Radix's multi-shard architecture. It aims to enable scalability, security, and decentralization.

Scalability is attained by sharding which allows parallel transaction processing across multiple shards. The cross-shard communication mechanism coordinates shards efficiently, providing linear scalability as more shards are added.
Security against Byzantine failures up to 1/3 of nodes per shard is ensured through BFT consensus within each shard. This provides resilience even under adverse conditions.
Decentralization is supported through permissionless participation of validators. The system remains secure as long as no single entity controls over 1/3 of validators in any one shard.

Cerberus partitions transaction processing and state management into groups called shards. Each shard runs a local Byzantine Fault Tolerant (BFT) consensus algorithm to order transactions and maintain local state. This sharding technique is necessary to overcome the scalability limitations of traditional fully replicated networks in which every node must process every transaction. However, multi-shard transactions that span across shards require atomic commitment protocols to maintain consistency.

A key innovation of Cerberus is its cross-shard communication mechanism that coordinates multi-shard transactions with low overhead. For this, Cerberus utilizes UTXO-based sharding. In the UTXO model, data must be destroyed when it is modified and recreated in a new state. This prevents double spending across shards, avoiding the need for global transaction ordering. By leveraging UTXO transaction semantics and transaction atomicity, Cerberus minimizes the coordination required for multi-shard commitment, making cross-shard ordering unnecessary. A key capability of Cerberus is the ability to process UTXO transactions concurrently within the same shard without explicit locking or blocking. This is accomplished through an efficient inter-shard communication method between disjoint validator sets. This allows Cerberus to reduce communication and computation costs compared to existing protocols such as AHL, ByShard, Caper, Chainspace, RingBFT, and SharPer. Parallel processing across shards enables linear scalability as shards increase.

At the execution layer, applications can be split by the Radix Engine into independent components compatible with Cerberus sharding.

Mechanism

Cerberus is an active multi-decree (3-phase commit) consensus mechanism, requiring a supermajority for state commitment and capable of processing transactions in parallel. By way of comparison, Bitcoin is a single decree mechanism that sacrifices deterministic finality for probabilistic finality with a simpler communication protocol.

Cerberus consists of three variants:

Core: The base variant, ideal for general use.
Optimistic: Optimized for scenarios with high trust.
Resilient or Pessimistic: Designed for use in less trustworthy environments.

1. Core Cerberus

Core Cerberus (CCerberus) is designed to take maximum advantage of the UTXO transaction properties to minimize coordination costs. CCerberus operates under the assumption that all clients are well-behaved and will never approve conflicting transactions.

The key insight in CCerberus is that the order of transaction execution does not matter for correctness as long as the atomicity of each transaction is ensured. CCerberus enforces transaction atomicity using a three-step approach:

Local inputs: Each shard checks if it has the necessary local inputs for a transaction.

When a client sends a transaction request to a shard, the shard first checks if it has all the necessary input objects specified in the transaction.
The shard uses a consensus protocol to create an ordered log of transactions. Consensus is necessary to get consistent results on object availability across replicas.
If the shard has all the specified input objects available, it adds the transaction to its local ordered log with a "pledge" to use those inputs if needed.

Cross-shard exchange: The shards exchange availability of local inputs via a single communication round.

The shard that received the client's transaction request broadcasts the transaction with its local input object status to all other shards involved in the transaction.
Cross-shard broadcast uses a "cluster sending" primitive that prevents equivocation and ensures reliable delivery.
The shard waits to receive input object status from all other relevant shards before proceeding.

Decide outcome: If all shards have indicated input availability, the transaction is committed, else aborted.

If all shards reported input object availability, the transaction is committed by constructing the output objects. Else it is aborted.
Shards defer the actual commit/abort until the consensus sequence order allows it.
A single consensus step within each shard is sufficient to totally order the commit or abort of the transaction.
Clients are notified of the commit or abort outcome from enough replicas to withstand Byzantine faults.

No coordination is required between shards to agree on transaction ordering. Each shard processes transactions independently as long as transaction atomicity is maintained.

The key to CCerberus's minimal coordination is relying on the transaction atomicity provided by the UTXO model. Safety is guaranteed for well-behaved clients even without inter-shard coordination on ordering. No cross-shard synchronization is needed on when to execute transactions. Each shard processes transactions independently and coordinates only to atomically commit transactions.

The cross-shard exchange is performed using a cluster sending primitive that prevents equivocation. The decision to commit or abort a transaction requires just a single consensus step in each involved shard. No coordination is required between shards to agree on transaction ordering.

This minimalist design allows CCerberus to scale linearly with low latency as the number of shards increases. However, CCerberus provides no liveness or safety guarantees for transactions issued by malicious clients. The simplicity of CCerberus highlights the benefits of specializing distributed ledger designs for UTXO workloads.

During testing, Core Cerberus was able to achieve extremely high throughput in the order of millions of transactions per second - 5-12x higher throughput than prior generalized techniques. This demonstrates the linear scalability potential of Cerberus' approach to cross-shard coordination.

2. Optimistic Cerberus

Optimistic Cerberus (OCerberus) aims to lift the assumption of only well-behaved clients made by CCerberus. OCerberus targets environments where faulty conditions are rare and optimizes for the normal case.

The key change in OCerberus is integrating the detection of conflicts due to concurrent transactions into a single multi-shard consensus step. This is done by having a shard only participate in the consensus if it receives evidence that other relevant shards are also participating.

In the optimistic case with no conflicts, OCerberus processes transactions using minimal coordination similar to CCerberus. Safety is still guaranteed even under malicious actions. However, progress can be disrupted by coordinated attacks that prevent consensus.

OCerberus incorporates intricate recovery mechanisms to handle such malicious scenarios. Both local view change per shard and a novel global view change process across shards are defined. Overall, OCerberus reduces the latency compared to CCerberus for common transactions by combining cross-shard communication within consensus. But it has higher costs for recovery from faulty conditions.

3. Pessimistic Cerberus

Pessimistic / Resilient Cerberus (PCerberus) takes a pessimistic approach and anticipates malicious behavior by adding coordination to the normal case operations. The goal of PCerberus is to handle faulty conditions including transactions from malicious clients gracefully without complicating the recovery process.

PCerberus enhances CCerberus with two key changes:

Objects are destructed after the local consensus step and before cross-shard exchange. This prevents equivocation and firmly locks objects to a transaction.
A second local consensus step is added after cross-shard exchange to totally order the commit operations in each shard.

By destructing objects early, safety is ensured even under concurrent transactions created by faulty clients. The second consensus step simplifies recovery by relegating conflict resolution to the local shard level. No intricate cross-shard coordination is needed during view change.

Overall, PCerberus adds more coordination in the common case compared to CCerberus and OCerberus but enables simpler reasoning about safety and liveness even under attack. The extra coordination phases trade off some efficiency for resilience. PCerberus highlights the spectrum of design choices possible when co-optimizing sharding techniques with application semantics.

Performance

Cerberus and its peer consensus protocols were subjected to a variety of rigorous tests, including scenarios with varying numbers of replicas per shard, objects per transaction, and number of shards. Tests also included situations with malicious nodes, gauging how the various protocols would perform under adverse conditions.

In these tests, all three flavors of Cerberus demonstrated superior efficiency, throughput, and latency compared to other state-of-the-art multi-shard consensus protocols. Notably, Cerberus exhibited linear scalability, achieving over ten million transactions per second on 1024 shards.

Throughput of three Cerberus protocols as a function of the number of shards.

Other implementations of Cerberus

In 2022, the Tari project announced a pivot from its original layer 2 data availability network (DAN) to the Cerberus consensus algorithm. The new DAN based on Cerberus will integrate with Tari's existing Mimblewimble layer 1 to provide a network with advantages over pure proof-of-stake systems.

Resources

Cerberus Infographic Series - Chapter I | The Radix Blog | Radix DLT

At Radix, our strategic vision has always been to build a single decentralized network that can scale DeFi seamlessly to every single person on the planet (even though we didn’t initially know that DeFi was going to be called DeFi).

www.radixdlt.com

Cerberus Infographic Series - Chapter I | The Radix Blog | Radix DLT