Sharding [ /ˈʃɑːdɪŋ/ ] is a method of partitioning a database horizontally across separate servers to improve scalability, performance and availability.
In distributed ledger technologies (DLTs) like Radix, sharding is used to allocate both data storage and transaction execution across a decentralized network of nodes, achieving high transactional capacity.
Sharding in Radix
Radix has developed an integrated sharding and consensus architecture specifically designed for hyper-scalability of its decentralized network. In Radix’s case, sharding applies to both data availability and transaction execution as both functions are performed by nodes.
Ledger Pre-Sharding
The current Radix Mainnet (Babylon) is pre-sharded into a fixed space of 2^256 shards, currently covered by a single shard group. This single-group cap will be lifted with Radix's forthcoming Xi'an release.
This is in contrast to the dynamic adaptive state sharding model adopted by Shardeum, MultiversX, and NEAR, where shards are added incrementally as required. While sharding can improve scalability, an ad hoc approach leads to substantial difficulties: any change to the shard structure requires reorganizing the entire network, a time-consuming and expensive process. The larger the sharded ledger grows, the more problematic this becomes. Ad hoc sharding also complicates queries and data lookups within the ledger. When data is sharded randomly, locating specific transactions or data points becomes much harder, since they could be stored anywhere; queries slow down as more extensive searches are required.
In Radix, the responsibility for validating shards is undertaken by groups of validators called shard groups, which may grow or shrink dynamically in response to load demand.
The Radix shard space is currently managed by a single shard group; once Xi'an lifts this cap, validators will be able to manage shards according to their capacity.
Deterministic Shard Indexing
Shards on Radix are indexed deterministically by public key: the shard index for any address can be computed by taking its public key modulo the size of the shard space.
By deterministically grouping related data into the same shard, Radix avoids the need for expensive data reorganization as the network grows. This creates four major advantages:
- Related data is grouped together - All transactions from a particular account are guaranteed to be in the same shard, which makes it trivial to identify attempted double-spends.
- Unrelated data is separated - Transactions from separate accounts will always involve separate shards, enabling fully asynchronous processing of unrelated transactions.
- Lookups are fast - Shard locations can be derived directly from public keys, reducing lookup complexity and query time.
- Data is distributed uniformly - Hash-based sharding typically yields an even distribution of data across the shard space.
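The deterministic indexing scheme described above can be sketched in Python. The constant names, the number of shard groups, and the use of a SHA-256 digest to stand in for an account's public key are assumptions made for illustration, not Radix's actual address derivation:

```python
import hashlib

# Illustrative sketch only: the group count and key derivation below are
# assumptions, not Radix's actual implementation.
SHARD_SPACE = 2**256   # the fixed, pre-sharded shard space
SHARD_GROUPS = 4       # hypothetical number of shard groups

def shard_index(public_key: bytes) -> int:
    """Deterministically map a public key to its shard index."""
    return int.from_bytes(public_key, "big") % SHARD_SPACE

def shard_group(public_key: bytes) -> int:
    """Map a shard index onto one of the shard groups covering the space."""
    return shard_index(public_key) % SHARD_GROUPS

# All transactions from one account always resolve to the same shard,
# so an attempted double-spend is visible within a single shard.
alice = hashlib.sha256(b"alice's public key").digest()
assert shard_index(alice) == shard_index(alice)
```

Because the index is a pure function of the public key, any node can locate an account's shard with no directory lookup or network round trip.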
Asynchronous Parallel Execution
With a fixed, deterministic shard space, unrelated transactions on Radix are guaranteed to be processed asynchronously in separate shards. This allows Radix to scale transaction throughput linearly by simply increasing the number of nodes, reducing the shard coverage of each one.
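A minimal sketch of this idea in Python, where each shard's transaction queue is processed independently and unrelated shards run in parallel. The shard function and the (account, delta) transaction shape are hypothetical stand-ins for illustration:

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def shard_of(account: str, num_shards: int = 8) -> int:
    """Stand-in for public-key-based shard indexing (hypothetical)."""
    return sum(account.encode()) % num_shards  # deterministic mapping

def process_shard(queue):
    """Apply one shard's transactions sequentially."""
    balances = defaultdict(int)
    for account, delta in queue:
        balances[account] += delta
    return dict(balances)

def execute(txs):
    # Transactions on unrelated accounts fall into separate shards,
    # so their queues can be processed fully asynchronously.
    by_shard = defaultdict(list)
    for account, delta in txs:
        by_shard[shard_of(account)].append((account, delta))
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(process_shard, by_shard.values()))
    merged = {}
    for shard_balances in results:
        merged.update(shard_balances)
    return merged
```

Because each account maps to exactly one shard, no two workers ever touch the same account, which is what makes the parallelism safe without cross-shard locking.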
Network Security
A key challenge in sharding distributed ledgers is ensuring sufficient security and node coverage across all shards. If some shards have far fewer nodes than others, vulnerabilities arise. Radix employs several techniques to maintain security across its sharded network:
- Node Identity Shard Mapping: To secure the network, validator node addresses are mapped to a single ‘root’ shard. Nodes must permanently maintain their root shards, but can support additional shards to earn more transaction fees. Underserved shards offer higher returns, attracting more validators and preventing any shards from being overlooked. This free market approach maintains security even as the network scales.
- Incentives for Multi-Shard Validation: Based on factors like computing resources, validators can choose to support additional shards beyond their root shard. The more shards a node supports, the greater the amount of transaction fees it can earn. This creates an incentive for validators to support as many shards as feasible to maximize profits. In this way, the overall validation workload is distributed across nodes.
- Dynamic Shard Support via Free Market: As the network grows, some shards may end up with fewer nodes supporting them compared to other oversubscribed shards. These underserved shards then inherently offer higher potential returns since there is less competition for fees. The higher relative profits attract more validators to begin supporting the underserved shards. This brings coverage back into equilibrium across shards through a free market approach.
- Scaling Security Through Staking: In proof-of-stake networks like Radix, staking provides additional security. The more tokens a validator stakes, the more shards it can validate, allowing validation load to scale up securely. High-stake validators may validate transactions across many shards in parallel for efficiency, while low-stake nodes still play a key role in providing decentralized shard coverage.
Together, these mechanisms ensure Radix can securely scale across its vast shard space without running into coverage gaps or centralization issues. The network organically self-regulates to distribute validation across shards.
Cerberus Consensus
Main article: Cerberus
Radix's Cerberus consensus protocol introduces ‘braided’ sharding to atomically compose transactions across shards. Cerberus shards transaction validation while braiding validation across shards to enforce system-wide transaction ordering and prevent double-spending. This unique braided architecture ensures that Radix can securely scale transaction throughput across a sharded network of effectively unlimited size.
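The atomic cross-shard guarantee can be illustrated with a highly simplified two-phase-commit-style sketch. This is not the Cerberus protocol itself (Cerberus runs consensus within each shard and braids those instances together, as described above); it only shows why every involved shard must agree before any of them commits:

```python
# Simplified illustration of atomic cross-shard commitment. The real
# Cerberus protocol braids per-shard consensus instances; everything
# below is a hypothetical sketch in the spirit of two-phase commit.
class Shard:
    def __init__(self):
        self.spent = set()

    def prepare(self, utxo) -> bool:
        """Vote to commit only if this input has not already been spent."""
        return utxo not in self.spent

    def commit(self, utxo):
        self.spent.add(utxo)

def cross_shard_tx(shards, inputs):
    """Atomically spend inputs that live on different shards."""
    involved = [(shards[s], utxo) for s, utxo in inputs]
    # Phase 1: every involved shard must vote yes.
    if not all(shard.prepare(utxo) for shard, utxo in involved):
        return False  # any single rejection aborts the whole transaction
    # Phase 2: all shards commit together, so the spend is atomic.
    for shard, utxo in involved:
        shard.commit(utxo)
    return True
```

The key property, which Cerberus provides with Byzantine fault tolerance rather than this naive coordinator, is that a transaction either commits on all of its shards or on none of them, ruling out cross-shard double-spends.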
Advantages
The key advantage of sharding is that it enables parallel transaction processing and storage, significantly increasing throughput. By sharding the ledger and workload, each node only needs to maintain a subset of the overall data while still contributing to the security and integrity of the full ledger. However, sharding done through an ad hoc approach leads to difficulties. Later changes to the shard structure would require full network reorganization. Query complexity also increases with ad hoc sharding as locating data becomes more difficult. These challenges make ad hoc sharding unsuitable for distributed ledgers as they grow to Internet scale.
By breaking up large datasets into smaller, more manageable pieces, network nodes are better able to search through and retrieve individual data points as well as handle concurrent requests. This approach is especially beneficial when unsharded systems grow excessively large, resulting in performance degradation. By implementing sharding, data storage and processing tasks are distributed across multiple computers, enhancing the system's scalability.
Sharding is commonly used in distributed databases, where it allows for the efficient storage and retrieval of large amounts of data across multiple nodes. By dividing the data into shards and distributing them across multiple nodes, a sharded database can support more concurrent requests and handle larger volumes of data without slowing down or becoming overloaded.
In addition to improving scalability and performance, sharding can also help to improve the availability and reliability of a distributed system. By storing data on multiple nodes, a sharded system can continue to function even if one or more nodes fail, ensuring that the data remains accessible and that the system can continue to serve requests.
There are several different approaches to sharding, each with its own tradeoffs and benefits. Some common sharding strategies include range-based sharding, which divides data into shards based on a key value or range; hash-based sharding, which uses a hash function to distribute data across shards; and directory-based sharding, which uses a lookup table to determine which shard a piece of data belongs to.
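The three strategies can be sketched side by side in Python. The boundary values, shard count, and directory entries below are hypothetical examples, not drawn from any particular system:

```python
import hashlib
from bisect import bisect_right

NUM_SHARDS = 4

def range_shard(key: str, boundaries=("g", "n", "t")) -> int:
    """Range-based: ordered boundary values split the key space."""
    return bisect_right(boundaries, key[0])

def hash_shard(key: str) -> int:
    """Hash-based: a hash function spreads keys uniformly across shards."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest, "big") % NUM_SHARDS

DIRECTORY = {"alice": 2, "bob": 0}  # hypothetical lookup table

def directory_shard(key: str) -> int:
    """Directory-based: an explicit table records each key's shard."""
    return DIRECTORY[key]
```

Range sharding keeps adjacent keys together but risks hot spots; hash sharding distributes load evenly but scatters related keys; directory sharding is flexible but makes every lookup depend on the table.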
Disadvantages
Sharding introduces unique challenges and complexities, such as determining the optimal strategy for partitioning data to ensure efficient operations.
Within distributed ledgers, the central complexity of sharding is guaranteeing that each transaction is executed exactly once across shards. Because ledger shards are spatially distributed, instantaneous consistency between them is unattainable. The CAP theorem states that a distributed data store cannot simultaneously guarantee consistency, availability, and partition tolerance.