Sharding [ /ˈʃɑːdɪŋ/ ] is a method of partitioning a database horizontally across separate servers to improve scalability, performance and availability.
In distributed ledger technologies (DLTs) like Radix, sharding is used to allocate both data storage and transaction execution across a decentralized network of nodes, achieving high transactional capacity.
Sharding in Radix
Radix has developed an integrated sharding and consensus architecture specifically designed for hyper-scalability of its decentralized network. In Radix’s case, sharding applies to both data availability and transaction execution as both functions are performed by nodes.
Ledger Pre-Sharding
The current Radix Mainnet (Babylon) is pre-sharded into a fixed space of 2^256 shards, currently covered by a single shard group. This single-group cap will be lifted with Radix's forthcoming Xi'an release.
This is in contrast to the dynamic adaptive state sharding model adopted by Shardeum, MultiversX, and NEAR, where shards are added incrementally as required. While sharding can improve scalability, an ad hoc approach leads to substantial difficulties: any change to the shard structure requires reorganizing the entire network, a time-consuming and expensive process. The larger the sharded ledger grows, the more problematic this becomes. Ad hoc sharding also complicates queries and data lookups within the ledger. When data is sharded randomly, locating specific transactions or data points becomes much harder, since they could be stored anywhere; queries slow down as more extensive searches are required.
In Radix, the responsibility for validating shards is undertaken by groups of validators called shard groups, which may grow or shrink dynamically in response to load demand.
The Radix shard space is currently managed by a single shard group; once Xi'an lifts this cap, validators will be able to manage shards according to their capacity.
Deterministic Shard Indexing
Shards on Radix are indexed deterministically by public key: the shard index for any address can be computed by taking its public key modulo the size of the shard space.
By deterministically grouping related data into the same shard, Radix avoids the need for expensive data reorganization as the network grows. This creates four major advantages:
- Related data is grouped together - All transactions from a particular account are guaranteed to be in the same shard, which makes it trivial to identify attempted double-spends.
- Unrelated data is separated - Transactions from separate accounts will always involve separate shards, enabling fully asynchronous processing of unrelated transactions.
- Lookups are fast - Shard locations can be derived directly from public keys, reducing lookup complexity and query time.
- Data is distributed uniformly - Hash-based sharding typically yields an even distribution of data across the shard space.
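The deterministic indexing scheme described above can be sketched in Python. The constant names, the number of shard groups, and the use of a SHA-256 digest to stand in for an account's public key are assumptions made for illustration, not Radix's actual address derivation:

```python
import hashlib

# Illustrative sketch only: the group count and key derivation below are
# assumptions, not Radix's actual implementation.
SHARD_SPACE = 2**256   # the fixed, pre-sharded shard space
SHARD_GROUPS = 4       # hypothetical number of shard groups

def shard_index(public_key: bytes) -> int:
    """Deterministically map a public key to its shard index."""
    return int.from_bytes(public_key, "big") % SHARD_SPACE

def shard_group(public_key: bytes) -> int:
    """Map a shard index onto one of the shard groups covering the space."""
    return shard_index(public_key) % SHARD_GROUPS

# All transactions from one account always resolve to the same shard,
# so an attempted double-spend is visible within a single shard.
alice = hashlib.sha256(b"alice's public key").digest()
assert shard_index(alice) == shard_index(alice)
```

Because the index is a pure function of the public key, any node can locate an account's shard with no directory lookup or network round trip.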
Asynchronous Parallel Execution
With a fixed, deterministic shard space, unrelated transactions on Radix are guaranteed to be processed asynchronously in separate shards. This allows Radix to scale transaction throughput linearly by simply increasing the number of nodes, reducing the shard coverage of each one.
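A minimal sketch of this idea in Python, where each shard's transaction queue is processed independently and unrelated shards run in parallel. The shard function and the (account, delta) transaction shape are hypothetical stand-ins for illustration:

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def shard_of(account: str, num_shards: int = 8) -> int:
    """Stand-in for public-key-based shard indexing (hypothetical)."""
    return sum(account.encode()) % num_shards  # deterministic mapping

def process_shard(queue):
    """Apply one shard's transactions sequentially."""
    balances = defaultdict(int)
    for account, delta in queue:
        balances[account] += delta
    return dict(balances)

def execute(txs):
    # Transactions on unrelated accounts fall into separate shards,
    # so their queues can be processed fully asynchronously.
    by_shard = defaultdict(list)
    for account, delta in txs:
        by_shard[shard_of(account)].append((account, delta))
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(process_shard, by_shard.values()))
    merged = {}
    for shard_balances in results:
        merged.update(shard_balances)
    return merged
```

Because each account maps to exactly one shard, no two workers ever touch the same account, which is what makes the parallelism safe without cross-shard locking.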
Network Security
A key challenge in sharding distributed ledgers is ensuring sufficient security and node coverage across all shards. If some shards have far fewer nodes than others, vulnerabilities arise. Radix employs several techniques to maintain security across its sharded network:
- Node Identity Shard Mapping: To secure the network, validator node addresses are mapped to a single ‘root’ shard. Nodes must permanently maintain their root shards, but can support additional shards to earn more transaction fees. Underserved shards offer higher returns, attracting more validators and preventing any shards from being overlooked. This free market approach maintains security even as the network scales.
- Incentives for Multi-Shard Validation: Based on factors like computing resources, validators can choose to support additional shards beyond their root shard. The more shards a node supports, the greater the amount of transaction fees it can earn. This creates an incentive for validators to support as many shards as feasible to maximize profits. In this way, the overall validation workload is distributed across nodes.
- Dynamic Shard Support via Free Market: As the network grows, some shards may end up with fewer nodes supporting them compared to other oversubscribed shards. These underserved shards then inherently offer higher potential returns since there is less competition for fees. The higher relative profits attract more validators to begin supporting the underserved shards. This brings coverage back into equilibrium across shards through a free market approach.
- Scaling Security Through Staking: In proof-of-stake networks like Radix, staking provides additional security. The more tokens a validator stakes, the more shards it can validate, allowing validation load to scale up securely. High-stake validators may validate transactions across many shards in parallel for efficiency, while low-stake nodes still play a key role in providing decentralized shard coverage.
Together, these mechanisms ensure Radix can securely scale across its vast shard space without running into coverage gaps or centralization issues. The network organically self-regulates to distribute validation across shards.
Cerberus Consensus
Main article: Cerberus
Radix's Cerberus consensus protocol introduces ‘braided’ sharding to atomically compose transactions across shards. Cerberus shards transaction validation while braiding validation across shards to enforce system-wide transaction ordering and prevent double-spending. This unique braided architecture ensures that Radix can securely scale transaction throughput across a sharded network of effectively unlimited size.
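The atomic cross-shard guarantee can be illustrated with a highly simplified two-phase-commit-style sketch. This is not the Cerberus protocol itself (Cerberus runs consensus within each shard and braids those instances together, as described above); it only shows why every involved shard must agree before any of them commits:

```python
# Simplified illustration of atomic cross-shard commitment. The real
# Cerberus protocol braids per-shard consensus instances; everything
# below is a hypothetical sketch in the spirit of two-phase commit.
class Shard:
    def __init__(self):
        self.spent = set()

    def prepare(self, utxo) -> bool:
        """Vote to commit only if this input has not already been spent."""
        return utxo not in self.spent

    def commit(self, utxo):
        self.spent.add(utxo)

def cross_shard_tx(shards, inputs):
    """Atomically spend inputs that live on different shards."""
    involved = [(shards[s], utxo) for s, utxo in inputs]
    # Phase 1: every involved shard must vote yes.
    if not all(shard.prepare(utxo) for shard, utxo in involved):
        return False  # any single rejection aborts the whole transaction
    # Phase 2: all shards commit together, so the spend is atomic.
    for shard, utxo in involved:
        shard.commit(utxo)
    return True
```

The key property, which Cerberus provides with Byzantine fault tolerance rather than this naive coordinator, is that a transaction either commits on all of its shards or on none of them, ruling out cross-shard double-spends.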
Advantages
The key advantage of sharding is that it enables parallel transaction processing and storage, significantly increasing throughput. By sharding the ledger and workload, each node only needs to maintain a subset of the overall data while still contributing to the security and integrity of the full ledger. However, sharding done through an ad hoc approach leads to difficulties. Later changes to the shard structure would require full network reorganization. Query complexity also increases with ad hoc sharding as locating data becomes more difficult. These challenges make ad hoc sharding unsuitable for distributed ledgers as they grow to Internet scale.
By breaking up large datasets into smaller, more manageable pieces, network nodes are better able to search through and retrieve individual data points as well as handle concurrent requests. This approach is especially beneficial when unsharded systems grow excessively large, resulting in performance degradation. By implementing sharding, data storage and processing tasks are distributed across multiple computers, enhancing the system's scalability.
Sharding is commonly used in distributed databases, where it allows for the efficient storage and retrieval of large amounts of data across multiple nodes. By dividing the data into shards and distributing them across multiple nodes, a sharded database can support more concurrent requests and handle larger volumes of data without slowing down or becoming overloaded.
In addition to improving scalability and performance, sharding can also help to improve the availability and reliability of a distributed system. By storing data on multiple nodes, a sharded system can continue to function even if one or more nodes fail, ensuring that the data remains accessible and that the system can continue to serve requests.
There are several different approaches to sharding, each with its own tradeoffs and benefits. Some common sharding strategies include range-based sharding, which divides data into shards based on a key value or range; hash-based sharding, which uses a hash function to distribute data across shards; and directory-based sharding, which uses a lookup table to determine which shard a piece of data belongs to.
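The three strategies can be sketched side by side in Python. The boundary values, shard count, and directory entries below are hypothetical examples, not drawn from any particular system:

```python
import hashlib
from bisect import bisect_right

NUM_SHARDS = 4

def range_shard(key: str, boundaries=("g", "n", "t")) -> int:
    """Range-based: ordered boundary values split the key space."""
    return bisect_right(boundaries, key[0])

def hash_shard(key: str) -> int:
    """Hash-based: a hash function spreads keys uniformly across shards."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest, "big") % NUM_SHARDS

DIRECTORY = {"alice": 2, "bob": 0}  # hypothetical lookup table

def directory_shard(key: str) -> int:
    """Directory-based: an explicit table records each key's shard."""
    return DIRECTORY[key]
```

Range sharding keeps adjacent keys together but risks hot spots; hash sharding distributes load evenly but scatters related keys; directory sharding is flexible but makes every lookup depend on the table.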
Disadvantages
Sharding introduces unique challenges and complexities, such as determining the optimal strategy for partitioning data to ensure efficient operations.
Within distributed ledgers, the central complexity of sharding is guaranteeing that each transaction is executed exactly once across shards. Because ledger shards are spatially distributed, instantaneous consistency between them is unattainable. The CAP theorem states that a distributed data store cannot simultaneously guarantee consistency, availability, and partition tolerance.