When you send a crypto transaction, you assume it’s recorded forever. But how do you know it actually was? In monolithic blockchains like Bitcoin or early Ethereum, every node downloads and checks every single transaction. That works fine when the network is small. But as usage grows, this model breaks. Nodes need more storage, more bandwidth, more power. Most users can’t run them anymore. And that’s exactly when data availability layers became essential.
A data availability layer (DAL) is a dedicated part of a modular blockchain that only does one thing: makes sure transaction data is published and accessible. It doesn’t process transactions. It doesn’t run smart contracts. It doesn’t settle payments. It just stores the raw data - the list of who sent what to whom - and proves it’s there.
Think of it like a public bulletin board. Every morning, a new notice is posted with all the transactions from the previous day. A data availability layer ensures that notice is physically posted, not hidden, and that anyone can check it’s real without reading every single line. That’s the core idea behind data availability sampling.
Before DALs, rollups - the main scaling solution for Ethereum - had a problem. They processed thousands of transactions off-chain but had to post all the data back on Ethereum. That got expensive. Gas fees spiked. The solution? Move the data storage off the main chain and into a layer designed just for that job.
The magic behind data availability layers isn’t complex cryptography - it’s smart sampling. Instead of downloading a whole block of data (which could be megabytes), your phone or laptop downloads just 30 to 40 random pieces of it. If those pieces check out, you can be 99.9% confident the full block is available.
This works because of erasure coding. Before data is posted, it’s expanded with redundant pieces using Reed-Solomon codes, so that it can be rebuilt from only part of what was published. If you lose half the data, you can still rebuild the whole thing from the other half. So even if some nodes go offline or try to hide data, as long as enough fragments are out there, the data can be recovered.
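To make the "rebuild it from the other half" idea concrete, here is a minimal sketch of a Reed-Solomon-style erasure code in Python. It is a toy, not what any production network runs: the field size (257), the function names, and the choice to double the data are all assumptions made for illustration. The data is treated as points on a polynomial, extra points are published, and any half of the points recovers the original.

```python
# Toy Reed-Solomon-style erasure code over a small prime field (illustrative only).
P = 257  # prime modulus (assumption); every byte value 0..255 fits in the field


def _interpolate_at(points, x):
    """Evaluate at x the unique polynomial through `points` (Lagrange form, mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data):
    """Extend k data symbols to 2k shares; the first k shares are the data itself."""
    k = len(data)
    if 2 * k > P:
        raise ValueError("too many symbols for this toy field")
    points = list(enumerate(data))  # data[i] is the polynomial's value at x = i
    return [(x, _interpolate_at(points, x)) for x in range(2 * k)]


def decode(shares, k):
    """Rebuild the k original symbols from any k surviving (x, y) shares."""
    if len(shares) < k:
        raise ValueError("not enough shares to reconstruct")
    points = shares[:k]
    return [_interpolate_at(points, x) for x in range(k)]


data = list(b"rollup")
shares = encode(data)
survivors = shares[1::2]  # pretend every other share was lost or withheld
recovered = bytes(decode(survivors, len(data)))
```

Here `survivors` holds exactly half of the published shares, and `recovered` equals the original bytes. Real systems use much larger fields and faster algorithms, but the recovery property is the same.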
KZG polynomial commitments make this even more efficient. They let nodes prove, in just a few hundred bytes, that the data was correctly encoded. No need to recompute everything. Just verify the proof. That’s how light clients - devices with limited storage - can participate securely.
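KZG itself needs pairing-friendly elliptic curves, which is too much for a short example. As a simpler stand-in, the sketch below uses a Merkle tree instead: it shows how a light client can check one sampled chunk against a small commitment without holding the rest of the block. This is a swapped-in technique, not KZG. Unlike KZG, a Merkle proof grows with the size of the block and says nothing about whether the data was correctly encoded. All function names are hypothetical.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proofs(chunks):
    """Return (root, proofs); proofs[i] is the list of sibling hashes for chunks[i].

    Assumes len(chunks) is a power of two to keep the sketch short.
    """
    level = [_h(c) for c in chunks]
    proofs = [[] for _ in chunks]
    positions = list(range(len(chunks)))  # where each chunk's ancestor sits in `level`
    while len(level) > 1:
        for i, pos in enumerate(positions):
            proofs[i].append(level[pos ^ 1])  # the sibling differs in the lowest bit
            positions[i] = pos // 2
        level = [_h(level[j] + level[j + 1]) for j in range(0, len(level), 2)]
    return level[0], proofs


def verify_chunk(root, chunk, index, proof):
    """Recompute the path from one chunk up to the root and compare."""
    node = _h(chunk)
    for sibling in proof:
        node = _h(node + sibling) if index % 2 == 0 else _h(sibling + node)
        index //= 2
    return node == root


chunks = [b"tx-a", b"tx-b", b"tx-c", b"tx-d"]
root, proofs = merkle_root_and_proofs(chunks)
ok = verify_chunk(root, b"tx-c", 2, proofs[2])          # genuine chunk
forged = verify_chunk(root, b"tx-evil", 2, proofs[2])   # tampered chunk
```

The light client only needs the 32-byte `root`, the sampled chunk, and its short proof. A genuine chunk verifies and a tampered one does not.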
Before this, the only way to trust data was to download it all. Now, you can trust it by sampling a tiny fraction. That’s the breakthrough.
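The 30-to-40-sample figure can be sanity-checked with a few lines of Python. This is a simplified model, not any network's exact parameters: it assumes each sample is independent and uniformly random, and that an attacker has to withhold at least some fixed fraction of the published chunks to make the block unrecoverable. A fraction of 0.5 follows from the "any half rebuilds the whole" property described above; 0.25 is a more pessimistic value included as an assumption for comparison.

```python
import math


def detection_confidence(samples: int, withheld_fraction: float) -> float:
    """Chance that at least one of `samples` random samples lands on a withheld chunk."""
    return 1.0 - (1.0 - withheld_fraction) ** samples


def samples_needed(confidence: float, withheld_fraction: float) -> int:
    """Smallest number of samples that reaches the target confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - withheld_fraction))


for fraction in (0.5, 0.25):
    print(
        f"withheld >= {fraction:.0%}: "
        f"30 samples give {detection_confidence(30, fraction):.6f}, "
        f"99.9% needs {samples_needed(0.999, fraction)} samples"
    )
```

Under either assumption, 30 samples clear the 99.9% bar, and every extra sample shrinks the remaining risk by a constant factor. That is why the number of samples stays small no matter how large the block gets.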
Celestia launched its testnet in September 2021 as the first blockchain built solely for data availability. No execution layer. No smart contracts. Just data. And it works.
As of Q3 2023, Celestia handles about 1.25 MB per block. That’s over 10 times more than Ethereum’s current 90 KB limit. Its light nodes need only 1-2 GB of storage. Compare that to Ethereum full nodes, which now require over 1.2 TB. Celestia’s network achieved 99.98% uptime in 2023. Its throughput? 300-500 transactions per second. Not fast by traditional standards, but perfectly tuned for its job.
Developers building rollups on Celestia report 87% lower costs. But there’s a catch: tooling is still immature. Only 15 active rollups were live on Celestia by November 2023. Most developers still use Ethereum. Why? Because Ethereum has the users, the liquidity, the wallets. Celestia is a powerful engine - but it needs more cars.
Ethereum didn’t build a separate DAL. It’s upgrading its own chain. The plan, called proto-danksharding (EIP-4844), launches in Q2 2024. Instead of storing data directly in blocks, Ethereum will use ‘blobs’ - temporary, low-cost data containers.
These blobs won’t be readable by smart contracts. They’re just for data. That’s intentional. It keeps the execution layer simple. Ethereum will still handle consensus and settlement, but data storage becomes cheaper and faster. The Ethereum Foundation estimates this will cut rollup transaction costs by about 90%.
But it’s not simple. Integrating KZG commitments into Ethereum’s existing codebase has been a nightmare. As of November 2023, there were 123 open GitHub issues on the danksharding implementation. The delay from Q4 2023 to Q2 2024 shows how hard this is.
Still, the payoff is huge. Once live, Ethereum will handle up to 1.31 MB per block - enough to support 100,000 transactions per second. That’s not just scaling. It’s a new class of blockchain.
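One way to read a throughput claim like this is to ask how much posted data each transaction would get. The back-of-envelope sketch below does that arithmetic. The 12-second block time is Ethereum's slot time, which the article does not state, and treating 1 MB as 1,000,000 bytes is an assumption; the function name is hypothetical.

```python
def bytes_per_transaction(block_bytes: float, block_time_s: float, tps: float) -> float:
    """Average data budget per transaction for a given block size, block time, and TPS."""
    return block_bytes / (block_time_s * tps)


# Figures from the text: 1.31 MB per block and 100,000 transactions per second.
budget = bytes_per_transaction(block_bytes=1_310_000, block_time_s=12, tps=100_000)
print(f"{budget:.2f} bytes of block data per transaction")
```

Under these assumptions the budget comes out to roughly one byte per transaction, so how close real rollups get to the headline number depends heavily on how compactly they compress and batch what they post.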
Not everyone wants to use Celestia or wait for Ethereum’s upgrade. Enter EigenDA and data availability committees (DACs).
EigenDA, built on EigenLayer, uses Ethereum’s existing security but stores data off-chain. In testnet benchmarks, it handled 100,000 transactions per second with costs as low as $0.0001 per transaction. That’s 10,000 times cheaper than Ethereum’s mainnet gas fees. But it’s still in testnet. Mainnet launch has been delayed, causing frustration for enterprise users.
DACs, like the ones used by StarkWare’s StarkEx, rely on trusted groups of validators to attest that data is available. It’s faster and cheaper than on-chain storage, but it’s not fully trustless. If the committee colludes, you’re vulnerable. That’s why DACs are used for enterprise apps - where trust is managed contractually - not for open DeFi.
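The trust trade-off is easier to see in code. Below is a minimal sketch of a committee check, assuming a simple rule where data counts as available once a threshold of known members attest to the same hash. It is not StarkEx's actual protocol: real committees sign their attestations, and the member names, threshold, and function name here are all hypothetical.

```python
import hashlib


def data_is_attested(data: bytes, attestations: dict, committee: set, threshold: int) -> bool:
    """Accept the data if enough committee members attest to its SHA-256 digest.

    `attestations` maps a member name to the hex digest that member claims to hold.
    Signatures are omitted in this sketch, so it trusts the mapping as given.
    """
    digest = hashlib.sha256(data).hexdigest()
    signers = {m for m, d in attestations.items() if m in committee and d == digest}
    return len(signers) >= threshold


batch = b"batch-42"
committee = {"alice", "bob", "carol"}
good = hashlib.sha256(batch).hexdigest()

attestations = {"alice": good, "bob": good, "mallory": good}
accepted = data_is_attested(batch, attestations, committee, threshold=2)
```

Nothing in this check proves the members actually serve the data afterwards. If enough of them attest and then withhold it, the check still passes, which is exactly the collusion risk described above.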
Monolithic chains like Solana try to do everything fast. But they crash under pressure. Solana had seven major outages in 2022. Why? Because one component failing - say, the execution engine - brings down the whole chain.
Modular blockchains fix that. If your data layer goes down, your execution layer can pause and wait. If your execution layer has a bug, your data layer stays intact. That’s resilience.
And it’s growing fast. Messari reported that investment in data availability layers jumped from $25 million in 2021 to $420 million in 2022. By 2027, the market could hit $8.7 billion. Celestia holds 35% of the dedicated DAL market. Avail, Polygon’s modular layer, is targeting enterprise adoption. Coinbase and Binance have invested hundreds of millions into this space.
Despite the progress, real-world adoption is slow. Developers say the biggest hurdle isn’t tech - it’s tooling. Most Ethereum devs know Solidity and EVM. Celestia uses Cosmos SDK. Only 12% of developers are comfortable with it. Documentation is better than it was, but still uneven. Celestia’s docs got 4.2/5 stars. EigenDA’s got 4.7/5. But neither is as polished as Ethereum’s.
Interoperability is another issue. Can a rollup on Celestia talk to one on EigenDA? Not yet. The Interchain Foundation just funded a $5 million project to fix that. Without standards, we’ll end up with isolated islands of data.
Regulation is coming too. The EU’s MiCA framework, effective December 2024, will require all blockchain transactions to have verifiable data availability. That’s not a suggestion - it’s a legal requirement. DALs aren’t just a technical upgrade anymore. They’re compliance infrastructure.
If you’re a developer building a dApp, you need to understand DALs. Your users will demand lower fees. Your app won’t scale without them.
If you’re an investor, DALs are the infrastructure layer behind the next wave of crypto growth. The companies building them aren’t just startups - they’re the new foundations of the web.
If you’re just a user, you won’t see the data layer. But you’ll feel its effects: faster transactions, cheaper fees, fewer crashes. That’s the real win.
Data availability layers aren’t flashy. They don’t have NFTs or meme coins. But they’re the quiet backbone making blockchains actually usable at scale. And that’s the most important upgrade crypto has seen in years.
The main purpose of a data availability layer is to ensure that all transaction data from a blockchain is published and accessible to network participants without requiring every node to download and verify the entire dataset. It solves the data availability problem by allowing light clients to verify data integrity through sampling, enabling scalable rollups and modular blockchain architectures.
Data availability sampling lets light clients check if transaction data is available by randomly downloading 30-40 small pieces of a block. Using erasure coding, if those samples are valid, there’s a 99.9% chance the full block is intact. This removes the need to download megabytes of data, making blockchain participation feasible on phones and low-power devices.
Celestia is a dedicated, standalone blockchain built only for data availability. It doesn’t execute transactions or run smart contracts. Ethereum, on the other hand, is adding data availability as a feature within its main chain using proto-danksharding (EIP-4844), which introduces temporary data blobs. Celestia offers higher throughput and lower node requirements, while Ethereum benefits from its existing security and user base.
Rollups process transactions off-chain to save costs and increase speed. But they must post transaction data back on-chain to ensure security. Without a dedicated data availability layer, this data competes with other transactions for space on Ethereum, driving up gas fees. DALs provide cheap, high-capacity storage just for this data, making rollups viable at scale.
Data availability layers are secure when properly implemented. Security relies on cryptographic proofs (KZG commitments) and statistical sampling. Research from the University of Illinois confirmed the mathematical soundness of data availability sampling. However, security depends on parameters like sample size and network conditions. If too few samples are taken, or if a large portion of the network is compromised, the system could be vulnerable - which is why careful design is critical.
The biggest challenges are immature tooling, lack of developer familiarity with non-EVM stacks (like Cosmos SDK), and poor interoperability between different DALs. While technical performance is strong, adoption is held back by complexity and fragmentation. Regulatory pressure from MiCA may help push standardization, but widespread use still requires better documentation, easier SDKs, and cross-chain bridges.
Tejas Kansara
25 11 25 / 13:25 PM
This is the quiet revolution no one talks about but everyone uses.
John Borwick
25 11 25 / 13:50 PM
Been running a light node on Celestia for months now. My old laptop doesn't cry anymore when syncing. The only thing missing is better docs and a wallet that doesn't feel like it's from 2017. Still, this is how it should've always been.