Plot compression is here

  • November 21, 2023

Summary

  • Plot compression is now available from multiple parties. Chia will be releasing support through the beta program Soon™.
  • Chia designed the proof of space so that it is not infinitely compressible.
  • Plot compression works through a “time-space” tradeoff: parts of the plot are removed and then recreated on the fly during farming.
  • By design, Chia is not going to turn into PoW due to diminishing returns. Plot size decrease is linear, and power increase is exponential!
  • Farmers can expect a small increase in energy use relative to their chosen compression level.

We’ll soon release a new version of the Bladebit plotter in the beta program, which enables plot compression. Special thanks to Harold Brenes at Chia for spending a lot of late nights developing this. The new compressed plots will contain the same number of proofs as your current plots, but they will require less space. How much less? That depends on the compression level, but the savings will be somewhere between 15% and 30%. This compression is possible due to tradeoffs between space and computation.

Plot compression is – and always will be – optional. The decision of whether to use compressed plots and the level of compression you should use comes down to several factors, which we will discuss later in this post.

You will see right away that the returns diminish as the compression level rises, but first, let’s step back and review how plots work in PoST.

PoW vs. PoST

PoW (Proof of Work) is an algorithm to secure blockchains like Bitcoin. In PoW, computers called miners continuously generate random numbers called hashes. These hashes act like lottery tickets. Once a winning lottery ticket is found, the miner creates a new block and collects a reward, and the process starts over again. Proof of Work was the first Nakamoto Consensus, which, among other things, makes it easy to validate that a given blockchain is the canonical one.

PoW is energy-intensive because each participating miner must continuously generate hashes to maximize their chances of winning. Of course, most hashes don’t win. They end up like losing lottery tickets – discarded, never to be seen again.

PoST (Proof of Space and Time) is the second – and only other – Nakamoto Consensus. Like PoW, it uses hashes that function like lottery tickets. The primary difference is that in PoST, the lottery tickets only need to be generated once. Instead of throwing them away after a single use, farmers store them inside files called plots. Each plot contains over 4 billion lottery tickets, yet a farming computer requires minimal resources to fetch a winning ticket because of how the plots are organized. And best of all, the plots can be reused for the disk’s life.

The PoST consensus enables Chia’s blockchain to have similar levels of security as Bitcoin’s while consuming less than 1% of the total network power.

Plot compression overview

While creating a Chia plot, the required amount of temporary storage is greater than the final size of the plot. After phase 1 of plotting, a plot is technically farmable but not represented in the smallest form possible. The latter phases of the plotting process perform algorithmic compression specific to the Chia plot format. We designed this compression to be maximally efficient while keeping 100% of the plot’s data intact. In other words, the original plot format is already compressed.

How can we compress the plots even more?

This new compression comes from a tradeoff between computation and storage. Rather than storing 100% of a plot’s data, some can be computed on the fly. As more data is omitted from a plot, the harvester must perform more computations to obtain a proof.

We have organized the level of plot compression into simple numbers, from 0 (no compression) to 7 (high compression). Each increase in the level of compression yields a linear decrease in the size of the resulting plot, with the tradeoff being an exponential increase in the computational power required for farming.

 

Compression level   GiB per plot   GB per plot   % reduction   % change in rewards
0                   101.30         108.77         0.00%         0.00%
1                    87.54          93.99        13.61%        15.72%
2                    86.03          92.37        15.10%        17.75%
3                    84.46          90.69        16.64%        19.93%
4                    82.86          88.97        18.23%        22.25%
5                    81.26          87.25        19.81%        24.67%
6                    79.65          85.52        21.39%        27.18%
7                    78.05          83.81        22.97%        29.79%
Table 1. Compression Level vs. plot size
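
The derived columns in Table 1 follow from the plot sizes with simple arithmetic. Here is a minimal sketch, assuming rewards scale with the number of plots that fit on a fixed amount of disk. The function names are ours, for illustration only. Because the sizes in the table are rounded, the computed reduction can differ from the printed one by a few hundredths of a percentage point.

```python
GIB = 2**30  # bytes in a gibibyte
GB = 10**9   # bytes in a gigabyte

UNCOMPRESSED_GIB = 101.30  # level 0 plot size from Table 1


def gib_to_gb(gib):
    """Convert a size in GiB to GB."""
    return gib * GIB / GB


def size_reduction(compressed_gib, baseline_gib=UNCOMPRESSED_GIB):
    """Fraction of disk space saved relative to an uncompressed plot."""
    return 1 - compressed_gib / baseline_gib


def reward_change(compressed_gib, baseline_gib=UNCOMPRESSED_GIB):
    """Fractional increase in rewards when the same disk holds more plots."""
    return baseline_gib / compressed_gib - 1


level_1_gib = 87.54  # level 1 plot size from Table 1
print(f"size:      {gib_to_gb(level_1_gib):.2f} GB")
print(f"reduction: {size_reduction(level_1_gib):.2%}")
print(f"rewards:   {reward_change(level_1_gib):.2%}")
```

The rewards column grows faster than the reduction column because the gain is 1 / (1 - reduction) - 1: every byte saved is refilled with more plots.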

 

Farming with the existing plot format requires only a tiny amount of computational resources. Compressing plots to level 1 results in a 13.61% reduction in size, yielding 15.72% more farming rewards, while requiring negligible additional processing power. Each level drops one more bit (e.g., level 1 stores 16 bits, level 2 stores 15 bits), so the size reduction is linear while the compute needed for decompression grows exponentially. That exponential growth catches up quickly, making compression impractical beyond a certain point. More levels can be added by tuning the plot parameters. In the future, optional GPU harvesting, more efficient algorithms, and offloads for the proof quality check will allow a few more levels with reasonable overhead. Once those enhancements are written, expect us to unlock a few additional compression levels.

What the Hellman? Time-space tradeoffs in Chia proofs of space

The original paper behind the idea for the Chia Proof of Space format, Beyond Hellman and the Chia Proof of Space Construction, documented the concept of “plotting tables,” which are designed to reduce the effectiveness of a “time-space tradeoff”.

We construct functions that provably require more time and/or space to invert.

In Chia, we want disk space to be the scarce resource for blockchain consensus, not compute cycles, hence the name…Proofs of Space. Plotting can take time since it only has to be performed once per plot, but proving (farming and harvesting) needs to be fast and efficient. Before the mainnet launch, Chia released some sample code showing how a Hellman Attack might actually work. In Chia, with plotting tables, performing these time-space (or compute-space) tradeoffs gets exponentially harder as you traverse down the tables. Bram chose seven tables as a reasonable tradeoff between the amount of disk I/O required for a full proof of space lookup, protection from time-space tradeoffs, and plotting time.

Figure 1. Chia plotting tables that make up a plot file

What does a k=32 plot actually look like?

Figure 2.1. A standard k=32 plot file

Drop the Bit

A k=32 plot of 101.3 GiB, or 108.8 GB, contains around 2^32 proofs (4.29 billion proofs of space). The new compression method, which we are calling “bit dropping”, removes table 1 entirely for levels 1-7 and places a reduced number of bits from each table 1 entry into table 2 in place of the back pointers. Compression level 1 stores only 16 of the 32 bits required for a k=32 entry. This is problematic when you go to fetch an entire proof of space, of course, because half of the bits are missing. They now need to be generated on the fly, which requires some compute to look at all possible matches for the missing bits. This trick can be extended by dropping more data, at the cost of generating even more data on the fly when fetching a full proof of space! This is a form of lossless compression, as the k=32 plot will still contain the same number of proofs (2^32), and the same plot ID will always generate the same proofs of space.
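
To get a feel for why each dropped bit doubles the work, here is a toy sketch. It is not the Chia plot format. It assumes a made-up `fingerprint` function (a truncated SHA-256) standing in for whatever the harvester uses to recognize the correct entry, keeps only the high bits of a 32-bit value, and brute-forces the missing low bits. All names here are hypothetical.

```python
import hashlib

K = 32  # bits per entry in a k=32 plot


def fingerprint(x: int) -> bytes:
    """Toy stand-in for the data that lets a harvester recognize the right entry."""
    return hashlib.sha256(x.to_bytes(K // 8, "big")).digest()[:8]


def drop_bits(x: int, dropped: int) -> int:
    """Keep only the high (K - dropped) bits of x."""
    return x >> dropped


def recover(prefix: int, dropped: int, target: bytes):
    """Brute-force the dropped low bits; return (value, candidates tried)."""
    base = prefix << dropped
    for tried, low in enumerate(range(1 << dropped), start=1):
        candidate = base | low
        if fingerprint(candidate) == target:
            return candidate, tried
    raise ValueError("no candidate matched the fingerprint")


x = 0xDEADBEEF
for dropped in (4, 8, 12):
    restored, tried = recover(drop_bits(x, dropped), dropped, fingerprint(x))
    assert restored == x  # nothing is lost, it just costs compute to get it back
    print(f"dropped={dropped:2d} bits  search space={1 << dropped:5d}  tried={tried}")
```

The stored prefix shrinks by one bit per step, while the worst-case search doubles. That is the linear-versus-exponential shape described above, in miniature.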

Figure 2.2. A k=32 plot file with compression level 1
Figure 3. The new compressed plot format no longer has table one, but instead its entries are stored in table 2 with a reduced bit count. 

This explains how the different levels of compression work. Each compression level drops more data, requiring more compute at harvest time. The compute cycles needed to recreate the data on the fly scale exponentially until the amount of compute required reaches that of creating a full plot. Currently, Chia compression level 7 reduces the plot size by 23%, meaning that a farmer can expect to earn 29.8% more rewards at the cost of extra energy while farming.

Figure 4. Measured compute for full proof of space decompression vs. compression level on an early build of Bladebit.

In time / space tradeoffs, plot size decreases linearly while compute required for farming increases exponentially.

Thankfully, by design, Chia farming consumes a small amount of energy compared to Proof of Work. Even with the higher levels of compression, most farmers will only see a 20-25% increase in energy consumption. We’re making iterative enhancements to this process, increasing efficiency. The exponential scaling stops farmers from consuming tremendous amounts of energy and getting an unfair advantage.

More space – Introduction to Effective Capacity.

Effective capacity is denoted TBe, pronounced “terabytes effective”. This is the usable storage space after replication, capacity utilization, and data reduction (compression, deduplication, etc.). For Chia farming with plot compression, we will normalize the effective capacity to an uncompressed k=32 plot of 101.3 GiB / 108.8 GB, with approximately 4.3 billion proofs of space. In Chia farming, rewards are earned on effective capacity after decompression, so TBe is the number farmers will need to use to estimate rewards. Since users will have a mix of plots with various k-values, the number of plots alone is not enough to estimate rewards.
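
As a rough illustration, effective capacity can be estimated from raw capacity and the plot sizes in Table 1. This sketch assumes every plot is a k=32 plot at a single compression level and that the disks are completely full; the function name is ours.

```python
BASELINE_GIB = 101.30  # uncompressed k=32 plot, the normalization point for TBe

# GiB per k=32 plot at each compression level, from Table 1
PLOT_GIB = {
    0: 101.30, 1: 87.54, 2: 86.03, 3: 84.46,
    4: 82.86, 5: 81.26, 6: 79.65, 7: 78.05,
}


def effective_capacity_tbe(raw_tb, level):
    """Raw capacity scaled by how many more plots fit at this compression level."""
    return raw_tb * BASELINE_GIB / PLOT_GIB[level]


raw_tb = 500
for level in sorted(PLOT_GIB):
    print(f"level {level}: {effective_capacity_tbe(raw_tb, level):6.1f} TBe")
```

A farm with mixed k-values or mixed levels would sum this per group of plots instead of applying one ratio to the whole farm.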

Figure 5. A farmer with 500 terabytes of raw capacity will have a higher effective capacity at higher compression levels as their estimated farming capacity (e.g., by a pool) will appear higher

Compression TCO – No Free Lunch

It wouldn’t be a JM blog post without a good spreadsheet. A fundamental question for a farmer is what compression level to pick. It may be tempting to plot at the highest compression level to obtain the most storage efficiency and, therefore, the most rewards, but there is a compute overhead to decompress the plots during farming, which requires extra energy vs. baseline farming. Disk I/O is not affected much by the decompression. The inputs that materially affect the output are:

  • Compression level
  • Number of plots
  • Farming system compute capability
  • Power cost ($/kWh)
  • Baseline power efficiency (in W/TB over an entire system)

To find the compute and energy overhead for farming, a farmer needs to create a single plot at a specific compression level, then use the proof of space tool to perform a proof quality check and a full proof of space for a given number of proofs. The tool can be run in a loop for around 9 seconds to simulate the signage point time for Chia. Multiply the number of proofs completed per signage point by the plot filter to get the number of plots a given system can farm with a given amount of compute. If the CPU is overwhelmed and takes too long to decompress the proofs of space, you must step down the compression level. Measuring power and energy is required to get an accurate view of the total cost of ownership and ROI because higher compression levels will increase energy consumption and operational expenditures.
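
The multiplication described above can be sketched as follows. The plot filter value of 512 is an assumption that does not appear in this post, and `seconds_per_lookup` is a hypothetical number you would measure yourself with the proof of space tool at your chosen compression level.

```python
SIGNAGE_POINT_SECONDS = 9.0  # approximate signage point time used in this post
PLOT_FILTER = 512            # assumption: roughly 1 in 512 plots passes the filter


def lookups_per_signage_point(seconds_per_lookup, interval=SIGNAGE_POINT_SECONDS):
    """Whole number of lookups that complete within one signage point."""
    return int(interval // seconds_per_lookup)


def farmable_plots(seconds_per_lookup, plot_filter=PLOT_FILTER,
                   interval=SIGNAGE_POINT_SECONDS):
    """Plots one system can farm if only 1 in `plot_filter` needs a lookup."""
    return lookups_per_signage_point(seconds_per_lookup, interval) * plot_filter


measured = 0.25  # hypothetical: seconds per quality check plus full proof
print(f"lookups per signage point: {lookups_per_signage_point(measured)}")
print(f"farmable plots:            {farmable_plots(measured)}")
```

This is a ceiling, not a target. It assumes lookups run back to back on otherwise idle hardware, so a real farm should leave headroom below it.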

You can use this spreadsheet to estimate the optimal compression level after measuring the estimated compute and energy overhead at a given farm size.
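
For readers who prefer code to spreadsheets, here is a minimal sketch of the energy side of that estimate. It is not the spreadsheet itself. It assumes the farm runs around the clock and that compression adds a fixed fractional overhead on top of baseline farming power. The inputs below are hypothetical placeholders; use your own measurements.

```python
HOURS_PER_YEAR = 24 * 365


def annual_energy_cost(farm_tb, watts_per_tb, usd_per_kwh, overhead=0.0):
    """Yearly electricity cost in USD for a farm running 24/7.

    overhead is the fractional energy increase from decompression
    (for example, 0.25 means 25% more than baseline farming).
    """
    watts = farm_tb * watts_per_tb * (1 + overhead)
    kwh_per_year = watts * HOURS_PER_YEAR / 1000
    return kwh_per_year * usd_per_kwh


# Hypothetical inputs: 500 TB of raw disk, 1 W/TB baseline, $0.10 per kWh
baseline = annual_energy_cost(500, 1.0, 0.10)
compressed = annual_energy_cost(500, 1.0, 0.10, overhead=0.25)
print(f"baseline:   ${baseline:,.2f} per year")
print(f"compressed: ${compressed:,.2f} per year")
print(f"extra cost: ${compressed - baseline:,.2f} per year")
```

Weigh the extra cost against the value of the additional rewards at your chosen level to see whether a higher level pays for itself.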

FAQ

Q: Why are you releasing a compressed plot format?
A: If a technology can be built, then someone will build it. In the case of compressed plots, if only a small number of farmers had access to this technology, they would gain an advantage over everyone else. By releasing a free and open-source compressed plot format, we enable everyone to use the most advanced technology to create their plots. No one will have a tactical advantage over anyone else.

Q: Will I be required to use compressed plots?
A: No. Compression is optional.

Q: Can I compress my existing plots?
A: No. Even though this technically could be possible, it would require more time and energy than simply creating new plots.

Q: Will I be at a disadvantage if I don’t use compressed plots?
A: Currently, there is parity in farming – everyone uses the same plot format, so the only way to win more rewards is to store more plots. If you choose not to replot, then as others do replot, your relative share of the Netspace will slowly decrease, even if the number of disks that store plots stays the same. This will place you at a slight disadvantage relative to everyone else. However, it also means that you won’t spend any additional time or electricity on replotting. The decision is entirely yours.

Q: In the future, will there be any additional plot formats that compress the plots even more?
A: Not likely. When we released the current plot format back in 2020, we knew that certain tradeoffs and improvements would be possible with the right time and technology, which we are now seeing. However, today we are approaching the theoretical limits of compression. You are unlikely to see a new format that further compresses the plots by more than a tiny percentage. We mentioned in the post that we will likely bring a few more levels that approach the limit once we have the capability to offload the decompression.

Q: Is the Bladebit Beta compression going to be the final plot format?
A: The Chia client will support the version released in the beta program. We have identified minor algorithmic optimizations that could shrink the plot format by an additional ~1-2%, and we may release them at some point. When farming on SSDs becomes viable towards the second half of the decade, the plot format can be changed again to take advantage of the low latency of flash vs. HDDs. These small changes have diminishing returns compared with a major decrease in plot size, like the one from the compute-space tradeoffs that we have implemented.

Q: Does it matter what hardware I use to create my plots?
A: All plots created by Bladebit with the same k-value and compression level will look the same, regardless of the hardware used to create them.

Q: Can I continue to use k=32 plots?
A: Yes. We have no plans to increase the minimum k at this time. If we ever do intend to increase this value, you will be notified at least one year in advance.

Q: Can I mix and match the compression levels of my plots?
A: Yes, you can mix and match compression levels of plots. A farmer should use the tools to baseline the compression level, compute overhead, and power to understand how it affects their farming profitability.

Q: Is the method to compress the plots called a Hellman Attack?
A: It is a similar method. Both the bit dropping method of time-space tradeoffs and a Hellman Attack work by removing one or more tables, but recreating the entries is far more complex in a Hellman Attack and requires increased time for plotting. The method we use reduces plotting time, because the removed tables are never written to disk.

Q: When will it be released?
A: Bladebit Diskplot and Ramplot with support for compressed plotting are feature complete and undergoing internal testing. Chia client support for farming compressed plots is being finalized, and we will release the entire package with cross-platform and OS support into the beta program when it is feature complete.

Q: Will it take less time to create a compressed plot?
A: Yes, slightly. With the method we are using for the compute-space tradeoff, some of the tables no longer need to be stored in the final plot file, which reduces the plotting time by a small amount.

Q: How will this affect my Raspberry Pi, which runs as a harvester?
A: Farmers and harvesters with low power budgets and limited compute capability will be able to run at lower levels of compression (levels 1-4). The amount of storage farmed has a major impact on the compute overhead. We will develop enhancements to the harvester protocol that allow a centralized farmer to decompress plots from remote harvesters, improving energy efficiency and allowing for low-cost, low-compute harvester support.

Q: Sounds awesome! When are we releasing these enhancements to the harvester protocol?
A: You can always check roadmap.chia.net for the latest on priority and planning for features, as well as leave comments for our product team.

original source – www.chia.net/2023/01/20/plot-compression-is-here/
