Bitcoin: What is the difference between CompactSize and VarInt encoding?

Bitcoin: Understanding CompactSize and VarInt Encoding

CompactSize and VarInt (variable-length integer) are two encodings for storing integers in as few bytes as possible. Both come from Bitcoin; Ethereum serializes its data with RLP (Recursive Length Prefix) and uses neither. The two names are often treated as interchangeable, although in the Bitcoin Core source code they refer to different formats. In this article, we will look at the differences between the two schemes and find out why they are so easily confused.

CompactSize encoding

Pieter Wuille’s definition of CompactSize is the clearest starting point. CompactSize is the variable-length integer format used in Bitcoin’s peer-to-peer protocol and in serialized transactions and blocks, usually to state how many items or bytes follow (for example, the number of inputs in a transaction or the length of a script). Values below 253 are stored in a single byte. Larger values are stored as a marker byte followed by a little-endian integer: 0xfd plus 2 bytes, 0xfe plus 4 bytes, or 0xff plus 8 bytes. This keeps small, common values short without compromising the integrity of the data.
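The CompactSize layout (one byte for values below 253, otherwise a marker byte 0xfd, 0xfe or 0xff followed by a 2-, 4- or 8-byte little-endian integer) can be sketched in Python. This is a minimal illustration rather than Bitcoin Core's code; the function names encode_compact_size and decode_compact_size are hypothetical.

```python
import struct
from typing import Tuple


def encode_compact_size(n: int) -> bytes:
    """Encode an integer in the range 0..2**64-1 as CompactSize."""
    if n < 0 or n > 0xFFFFFFFFFFFFFFFF:
        raise ValueError("value out of range for CompactSize")
    if n < 0xFD:
        return bytes([n])                      # single byte, no marker
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)  # marker + 2 bytes, little-endian
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)  # marker + 4 bytes, little-endian
    return b"\xff" + struct.pack("<Q", n)      # marker + 8 bytes, little-endian


def decode_compact_size(data: bytes) -> Tuple[int, int]:
    """Return (value, bytes consumed) for the CompactSize at the start of data."""
    first = data[0]
    if first < 0xFD:
        return first, 1
    width, fmt = {0xFD: (2, "<H"), 0xFE: (4, "<I"), 0xFF: (8, "<Q")}[first]
    (value,) = struct.unpack_from(fmt, data, 1)
    return value, 1 + width


print(encode_compact_size(252).hex())
print(encode_compact_size(253).hex())
print(decode_compact_size(bytes.fromhex("fe00000100")))
```

The decoder in this sketch accepts any well-formed input. A stricter parser would also reject non-minimal encodings, such as 0xfd followed by a value below 253.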

Greg Walker’s description of VarInt, by contrast, refers to this same format: his explanation of transaction fields uses the name VarInt for what Bitcoin Core calls CompactSize. The two authors therefore describe one encoding under two names, and much of the confusion starts there.

VarInt encoding

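In the Bitcoin Core source code, VarInt names a second, different format. It is a base-128 encoding written most significant group first: each byte carries 7 bits of the value, and the high bit is set on every byte except the last. So that every value has exactly one valid encoding, 1 is subtracted from the remaining value each time a continuation byte is added. Bitcoin Core uses this format for its on-disk data rather than on the network. The Bitcoin Wiki’s protocol documentation, on the other hand, uses the name “variable length integer” (var_int) for the CompactSize format.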
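For comparison, here is a minimal Python sketch of the base-128 VarInt format as it is implemented in Bitcoin Core's serialization code, to the best of my understanding: 7 bits per byte, most significant group first, a continuation bit on every byte except the last, and 1 subtracted at each step so that each value has a single encoding. The function names encode_varint and decode_varint are hypothetical.

```python
from typing import Tuple


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 VarInt (MSB group first)."""
    if n < 0:
        raise ValueError("VarInt cannot encode negative values")
    groups = []
    while True:
        # The first group collected is emitted last, so it has no continuation bit.
        groups.append((n & 0x7F) | (0x80 if groups else 0x00))
        if n <= 0x7F:
            break
        n = (n >> 7) - 1  # the "minus one" step removes redundant encodings
    return bytes(reversed(groups))


def decode_varint(data: bytes) -> Tuple[int, int]:
    """Return (value, bytes consumed) for the VarInt at the start of data."""
    n = 0
    for i, byte in enumerate(data):
        n = (n << 7) | (byte & 0x7F)
        if byte & 0x80:
            n += 1            # undo the subtraction made by the encoder
        else:
            return n, i + 1   # a clear high bit marks the final byte
    raise ValueError("truncated VarInt")


print(encode_varint(127).hex())
print(encode_varint(128).hex())
print(decode_varint(bytes.fromhex("82fe7f")))
```

Because of the subtraction step, the two-byte encodings start at 128 instead of overlapping with the one-byte range, so no value can be written in two different ways.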

The main differences between the VarInt encoding and the CompactSize encoding lie in their layout and in where they are used:

  • Structure: CompactSize uses a marker byte (0xfd, 0xfe or 0xff) followed by a fixed-width little-endian integer, while Bitcoin Core’s VarInt uses 7-bit groups with a continuation bit, most significant group first.
  • Purpose: CompactSize appears in network messages and in serialized transactions and blocks, while VarInt is used in Bitcoin Core’s internal on-disk storage.
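To make the structural difference concrete, the sketch below encodes the same values in both formats and prints the resulting bytes as hex. It assumes the marker-byte layout for CompactSize and the base-128 layout used by Bitcoin Core for VarInt; the helper names compact_size and varint are hypothetical.

```python
def compact_size(n: int) -> bytes:
    """Marker-byte layout: 1, 3, 5 or 9 bytes in total."""
    if n < 0xFD:
        return bytes([n])
    for marker, width in ((0xFD, 2), (0xFE, 4), (0xFF, 8)):
        if n < 1 << (8 * width):
            return bytes([marker]) + n.to_bytes(width, "little")
    raise ValueError("value does not fit in 64 bits")


def varint(n: int) -> bytes:
    """Base-128 layout: 7 bits per byte, most significant group first."""
    out = []
    while True:
        out.append((n & 0x7F) | (0x80 if out else 0x00))
        if n <= 0x7F:
            break
        n = (n >> 7) - 1
    return bytes(reversed(out))


# Print each value next to its CompactSize and VarInt bytes.
for value in (100, 200, 300, 70000):
    print(value, compact_size(value).hex(), varint(value).hex())
```

Neither format is always the shorter one: 200 fits in one CompactSize byte but needs two VarInt bytes, while 300 needs three CompactSize bytes and only two VarInt bytes. More importantly, the byte sequences differ, so a decoder written for one format will misread the other.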

Why the confusion?

Unsurprisingly, the names get mixed up. The Bitcoin Wiki and Greg Walker’s material call the network format a variable-length integer, or VarInt, while Pieter Wuille, following the Bitcoin Core source code, calls that format CompactSize and reserves the name VarInt for the base-128 format. The same word therefore points to two different encodings depending on the source.

In practice, the two schemes are not interchangeable, because the same bytes decode to different values under each. Anything that follows the protocol, such as a transaction parser or a wallet, needs CompactSize. VarInt only matters when reading the data that Bitcoin Core writes to disk.

Conclusion

The differences between CompactSize and VarInt encoding become apparent once the names are matched to the right formats. Both encodings store small integers in fewer bytes, but their layouts differ significantly. Pieter Wuille’s definition of CompactSize covers the marker-byte format used in the protocol, and the VarInt that Greg Walker describes is that same format under another name. Bitcoin Core’s own VarInt is the separate base-128 format used on disk.

By understanding these differences, developers can identify which encoding the data in front of them actually uses (CompactSize for transactions, blocks and network messages, VarInt for Bitcoin Core’s on-disk records) and avoid misreading counts and lengths.
