Implementing Storage Rent in RSK — Part 1

Shreemoy Mishra
Dec 9, 2020

Storage rent is a proposal to add a data access fee to transactions on the RSK blockchain network. A key difference from current storage-related gas costs (e.g. SSTORE or SLOAD) is that storage rent depends on both the amount of data accessed and the duration for which data are stored in blockchain state. Storage rent offers an additional method to protect the network from IO-based DoS attacks.

In this post, we go over the motivation and basic structure of the proposal. In part 2 of this series, we present benchmarking results to quantify the performance impact of implementing storage rent in the reference Java-based node (RSKJ).

What is state data?

For our purposes, state data refers to three types of information that are essential to execute transactions on Ethereum-like blockchains such as RSK:

  • Account information (native coin balance, nonce)
  • Smart Contract Code (stored as EVM bytecode)
  • Contract Storage (e.g. ERC20 token balances)

The state of the blockchain evolves as transactions get mined and included in blocks. For processing transactions, only the current state matters and this is based on the most recent block.

Aside: Historical state (e.g. some account’s balance on March 14, 2020) has no relevance for processing transactions. However, should anyone need it, historical state at any point can be computed by replaying the blockchain from genesis till that point. This is similar to what new nodes do when syncing with the blockchain for the first time. Since this is a very cumbersome process, historical state is often retrieved from service providers running full archive nodes who maintain several snapshots of historical state. But we digress. Let us get back to storing current state.

Where is current state data stored?

To process transactions and blocks quickly, the most recent state data should be in a blockchain node’s memory (RAM). However, as the number of users and smart contracts and the volume of application data grow, it becomes infeasible to keep all state data in RAM, and some part (if not most) of the state has to be stored on disk.

When a transaction needs state data that is not in a primary cache (or RAM), that data has to be fetched from a secondary layer cache or, in the worst case, from disk. As state size grows, such disk IO operations become more frequent. Increases in IO operations degrade node performance and, by extension, network performance suffers as well.

State IO (DoS) Attacks

Attackers can craft transactions that require many IO operations, leading to denial of service (DoS). The Ethereum network experienced such an attack in 2016. These are sometimes called the Shanghai attacks, apparently because they happened while Ethereum co-founder Vitalik Buterin was on a flight to Shanghai for a conference. In response, Ethereum raised the costs of several IO op-codes (in some cases by more than an order of magnitude) via EIP-150.

For example, consider an SLOAD operation in the Ethereum Virtual Machine (EVM). This opcode instructs the EVM to read a single piece of data from a contract’s storage cell. EIP-150 raised the cost of this operation from 50 gas to 200 gas. The cost of EXTCODE*, a family of IO-heavy op-codes, was increased from 20 to 700 gas. These changes raised the costs of carrying out an IO attack. Further increases were proposed in EIP-2929, which raises the cost of an SLOAD to 2100 gas and that of EXTCODE* to 2600 gas when the data is accessed for the first time in a transaction. Such gas-repricing decisions are one way to protect the network from IO attacks.
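To get a feel for what such repricing means for an attacker, the arithmetic can be sketched as follows. The gas budget of 1,000,000 is an arbitrary illustrative figure, not a network parameter, and the calculation ignores every other cost of a transaction.

```python
# SLOAD gas prices quoted in the text above.
SLOAD_GAS = {"before EIP-150": 50, "EIP-150": 200, "EIP-2929": 2100}


def max_sloads(gas_budget: int, gas_per_sload: int) -> int:
    """Upper bound on the number of SLOADs a gas budget can pay for."""
    return gas_budget // gas_per_sload


# Hypothetical budget, chosen only for illustration.
BUDGET = 1_000_000

for label, price in SLOAD_GAS.items():
    print(f"{label}: at most {max_sloads(BUDGET, price)} reads per {BUDGET} gas")
```

The same budget buys far fewer storage reads after each repricing, which is what makes an IO attack more expensive.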

Implementing storage rent offers another way of defending the network from DoS attacks.

To understand how storage rent works, let us recall how state is stored and retrieved.

How is state data stored?

State data is stored in data structures called tries, which are prefix trees. The prefix helps quickly organize and retrieve information related to particular accounts. While Ethereum uses a combination of hexary (radix-16) trees, RSK uses a unified binary tree called the Unitrie. Individual pieces of state data (e.g. account state, contract code, and contract storage) are stored within nodes in the Unitrie. These value-containing nodes typically have no children (i.e. they are usually leaf nodes).

The trie is an abstract object. Actual state data is backed up in a key-value datastore where each <key-value> pair refers to a particular node in the Unitrie. See the linked Unitrie post for details of how keys are created using hashes of account addresses and how values are encoded.
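As a rough illustration of the idea, here is a minimal binary prefix tree. It is a teaching sketch only: it has no path compression, no hashing, and no backing datastore, and the keys used are hypothetical rather than derived the way the Unitrie derives them.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, Optional


@dataclass
class Node:
    value: Optional[bytes] = None
    # Maps a single bit (0 or 1) to the child node on that branch.
    children: Dict[int, "Node"] = field(default_factory=dict)


def key_bits(key: bytes) -> Iterator[int]:
    """Yield the bits of a key, most significant bit first."""
    for byte in key:
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1


class BinaryTrie:
    def __init__(self) -> None:
        self.root = Node()

    def put(self, key: bytes, value: bytes) -> None:
        """Walk the key bit by bit, creating nodes as needed, then store the value."""
        node = self.root
        for bit in key_bits(key):
            node = node.children.setdefault(bit, Node())
        node.value = value

    def get(self, key: bytes) -> Optional[bytes]:
        """Follow the key's bits; return None if the path does not exist."""
        node = self.root
        for bit in key_bits(key):
            child = node.children.get(bit)
            if child is None:
                return None
            node = child
        return node.value


trie = BinaryTrie()
trie.put(b"\x01", b"account A")   # hypothetical one-byte keys
trie.put(b"\x03", b"account B")
print(trie.get(b"\x01"), trie.get(b"\x02"))
```

Keys that share a prefix share the nodes along that prefix, and the values sit at the end of each path.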

How does state evolve?

Whenever some state data changes, the relevant node is deleted from the trie. Updated values and freshly created accounts are stored by inserting new nodes into the trie.

Cost of storing state

There are direct and indirect costs of storing state. Existing storage-related opcodes charge users for the computational burden of these trie and IO operations. However, they do not provide strong incentives for limiting the duration of storing information. Once data is stored in state, it can stay there forever at no further cost.

Storage device prices keep falling. Thus, the direct cost of storing state is small, especially if it is on disk. This cost gets magnified due to replication by each full node, but it is still small for RSK. However, as mentioned earlier, increasing state size increases IO operations and the vulnerability to IO attacks. The goal of storage rent is to reduce these indirect costs.

Storage rent provides time incentives

Cloud data storage services like Amazon’s S3 charge for the size, duration, and even for bandwidth (data flows) from ‘storage buckets’. The idea of storage rent for blockchains is neither new nor surprising. There is a history of previous proposals in Ethereum and RSK. However, past proposals were quite complex and lacked broad community support. RSKIP113 distilled and simplified the previous approaches. The current proposal is even simpler.

Main elements of the Proposal

Who pays rent?

A transaction’s sender is charged storage rent for each data-containing Unitrie node touched by that transaction.

Example: An ERC20 token transfer transaction involves at least the following trie nodes:

  • account state nodes for the sender and recipient (of the token transfer)
  • the node containing the token contract’s account information
  • the node containing the contract’s bytecode
  • the contract’s storage root and the storage cells containing the sender’s and recipient’s token balances.

How is rent paid?

Rent is collected in units of gas just like usual transaction fees. To handle rent payments, a transaction’s gasLimit is internally divided into separate budgets to cover execution gas (current fees) and storage rent gas (proposed). All unused (leftover) execution gas and rent gas are refunded at the end of the transaction.

Will all transactions be charged rent?

No. Many transactions will not involve any rent collection. To avoid collecting small amounts, there are rent collection triggers. If the data stored in a trie node is modified by the transaction, then we collect rent for that node only if the outstanding amount of rent exceeds 1000 gas. However, if data from a trie node is accessed but not modified, then we require a higher bar of 10,000 gas.

Thus, if two accounts are routinely used for a particular token transfer, then it is very likely that most of these transactions will not involve any rent collection as the outstanding amounts will be too small.
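The trigger rule can be sketched as a small function. The function and constant names are hypothetical, and the sketch assumes that once a trigger fires the full outstanding amount is collected.

```python
MODIFIED_THRESHOLD = 1_000    # gas, applies when the node's data is modified
ACCESSED_THRESHOLD = 10_000   # gas, applies when the node's data is only read


def rent_to_collect(outstanding_rent: int, modified: bool) -> int:
    """Return the rent collected for one trie node in this transaction.

    Rent is collected only if the outstanding amount exceeds the
    threshold for the kind of access; otherwise nothing is collected.
    """
    threshold = MODIFIED_THRESHOLD if modified else ACCESSED_THRESHOLD
    return outstanding_rent if outstanding_rent > threshold else 0


print(rent_to_collect(5_000, modified=True))    # above the 1000 gas bar
print(rent_to_collect(5_000, modified=False))   # below the 10,000 gas bar
```

With 5000 gas outstanding, a write to the node triggers collection while a read does not.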

How is rent computed?

Rent tracking is based on a timestamp (lastRentPaidTime in Unix seconds) associated with each value-containing trie node. The rent for a node is based on:

  • the node’s size (in bytes), which is the approximate size of encoded data plus a storage overhead of 128 bytes.
  • the time since last rent payment (in seconds)
  • and a rental rate of 1/(2^21) gas per byte per second
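Put together, the three elements give a simple product. The sketch below is one way to write it, with two assumptions of ours that the proposal text does not spell out: the result is rounded down to whole gas, and a timestamp that lies in the future means no rent is due. Names other than lastRentPaidTime are hypothetical.

```python
STORAGE_OVERHEAD_BYTES = 128
RATE_SHIFT = 21  # rental rate of 1 / 2**21 gas per byte per second


def outstanding_rent(value_size: int, last_rent_paid_time: int, now: int) -> int:
    """Rent due (in gas) for one value-containing trie node.

    value_size          -- size of the encoded data in bytes
    last_rent_paid_time -- Unix seconds, the node's lastRentPaidTime
    now                 -- Unix seconds at the time of the computation
    """
    node_size = value_size + STORAGE_OVERHEAD_BYTES
    elapsed = max(0, now - last_rent_paid_time)   # assumption: no negative rent
    return (node_size * elapsed) >> RATE_SHIFT    # assumption: round down


# A node holding 10 bytes of data, untouched for one 365-day year.
ONE_YEAR = 365 * 86_400
print(outstanding_rent(10, last_rent_paid_time=0, now=ONE_YEAR))
```

Dividing by a power of two keeps the computation in integer arithmetic, which is one plausible reason for choosing a rate of this form.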

Example rent computations

The rental rate of 1/(2^21) is not easy to interpret. To place this in context, recall that a simple send transaction costs 21000 gas.

Here are some examples of what rent costs may look like.

  • An externally owned account. Rent: about 2075 gas/year
  • A smart contract with 10K bytes in code and 100 storage cells. Rent: about 380K gas/year
  • A very popular contract with 100,000 storage cells. Rent: about 44K gas/day

These costs are fairly low. Ordinary users may not even notice them.
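Another way to read these numbers is to ask how long a node must sit untouched before its rent crosses the collection thresholds of 1000 and 10,000 gas. The sketch below does that for a node with 10 bytes of encoded data, a size we picked because it reproduces the account figure of about 2075 gas/year; it is an assumption, not a value from the proposal.

```python
RATE_DENOMINATOR = 2 ** 21        # rental rate is 1 / 2**21 gas per byte per second
STORAGE_OVERHEAD_BYTES = 128
SECONDS_PER_DAY = 86_400


def days_until_rent_reaches(threshold_gas: int, value_size: int) -> float:
    """Days of accrual needed for a node's rent to reach threshold_gas."""
    node_size = value_size + STORAGE_OVERHEAD_BYTES
    seconds = threshold_gas * RATE_DENOMINATOR / node_size
    return seconds / SECONDS_PER_DAY


ACCOUNT_VALUE_SIZE = 10  # bytes, illustrative assumption

for threshold in (1_000, 10_000):
    days = days_until_rent_reaches(threshold, ACCOUNT_VALUE_SIZE)
    print(f"{threshold} gas is reached after about {days:.0f} days")
```

Under these assumptions the 1000 gas bar is reached after roughly 176 days and the 10,000 gas bar after almost five years, which is why frequently used accounts would rarely trigger collection.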

The rental rate has evolved over time and will be finalized according to community feedback, based on the level of deterrence needed against DoS attacks.

Rent for new trie nodes

Newly created accounts and trie nodes are charged some rent in advance. This is done by setting their rent paid timestamp to 6 months in the future. The advance charge can also be useful as a penalty for IO attacks where the attacker forces the node to look up non-existent nodes.
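A minimal sketch of the advance payment follows. It assumes that "6 months" means half of a 365-day year and that the amount charged is the rent for that period at the rate of 1/(2^21) gas per byte per second, rounded down; the proposal text does not fix these details, and the function names are hypothetical.

```python
STORAGE_OVERHEAD_BYTES = 128
RATE_SHIFT = 21                            # 1 / 2**21 gas per byte per second
SIX_MONTHS_SECONDS = 365 * 86_400 // 2     # assumption: half of a 365-day year


def initial_rent_timestamp(now: int) -> int:
    """lastRentPaidTime for a newly created node: six months ahead of now."""
    return now + SIX_MONTHS_SECONDS


def advance_rent(value_size: int) -> int:
    """Gas charged up front to cover the first six months of rent."""
    node_size = value_size + STORAGE_OVERHEAD_BYTES
    return (node_size * SIX_MONTHS_SECONDS) >> RATE_SHIFT


# A new node holding 10 bytes of encoded data (illustrative size).
print(advance_rent(10), initial_rent_timestamp(now=1_000))
```

Because the timestamp starts in the future, no further rent accrues for the node until the prepaid period has elapsed.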

Who is rent paid to?

As with existing transaction fees, all rent is passed along to miners. In RSK, this is managed by the REMASC fee distribution contract.

How does this impact the size of blocks?

Rent gas does not count towards the block-level gas limit. Therefore, there is no impact on the number of transactions that can be included in a block.

What happens if rent remains unpaid?

Nothing happens to the data, which remains in state. However, a transaction that touches it without enough gas for the rent will fail with an out-of-gas (OOG) exception and will not get executed. For transactions to get executed, they must include enough gas to cover all outstanding rent that is due, provided the outstanding amounts are high enough to be worth collecting (small amounts are not collected).

Will rent work with existing wallets?

We intend to implement rent in such a way that existing wallets can continue as before without breaking changes. This is easier in cases where wallets use RPC-based calls to estimateGas prior to sending transactions.

However, the implementation has to be mindful of some predictable failures. For example, some wallets simply use 21000 gas as a default for pure send transactions (and do not use estimateGas). One approach to handle such situations is to turn rent computations off for transactions that do not include any data (i.e. pure transfers).

We will seek community feedback for other breaking scenarios or edge cases.

Other benefits of implementing storage rent

  • Encourage sensitivity to duration of storing data. This is primarily for developers, contract designers, and service providers.
  • Limit gas arbitrage and discourage storage spam.
  • Rent timestamps can be used for state caching and node hibernation.
  • Since rent is an additional source of fees, it may permit us to reduce some existing fees.

Why has Ethereum not implemented storage rent?

They have tried (for years)! For example, see EIP-1682 and this proposal by Alexey Akhunov, which has also received attention. However, despite support from Vitalik Buterin, there is no consensus and discussions appear to have stalled. Mostly, the proposals in Ethereum have been a bit too complex to gain broad support. The emphasis has shifted to other things like Eth 2 and layer 2 scaling solutions.

Next steps

In part 2 of this series, we discuss some (engineering) performance consequences of implementing storage rent in the reference Java implementation of an RSK node, i.e. RSKJ.
