Skip to content

Latest commit

 

History

History
113 lines (98 loc) · 7.1 KB

tip-555.md

File metadata and controls

113 lines (98 loc) · 7.1 KB
Tip: 555
Title: Network upgrade logic optimization
Author: allen.cheng@tron.network
Status: Final
Type: Standards Track
Category: core
Date: 2023-06-09

Simple Summary

This TIP is to specifically explain the current problems and solutions in the network upgrade logic in java-tron.

Abstract

The network upgrade strategy currently adopted by java-tron is to complete the upgrade after reaching the specified network upgrade time and the minimum percentage of upgraded SRs. On this basis, specific business logic upgrades can also be implemented through proposals. After a SR produces a block, the version (java-tron version currently used by the SR) field will be filled in the block header. Nodes will record the SR's version when processing the block, and then confirm whether the particular version meets the upgrade conditions.

Motivation

Currently, the network upgrade logic adopted by java-tron has some problems in certain scenarios. For example, in the scenario where the number of SRs changes, the upgraded version may roll back, or the state array may be accessed out of bounds; the lower version cannot be upgraded to the higher version, etc. In order to avoid abnormalities during network upgrades, it is necessary to solve these problems.

Rationale

The current network upgrade implementation logic has the following problems:

Percent Calculation Problem

When calculating the percentage of upgraded SRs, the denominator uses the number of current SR lists. If the number of SR lists changes from 5 to 6, the calculation results for historical versions may be wrong. For example, as shown in Figure 1, assuming a percentage of 80%, versions 1 and 2 have been upgraded. After the status of the non-upgraded version is reset during the maintenance period, the result becomes the picture on the right. The states of versions 1 and 2 have also been reset, because (4 / 6 < 80%). Reset results are not as expected. image

Version Roll Back

case 1

If an SR actively downgrades the local version, other nodes will perform downgrading when processing the blocks produced by this SR, and reset the bit corresponding to the status of the not yet upgraded version that is greater than this SR’s version. The code is as the following,

private void downgrade(int version, int slot) {
    for (ForkBlockVersionEnum versionEnum : ForkBlockVersionEnum.values()) {
      int versionValue = versionEnum.getValue();
      if (versionValue > version) {
        byte[] stats = manager.getDynamicPropertiesStore().statsByVersion(versionValue);
        if (!check(stats) && Objects.nonNull(stats)) {
          stats[slot] = VERSION_DOWNGRADE;
          manager.getDynamicPropertiesStore().statsByVersion(versionValue, stats);
        }
      }
    }
  }
  
  private boolean check(byte[] stats) {
    if (stats == null || stats.length == 0) {
      return false;
    }
    for (int i = 0; i < stats.length; i++) {
      if (check[i] != stats[i]) {
        return false;
      }
    }
    return true;
  }

The check method is a full SR check. Using this method to check the version upgraded by percentage will cause problems. For example, the state array of version 2 is (1, 1, 0, 1, 1). Assuming that the passing percentage is 80%, then version 2 has been upgraded. If the second SR is downgraded to version 1, the state array of version 2 will change to (1, 0, 0, 1, 1), which causes the upgraded version to be downgraded. This is not what should be expected.

case 2

After the maintenance period is processed, an SR is added at the end of the SR list. The version of the newly added SR has an upgraded version but not the latest one. Afterward, when processing the newly added SR block, the state array of its version will be reset since the length of the array does not match the SR list length, causing the upgraded version to become a not upgraded one. The code looks like this:

  byte[] stats = manager.getDynamicPropertiesStore().statsByVersion(version);
  if (Objects.isNull(stats) || stats.length != witnesses.size()) {
    stats = new byte[witnesses.size()];
  }

State array access out of bounds

After the maintenance period is processed, an SR is added at the end of the SR list, and if the version of the newly added SR is not the latest version, then an exception of out-of-bounds access to the state array will occur during the downgrade process. As shown in the example in Figure 2. image A new SR5 is added, and its version is v2. When processing the SR5 version, the corresponding state in the state array of the version higher than v2 will be reset during the downgrade process. Since the length of the state array of v3 is 4, the array access would be out of bounds.

Historical versions cannot be upgraded

When dealing with block versions, when the version meets the upgrade conditions, the historical version lower than this version also needs to be upgraded, the code is as follows:

if (check(stats)) {
    upgrade(version, stats.length);
    return;
}

The check method is a full SR check. For example, if the state array of version v2 is (1, 1, 0, 1, 1), it has been upgraded, then the non-upgraded version lower than v2 should be upgraded, but the result is not what should be expected.

Specification

The upgraded version of our specification should not be allowed to be downgraded.

Implementation

For the problems described above, I propose the following solutions:

Percent Calculation Problem

When calculating the percentage of upgraded SRs, the denominator uses the size of state array.

Version Roll Back

case 1

During the downgrade process, use the pass method instead of the check method to determine whether the version has been upgraded. That is, it is judged whether the version is upgraded according to the rules corresponding to the version number.

case 2

The state array of the upgraded version is not allowed to be reset.

State array access out of bounds

In order to avoid a series of problems caused by different lengths of state arrays, the reset operation can be placed after the consensus module processing (new SR), which solves the problem of state array access out of bounds. The code modification looks like this:

    boolean flag = chainBaseManager.getDynamicPropertiesStore().getNextMaintenanceTime()
        <= block.getTimeStamp();
    if (flag) {
      proposalController.processProposals();
    }
    if (!consensus.applyBlock(block)) {
      throw new BadBlockException("consensus apply block failed");
    }
    if (flag) {
      chainBaseManager.getForkController().reset();
    }

Historical versions cannot be upgraded

Use the pass method (according to the rules corresponding to the version number to determine whether the version has been upgraded) to judge whether the version meets the upgrade condition, and upgrade the version that has not completed the upgrade history after the upgrade condition is met.