
[Bug] Resolved the cache invalidation of the partition->leader shard in ClientCache #2576

Closed
1 task done
haohao0103 opened this issue Jul 12, 2024 · 1 comment · Fixed by #2588
Labels
bug Something isn't working rocksdb RocksDB backend

Comments

@haohao0103
Contributor

Bug Type

other exception / error

Before submit

  • I have confirmed and searched that there are no similar or duplicate problems in the existing issues and FAQ documents

Environment

  • Server Version: 1.5.0 (Apache Release Version)
  • Backend: RocksDB 5 nodes, SSD

Fix the cache invalidation issue of the partition->leader shard in ClientCache.

  • Set the initialization flag to false after resetting the cache, so that initialization can run again.
  • On a cache miss, update the cache with the result queried from PD.

Root cause: when the leader changes in the PD cluster, the cache is cleared, but the correct metadata returned by the subsequent query is never put back into the cache; the initialization flag also remains true, so the initialization step is never executed again.
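The two-step fix described above can be sketched as follows. This is a minimal illustration with hypothetical class and method names, not the actual HugeGraph ClientCache API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntFunction;

// Hypothetical sketch of the intended fix: reset drops the init flag,
// and a cache miss writes the PD query result back into the cache.
public class LeaderCacheSketch {
    // partition id -> leader store id
    private final Map<Integer, Long> leaderByPartition = new ConcurrentHashMap<>();
    // must be dropped together with the cache so initialization re-runs
    private volatile boolean initialized = false;

    // Step 1: after clearing the cache, set the flag to false.
    public void reset() {
        leaderByPartition.clear();
        initialized = false;
    }

    public void markInitialized() {
        initialized = true;
    }

    public boolean isInitialized() {
        return initialized;
    }

    // Step 2: on a miss, query PD and store the result in the cache.
    public long getLeader(int partId, IntFunction<Long> queryPd) {
        return leaderByPartition.computeIfAbsent(partId, queryPd::apply);
    }
}
```

`computeIfAbsent` keeps the miss-then-update path atomic per key, so concurrent callers do not race to overwrite each other's PD results.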

Expected & Actual behavior

---ts=2024-04-29 14:07:03;thread_name=grizzly-http-server-25;id=86;is_daemon=false;priority=5;TCCL=jdk.internal.loader.ClassLoaders$AppClassLoader@9e89d68 ---[2.22586ms] org.apache.hugegraph.store.client.NodeTxExecutor:doAction()
+---[0.02% 4.44E-4ms ] org.apache.hugegraph.store.term.HgTriple:getZ() #238
+---[0.02% 4.64E-4ms ] org.apache.hugegraph.store.term.HgTriple:getX() #239
+---[0.03% min=2.99E-4ms,max=3.36E-4ms,total=6.35E-4ms,count=2] org.apache.hugegraph.store.term.HgTriple:getY() #239
`---[99.69% 2.218854ms ] org.apache.hugegraph.store.client.NodeTxSessionProxy:doAction() #239

Vertex/Edge example

No response

Schema [VertexLabel, EdgeLabel, IndexLabel]

No response

@haohao0103 haohao0103 added the bug Something isn't working label Jul 12, 2024
@dosubot dosubot bot added the rocksdb RocksDB backend label Jul 12, 2024

dosubot bot commented Jul 12, 2024

To resolve the cache invalidation issue of the partition->leader shard in ClientCache, you need to ensure that the cache is properly reset and reinitialized when the leader changes in the PD cluster. Here are the key steps to address this:

  1. Set Initialization Flag to False: After resetting the cache, set the initialization flag to false to allow reinitialization.
  2. Update Cache on Miss: If the cache misses, update the cache with the results of the query from the PD.

The provided code already includes mechanisms to handle leader change events and invalidate the cache:

storeWatcher = pdWatch.watchNode(new PDWatch.Listener<>() {
    @Override
    public void onNext(NodeEvent response) {
        log.info("PDClient receive store event {} {}",
                 response.getEventType(), Long.toHexString(response.getNodeId()));

        if (response.getEventType() == NODE_PD_LEADER_CHANGE) {
            // pd raft change
            var leaderIp = response.getGraph();
            log.info("watchNode: pd leader changed to {}, current watch:{}",
                     leaderIp, pdWatch.getCurrentHost());
            closeStub(!Objects.equals(pdWatch.getCurrentHost(), leaderIp));
            connectPdWatch(leaderIp);
        }

        invalidStoreCache(response.getNodeId());
        eventListeners.forEach(listener -> {
            listener.onStoreChanged(response);
        });
    }

    @Override
    public void onError(Throwable throwable) {
        log.error("watchNode exception {}", throwable.getMessage());
        closeStub(false);
    }
});

Additionally, the updatePartitionLeader method is used to update the cache when the leader changes:

public void updatePartitionLeader(String graphName, int partId, long leaderStoreId) {
    KVPair<Metapb.Partition, Metapb.Shard> partShard = null;
    try {
        partShard = this.getPartitionById(graphName, partId);

        if (partShard != null && partShard.getValue().getStoreId() != leaderStoreId) {
            var shardGroup = this.getShardGroup(partId);
            Metapb.Shard shard = null;
            List<Metapb.Shard> shards = new ArrayList<>();

            for (Metapb.Shard s : shardGroup.getShardsList()) {
                if (s.getStoreId() == leaderStoreId) {
                    shard = s;
                    shards.add(Metapb.Shard.newBuilder(s)
                                           .setStoreId(s.getStoreId())
                                           .setRole(Metapb.ShardRole.Leader).build());
                } else {
                    shards.add(Metapb.Shard.newBuilder(s)
                                           .setStoreId(s.getStoreId())
                                           .setRole(Metapb.ShardRole.Follower).build());
                }
            }

            if (config.isEnableCache()) {
                if (shard == null) {
                    cache.removePartition(graphName, partId);
                }
            }
        }
    } catch (PDException e) {
        log.error("getPartitionException: {}", e.getMessage());
    }
}
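Note that `updatePartitionLeader` above rebuilds the `shards` list with refreshed roles, but only the removal path touches the cache, which matches the bug description: the corrected metadata is never written back on a hit. A self-contained sketch of the missing write-back, using simplified stand-in types rather than the real Metapb or ClientCache classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the write-back the bug description says is missing:
// after rebuilding shard roles for a new leader, store the refreshed
// group in the cache instead of only handling the removal case.
// Shard, Role, and the cache map are simplified stand-ins.
public class ShardGroupWriteBack {
    enum Role { Leader, Follower }

    record Shard(long storeId, Role role) {}

    // partition id -> cached shard group
    private final Map<Integer, List<Shard>> cache = new ConcurrentHashMap<>();

    public void updatePartitionLeader(int partId, long leaderStoreId,
                                      List<Shard> group) {
        List<Shard> rebuilt = new ArrayList<>();
        boolean leaderFound = false;
        for (Shard s : group) {
            boolean isLeader = s.storeId() == leaderStoreId;
            leaderFound |= isLeader;
            rebuilt.add(new Shard(s.storeId(),
                                  isLeader ? Role.Leader : Role.Follower));
        }
        if (!leaderFound) {
            cache.remove(partId);        // stale group: drop the entry
        } else {
            cache.put(partId, rebuilt);  // write the refreshed roles back
        }
    }

    public List<Shard> get(int partId) {
        return cache.get(partId);
    }
}
```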

Ensure these methods are correctly invoked and that the cache is properly configured to handle these updates; this will address the cache invalidation issue effectively.


haohao0103 added a commit to haohao0103/incubator-hugegraph that referenced this issue Jul 12, 2024
Resolved the cache invalidation of the partition->leader shard in ClientCache
haohao0103 added a commit to haohao0103/incubator-hugegraph that referenced this issue Jul 15, 2024
Resolve the cache invalidation of the partition->leader shard in ClientCache