@LuDiting

In the current implementation, a lagging follower's progress state on the Leader is Probe. This state is not released in handle_append_response until the next new write.

However, there is another case: when a node is unreachable and the upper-level application calls report_unreachable, the Leader marks that node as Probe. When the node rejoins the cluster, it is still in Probe state. If the cluster receives no new writes during this period, the node has in fact caught up and should be in Replicate, but its Probe state is not updated until the next new write.

This has no effect on consistency, but some upper-level applications may base operations on a node's progress state. I therefore propose adding a step in handle_heartbeat_response that moves a node whose progress has caught up to the Replicate state.
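
To make the proposal concrete, here is a minimal sketch of the check, written against a simplified model rather than the actual raft-rs types. `Progress`, `ProgressState`, and `promote_if_caught_up` below are hypothetical stand-ins for illustration, and the "caught up" condition (`matched >= leader_last_index`) is an assumption about what "keeps up with the progress" means; the real patch may differ.

```rust
/// Simplified stand-in for the leader's per-follower replication state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProgressState {
    Probe,
    Replicate,
    Snapshot,
}

/// Hypothetical, pared-down progress record (not the raft-rs `Progress` type).
#[derive(Debug, Clone)]
pub struct Progress {
    /// Highest log index known to be replicated on the follower.
    pub matched: u64,
    /// Index of the next entry the leader will send to the follower.
    pub next_idx: u64,
    pub state: ProgressState,
}

impl Progress {
    /// Switch to Replicate and continue sending from `matched + 1`.
    pub fn become_replicate(&mut self) {
        self.state = ProgressState::Replicate;
        self.next_idx = self.matched + 1;
    }
}

/// Sketch of the extra step a heartbeat-response handler could run.
/// Assumption: a follower counts as caught up when its `matched` index
/// has reached the leader's last log index.
/// Returns true if the state was changed.
pub fn promote_if_caught_up(pr: &mut Progress, leader_last_index: u64) -> bool {
    if pr.state == ProgressState::Probe && pr.matched >= leader_last_index {
        pr.become_replicate();
        return true;
    }
    false
}
```

In this model, a follower in Probe with `matched` equal to the leader's last index is promoted on the next heartbeat response, while a lagging follower or one in Snapshot is left untouched.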

@ti-chi-bot

ti-chi-bot bot commented Jan 17, 2025

Welcome @LuDiting! It looks like this is your first PR to tikv/raft-rs 🎉

@LuDiting LuDiting force-pushed the update_pr_when_handling_heartbeat_resp branch from 1707ac8 to d21b123 Compare January 17, 2025 14:18
@BusyJay
Member

BusyJay commented Jan 20, 2025

Makes sense. I see upstream has already fixed this issue in etcd-io/raft#52. How about just porting that patch instead?
