Conversation

@chyezh chyezh commented Nov 26, 2025

issue: #45865

@chyezh chyezh added this to the 2.6.7 milestone Nov 26, 2025
@sre-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: chyezh
To complete the pull request process, please assign jiaoew1991 after the PR has been reviewed.
You can assign the PR to them by writing /assign @jiaoew1991 in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sre-ci-robot sre-ci-robot added the size/S Denotes a PR that changes 10-29 lines. label Nov 26, 2025
@mergify mergify bot added dco-passed DCO check passed. kind/bug Issues or changes related to a bug labels Nov 26, 2025
@sre-ci-robot

[ci-v2-notice]
Notice: We are gradually rolling out the new ci-v2 system.

  • Legacy CI jobs remain unaffected; you can simply ignore ci-v2 if you don't want to run it.
  • Additional "ci-v2/*" checkers will run for this PR to ensure the new ci-v2 system is working as expected.
  • For tests that exist in both v1 and v2, passing in either system is considered PASS.

To rerun ci-v2 checks, comment with:

  • /ci-rerun-code-check // for ci-v2/code-check
  • /ci-rerun-build // for ci-v2/build
  • /ci-rerun-ut-integration // for ci-v2/ut-integration
  • /ci-rerun-ut-go // for ci-v2/ut-go
  • /ci-rerun-ut-cpp // for ci-v2/ut-cpp
  • /ci-rerun-ut // for all ci-v2/ut-integration, ci-v2/ut-go, ci-v2/ut-cpp
  • /ci-rerun-e2e-arm // for ci-v2/e2e-arm [master branch only]
  • /ci-rerun-e2e-default // for ci-v2/e2e-default [master branch only]

If you have any questions or requests, please contact @zhikunyao.


codecov bot commented Nov 26, 2025

Codecov Report

❌ Patch coverage is 47.36842% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.18%. Comparing base (6c0a80d) to head (6687802).
⚠️ Report is 1 commit behind head on master.

Files with missing lines Patch % Lines
internal/querycoordv2/task/executor.go 0.00% 7 Missing ⚠️
internal/querycoordv2/task/scheduler.go 75.00% 2 Missing and 1 partial ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #45877      +/-   ##
==========================================
+ Coverage   76.13%   76.18%   +0.05%     
==========================================
  Files        1869     1881      +12     
  Lines      292340   294241    +1901     
==========================================
+ Hits       222576   224171    +1595     
- Misses      62378    62631     +253     
- Partials     7386     7439      +53     
Components Coverage Δ
Client 78.17% <ø> (ø)
Core 82.75% <ø> (ø)
Go 74.30% <47.36%> (+0.03%) ⬆️
Files with missing lines Coverage Δ
internal/querycoordv2/task/scheduler.go 72.81% <75.00%> (-0.65%) ⬇️
internal/querycoordv2/task/executor.go 60.61% <0.00%> (-0.55%) ⬇️

... and 38 files with indirect coverage changes


chyezh commented Nov 26, 2025

/ci-rerun-ut-go

@mergify mergify bot added the ci-passed label Nov 26, 2025
log.Warn("node doesn't belong to any replica", zap.Error(err))
return
}
view := ex.dist.ChannelDistManager.GetShardLeader(task.Shard(), replica)

do we need to verify that the replicaID is the same as the one we execute with?

- leader := scheduler.distMgr.ChannelDistManager.GetShardLeader(task.Shard(), task.replica)
+ leader := scheduler.getReplicaShardLeader(task.Shard(), task.ReplicaID())
if leader == nil {
return merr.WrapErrServiceInternal("segment's delegator leader not found, stop balancing")

do we need to check that the target node is still under the replica?

probably it's safer to lock and drain all the balance tasks before a replica change? currently there seem to be many corner cases to handle.
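The "lock and drain" idea could be sketched with a plain sync.RWMutex (all names below are hypothetical stand-ins, not the actual querycoord types): balance tasks execute under the read lock, and a replica change takes the write lock, so it only proceeds once every in-flight balance task has drained.

```go
package main

import (
	"fmt"
	"sync"
)

// balanceGuard is a hypothetical sketch: balance tasks run under the
// read lock; a replica change takes the write lock, which blocks until
// every in-flight balance task has finished (drained).
type balanceGuard struct {
	mu sync.RWMutex
}

func (g *balanceGuard) runBalanceTask(task func()) {
	g.mu.RLock()
	defer g.mu.RUnlock()
	task()
}

func (g *balanceGuard) changeReplica(change func()) {
	g.mu.Lock() // waits for all balance tasks to drain
	defer g.mu.Unlock()
	change()
}

func main() {
	var g balanceGuard
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			g.runBalanceTask(func() { fmt.Println("balance task", i) })
		}(i)
	}
	wg.Wait()
	g.changeReplica(func() { fmt.Println("replica changed") })
}
```

This trades concurrency for simplicity: it avoids the per-task membership corner cases, at the cost of replica changes waiting on slow balance tasks.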


if we don't want replica changes to be mutually exclusive with balance, shouldn't we at least check that the replica exists and the node belongs to the replica in preProcess?
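A minimal version of that preProcess guard might look like this (the types and method names here are illustrative stand-ins, not the actual querycoordv2 APIs): fail fast if the task's replica no longer exists, or if the target node has been moved out of it.

```go
package main

import (
	"errors"
	"fmt"
)

// replica is a simplified stand-in for the querycoord replica object.
type replica struct {
	id    int64
	nodes map[int64]struct{}
}

func (r *replica) Contains(nodeID int64) bool {
	_, ok := r.nodes[nodeID]
	return ok
}

// replicaManager is a simplified lookup table of live replicas.
type replicaManager struct {
	replicas map[int64]*replica
}

// checkTaskReplica is the kind of guard the review suggests running in
// preProcess: cancel the task if the replica is gone or no longer
// contains the target node.
func (m *replicaManager) checkTaskReplica(replicaID, nodeID int64) error {
	r, ok := m.replicas[replicaID]
	if !ok {
		return errors.New("replica no longer exists, cancel task")
	}
	if !r.Contains(nodeID) {
		return errors.New("node no longer belongs to replica, cancel task")
	}
	return nil
}

func main() {
	m := &replicaManager{replicas: map[int64]*replica{
		1: {id: 1, nodes: map[int64]struct{}{100: {}}},
	}}
	fmt.Println(m.checkTaskReplica(1, 100)) // <nil>
	fmt.Println(m.checkTaskReplica(1, 200)) // node no longer belongs to replica, cancel task
	fmt.Println(m.checkTaskReplica(2, 100)) // replica no longer exists, cancel task
}
```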

// wait for new delegator becomes leader, then try to remove old leader
task := task.(*ChannelTask)
- delegator := scheduler.distMgr.ChannelDistManager.GetShardLeader(task.Shard(), task.replica)
+ delegator := scheduler.getReplicaShardLeader(task.Shard(), task.ReplicaID())

is there a possibility that the node does not belong to the replica anymore?



Development

Successfully merging this pull request may close issue #45865.

3 participants