Skip to content

Conversation

@henrybear327
Copy link
Contributor

@henrybear327 henrybear327 commented Mar 20, 2025

Instead of marking the return time with -1 during operation data collection phase, we patch the failed request's response time during validation so we get to preserve the original data, and we get to avoid the problem we see mentioned in issue #19579.

Reference: #19579

Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.

@k8s-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: henrybear327
Once this PR has been reviewed and has the lgtm label, please assign serathius for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@henrybear327
Copy link
Contributor Author

henrybear327 commented Mar 20, 2025

@serathius for the record, the robustness test that I ran locally over lunch passed.

As discussed on Slack, I will now:

  • select some of the past issues, e.g. test-robustness-issue18089 to validate if it still fails as expected
  • check if the visualization looks ok by force-generating it

If all is good, we modify the logic to not set the return value to -1, but during validation when there is an error contained in the operation, we modify it to max int so we can preserve the original response time.

@henrybear327
Copy link
Contributor Author

henrybear327 commented Mar 20, 2025

test-robustness-issue17780, test-robustness-issue18089, and test-robustness-issue19179 are successfully reproduced.

@henrybear327
Copy link
Contributor Author

@henrybear327 henrybear327 changed the title [DO NOT REVIEW] Patch the return time with MaxInt64 in robustness test Patch the return time with MaxInt64 in robustness test Mar 21, 2025
@henrybear327 henrybear327 requested a review from serathius March 21, 2025 10:10
@henrybear327 henrybear327 marked this pull request as draft March 21, 2025 10:19

func (h History) Operations() []porcupine.Operation {
operations := make([]porcupine.Operation, 0, len(h.operations))
maxTime := h.lastObservedTime()
Copy link
Contributor Author

@henrybear327 henrybear327 Mar 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@serathius Because of this removal, the unit tests would have to be updated, due to this pre-condition failing.

But if I am not mistaken, each client is sending the requests sequentially, thus, the requests in the tests can be non-overlapping / sequentially executed requests, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, updating unit tests here makes sense.

@henrybear327 henrybear327 requested a review from serathius March 21, 2025 13:15
@henrybear327 henrybear327 force-pushed the issue/19579 branch 2 times, most recently from 968ab20 to 8ee6e98 Compare March 21, 2025 14:44
func uniquePutReturnTime(allOperations []porcupine.Operation, reports []report.ClientReport, persistedRequests []model.EtcdRequest, clientPutCount map[keyValue]int64) map[keyValue]int64 {
func uniquePutReturnTime(allOperations []porcupine.Operation, reports []report.ClientReport, clientPutCount map[keyValue]int64) map[keyValue]int64 {
earliestReturnTime := map[keyValue]int64{}
var lastReturnTime int64
Copy link
Contributor Author

@henrybear327 henrybear327 Mar 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@serathius can you help me understand what's the purpose of this variable? :) I am not sure what case is this variable trying to cover after reading through and experimenting with it during bug fixing.

Thank you!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not expecting need to change this code. Please revert

@henrybear327 henrybear327 marked this pull request as ready for review March 21, 2025 14:55
@henrybear327 henrybear327 changed the title Patch the return time with MaxInt64 in robustness test Patch the return time during validation with MaxInt64 in robustness test Mar 21, 2025
@codecov
Copy link

codecov bot commented Mar 21, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 68.78%. Comparing base (29878cf) to head (a1c249b).
Report is 8 commits behind head on main.

Additional details and impacted files

see 58 files with indirect coverage changes

@@            Coverage Diff             @@
##             main   #19633      +/-   ##
==========================================
+ Coverage   64.12%   68.78%   +4.65%     
==========================================
  Files         421      421              
  Lines       35854    35854              
==========================================
+ Hits        22993    24661    +1668     
+ Misses      11443     9764    -1679     
- Partials     1418     1429      +11     

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5e27246...a1c249b. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@henrybear327
Copy link
Contributor Author

/retest

operations []porcupine.Operation,
timeout time.Duration,
) (results Results) {
operations = patchResponseTimeOfFailedOperations(operations)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should happen before we patch operations, because then patching will not have opportunity to modify the return time.

persistedPutCount := countPersistedPuts(persistedRequests)
clientPutCount := countClientPuts(reports)
putReturnTime := uniquePutReturnTime(allOperations, reports, persistedRequests, clientPutCount)
putReturnTime := uniquePutReturnTime(allOperations, reports, clientPutCount)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please revert

@henrybear327
Copy link
Contributor Author

Discussed offline, we will start a new PR with minimal changes and improved append validation rules.

@henrybear327 henrybear327 deleted the issue/19579 branch March 27, 2025 09:56
@henrybear327 henrybear327 changed the title Patch the return time during validation with MaxInt64 in robustness test [Robustness test] Patch the return time during validation with MaxInt64 in robustness test Mar 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

3 participants