Conversation

@ggevay ggevay commented Nov 2, 2025

This is just @aljoscha's #29673 rebased. Gonna run a new Nightly.

There was a somewhat scary conflict resolution in sequence_create_source: #29673 deleted the closure of the catalog_transact_with_ddl_transaction call and moved its functionality into handle_create_source, but in the meantime the closure was changed on main by 6dd815f (around the coord.catalog().get_entry(&item_id).latest_global_id(); line of this commit). I applied the same change to handle_create_source: I copied the line from 6dd815f that calls latest_global_id(), made it use the item_id argument given to handle_create_source (this argument was previously unused), and made the IngestionDescription::new call use the new item_global_id instead of source.global_id. I hope this is correct.
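
For illustration, here is a minimal toy model of the shape of that change. Everything below is a simplified stand-in, not the actual Materialize code; only the names `handle_create_source`, `item_id`, `latest_global_id`, `item_global_id`, and `IngestionDescription::new` come from the description above.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct CatalogItemId(u64);

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct GlobalId(u64);

struct CatalogEntry {
    global_ids: Vec<GlobalId>,
}

impl CatalogEntry {
    // Toy stand-in for the latest_global_id() call copied over from 6dd815f.
    fn latest_global_id(&self) -> GlobalId {
        *self.global_ids.last().expect("entry has at least one GlobalId")
    }
}

struct Catalog {
    entries: HashMap<CatalogItemId, CatalogEntry>,
}

impl Catalog {
    fn get_entry(&self, item_id: &CatalogItemId) -> &CatalogEntry {
        &self.entries[item_id]
    }
}

#[derive(Debug, PartialEq, Eq)]
struct IngestionDescription {
    ingestion_id: GlobalId,
}

impl IngestionDescription {
    fn new(ingestion_id: GlobalId) -> Self {
        Self { ingestion_id }
    }
}

// The shape of the fix: use the previously unused `item_id` argument to look
// up the latest GlobalId, and pass that to IngestionDescription::new instead
// of a possibly stale source.global_id.
fn handle_create_source(catalog: &Catalog, item_id: CatalogItemId) -> IngestionDescription {
    let item_global_id = catalog.get_entry(&item_id).latest_global_id();
    IngestionDescription::new(item_global_id)
}

fn main() {
    let mut entries = HashMap::new();
    entries.insert(
        CatalogItemId(3),
        CatalogEntry { global_ids: vec![GlobalId(7), GlobalId(9)] },
    );
    let catalog = Catalog { entries };
    // The ingestion uses the latest GlobalId of the item, GlobalId(9).
    assert_eq!(
        handle_create_source(&catalog, CatalogItemId(3)),
        IngestionDescription::new(GlobalId(9)),
    );
}
```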

Nightly runs:

Motivation

Tips for reviewer

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

aljoscha and others added 29 commits November 2, 2025 14:06

NOTE: This is the first of two commits. Here we add the basic scaffolding
for deriving implications from catalog changes, the parsing logic, and all
the plumbing. In the next commit we move code over from closures in the
sequencer and from the drop logic in ddl.rs.

First step towards MaterializeInc/database-issues#8488

This change introduces a new architectural pattern for handling
implications for in-memory state, controller changes, or anything else
that might have to be done as a result of updating the catalog. Instead
of directly updating state or instructing controllers within the
sequencer logic (via closures) or in the drop logic of ddl.rs, we now
derive these implications from catalog state updates and apply them in a
separate phase.

Key architectural changes:
- Introduces `ParsedStateUpdate`, which represents catalog changes that
  need to be applied
- Adds a new `catalog_update_implications` module containing logic to
  derive and apply implications from consolidated catalog updates
- Implements `apply_catalog_update_implications`, which processes catalog
  updates and derives implications for items, clusters, replicas, ongoing
  SELECTs, subscribes, etc.
- Adds consolidation logic to handle "fluctuating" updates (e.g., from
  DROP OWNED BY) where intermediate states cancel out, ensuring only net
  changes are applied
- Moves update logic from inline closures in sequencer operations
  (particularly in `create_source_inner` and related methods) to the new
  centralized implication processing

The refactoring specifically handles:
- Table creation/deletion
- Source creation/deletion
- Storage policy initialization for newly created collections
- Proper handling of temporary items alongside durable ones

This approach provides better separation of concerns between catalog
mutations and applying their implications, making the system more
maintainable and laying the foundation for eventually handling all
implications through this pattern. (A toy sketch of the consolidate /
parse / apply pipeline follows the commit list below.)

Future work will extend this pattern to handle all catalog item types
including indexes, materialized views, sinks, and other database
objects.

NOTE: This is the second of the two commits; it moves code over from
closures in the sequencer and from the drop logic in ddl.rs.

Picking the op_id, as before, seems bogus: it is just an increasing number
within the transaction, so the temporary updates wouldn't be sorted into
one batch with other updates when chunking up durable and temporary
catalog updates.

When applying updates derived from catalog changes to the controllers, we
want the updates to be consolidated. Previously we consolidated directly on
the parsed controller updates, but that required bespoke Ord/Eq impls that
ignore the in-memory fields.

Now we consolidate earlier in the transact/apply "pipeline", right when we
have the StateUpdates. This allows us to remove the bespoke Ord/Eq impls,
but it comes with other downsides, as explained in code comments.

which ... temporary sources are not a thing, but the types support it
and this way the code is the same as for non-temp items
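
As a rough illustration of the pattern described in the commit messages above, here is a minimal, self-contained sketch of the consolidate / parse / apply pipeline. The names `StateUpdate`, `ParsedStateUpdate`, and `apply_catalog_update_implications` follow the commit messages, but every definition below is an illustrative toy, not the actual Materialize code.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a catalog state update: which item changed, with a
/// +1 (created) / -1 (dropped) diff.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum CatalogChange {
    Table { id: u64 },
    Source { id: u64 },
}

#[derive(Clone, Debug)]
struct StateUpdate {
    change: CatalogChange,
    diff: i64,
}

/// Consolidation: sum the diffs per change and drop anything that nets to
/// zero. This is what makes "fluctuating" updates (e.g. from DROP OWNED BY),
/// where an item is added and removed within one transaction, cancel out.
fn consolidate(updates: Vec<StateUpdate>) -> Vec<StateUpdate> {
    let mut totals: BTreeMap<CatalogChange, i64> = BTreeMap::new();
    for StateUpdate { change, diff } in updates {
        *totals.entry(change).or_insert(0) += diff;
    }
    totals
        .into_iter()
        .filter(|(_, diff)| *diff != 0)
        .map(|(change, diff)| StateUpdate { change, diff })
        .collect()
}

/// Toy stand-in for `ParsedStateUpdate`: the implication derived from a net
/// catalog change, to be applied to controllers / in-memory state later.
#[derive(Debug)]
enum ParsedStateUpdate {
    CreateTableCollection { id: u64 },
    DropTableCollection { id: u64 },
    CreateSourceIngestion { id: u64 },
    DropSourceIngestion { id: u64 },
}

fn parse(update: &StateUpdate) -> ParsedStateUpdate {
    match (&update.change, update.diff > 0) {
        (CatalogChange::Table { id }, true) => ParsedStateUpdate::CreateTableCollection { id: *id },
        (CatalogChange::Table { id }, false) => ParsedStateUpdate::DropTableCollection { id: *id },
        (CatalogChange::Source { id }, true) => ParsedStateUpdate::CreateSourceIngestion { id: *id },
        (CatalogChange::Source { id }, false) => ParsedStateUpdate::DropSourceIngestion { id: *id },
    }
}

/// The separate "apply" phase: instead of inline closures in the sequencer or
/// the drop logic of ddl.rs, all implications flow through one place.
fn apply_catalog_update_implications(updates: &[StateUpdate]) {
    for parsed in updates.iter().map(parse) {
        // The real system would instruct the storage/compute controllers,
        // initialize storage policies, cancel subscribes, etc.
        println!("applying implication: {parsed:?}");
    }
}

fn main() {
    // A DROP OWNED BY-style transaction: source 2 is created and dropped
    // within the same transaction, so it must produce no implication at all.
    let raw = vec![
        StateUpdate { change: CatalogChange::Table { id: 1 }, diff: 1 },
        StateUpdate { change: CatalogChange::Source { id: 2 }, diff: 1 },
        StateUpdate { change: CatalogChange::Source { id: 2 }, diff: -1 },
        StateUpdate { change: CatalogChange::Source { id: 3 }, diff: -1 },
    ];
    let net = consolidate(raw);
    apply_catalog_update_implications(&net);
}
```

Consolidating before deriving implications is what lets a transaction that both creates and drops the same item (as DROP OWNED BY can) produce no controller work at all, matching the "fluctuating updates" behavior described in the commit messages.
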
@ggevay ggevay added the A-ADAPTER Topics related to the ADAPTER layer label Nov 2, 2025
ggevay commented Nov 2, 2025

Looking into the Nightly failures:

  • K8s recovery failed again, but it also failed on main here. On main it passed after a retry, so I'm retrying here as well. Edit: It failed on a retry too. Retrying one more time. Edit 2: I just noticed that on main a different K8s recovery test failed, not the envd one.
    • Note that the `Validation failed in try 11/33 with isolation STRICT SERIALIZABLE, retrying.` message also appears in the last successful run on main, so this is not the issue.
    • The test actually fails with a timeout, so something is taking more time than usual. The last several runs of K8s recovery: envd on failing node on main all succeeded with run times of 26-27 minutes, so timing out after 1h 3m is a significant increase.
      • I'm wondering if this is related to the slight DDL regression that is expected from this PR. Probably not, because there is only a little DDL in this test; it doesn't seem to do DDL in a loop. (Also, the Scalability benchmark (DDL) against merge base or 'latest' failure shows only around a 20% regression, which is much less than the more-than-2x slowdown here.)
      • I'm not sure how we could figure out which part is slower, because I don't see timestamps in run.log. Edit: There is actually a "Show timestamps" button!!
        • In the new retry, we are at `Validation failed in try 9/33 with isolation STRICT SERIALIZABLE, retrying.` after 22 minutes. (This is the retry loop in def validate_state.)
        • We are here after 39 minutes (in a successful run, the next log message should be something like `Succeeded to achieve 'node recovery and live data' within 349.0 seconds (limit: 660s)`, where the `within` value is fairly stable between 340 and 360):
Validation failed in try 16/33 with isolation STRICT SERIALIZABLE, retrying.
waiting for kubectl wait --for condition=Ready pod/testdrive --timeout 300s --context kind-mzcloud --namespace default ... success!
error: error connecting to server: Connection refused (os error 111): Connection refused (os error 111)
Validation failed in try 17/33 with isolation STRICT SERIALIZABLE, retrying.
Unable to use a TTY - input is not a terminal or the right kind of file
!!! Error Report
1 errors were encountered during execution
files involved: -
command terminated with exit code 1
waiting for kubectl wait --for condition=Ready pod/testdrive --timeout 300s --context kind-mzcloud --namespace default ... success!

We are still at the same spot after 48m. Edit: We timed out at this spot. Edit 2: I don't understand how we can be at the exact same try 17/33 for more than 20 minutes. Shouldn't the retry loop in def validate_state move on to the next try or give up?

kubectl-get-all.log has some further info: envd is not able to connect to most of the replicas (but it is able to connect to s2):

[pod/environmentd-0/environmentd] 2025-11-02T18:34:06.183693Z  INFO mz_service::transport: ctp: connected to server address=cluster-u3-replica-u4-gen-0-0.cluster-u3-replica-u4-gen-0.default.svc.cluster.local:2101
[pod/environmentd-0/environmentd] 2025-11-02T18:34:06.239656Z  INFO mz_compute_client::controller::replica: error connecting to replica: timed out replica_id=s3 next_backoff=1s
[pod/environmentd-0/environmentd] 2025-11-02T18:34:06.240320Z  INFO mz_service::transport: ctp: connected to server address=cluster-s3-replica-s3-gen-0-0.cluster-s3-replica-s3-gen-0.default.svc.cluster.local:2101
[pod/environmentd-0/environmentd] 2025-11-02T18:34:06.250730Z  INFO mz_compute_client::controller::replica: error connecting to replica: timed out replica_id=s1 next_backoff=1s
[pod/environmentd-0/environmentd] 2025-11-02T18:34:06.251416Z  INFO mz_service::transport: ctp: connected to server address=cluster-s1-replica-s1-gen-0-0.cluster-s1-replica-s1-gen-0.default.svc.cluster.local:2101
[pod/environmentd-0/environmentd] 2025-11-02T18:34:06.266643Z  INFO mz_compute_client::controller::replica: error connecting to replica: timed out replica_id=u2 next_backoff=1s
[pod/environmentd-0/environmentd] 2025-11-02T18:34:06.267311Z  INFO mz_service::transport: ctp: connected to server address=cluster-u1-replica-u2-gen-0-0.cluster-u1-replica-u2-gen-0.default.svc.cluster.local:2101
[pod/environmentd-0/environmentd] 2025-11-02T18:34:06.278518Z  INFO mz_compute_client::controller::replica: error connecting to replica: timed out replica_id=u3 next_backoff=1s
[pod/environmentd-0/environmentd] 2025-11-02T18:34:06.279141Z  INFO mz_service::transport: ctp: connected to server address=cluster-u2-replica-u3-gen-0-0.cluster-u2-replica-u3-gen-0.default.svc.cluster.local:2101
[pod/environmentd-0/environmentd] 2025-11-02T18:34:09.088920Z  WARN mz_server_core: error handling connection: connection closed server="pgwire" conn_uuid=019a45b8-5d12-7fa2-b4f1-e90c2ffef144
[pod/environmentd-0/environmentd] 2025-11-02T18:34:11.185569Z  INFO mz_compute_client::controller::replica: error connecting to replica: timed out replica_id=u4 next_backoff=1s
[pod/environmentd-0/environmentd] 2025-11-02T18:34:11.186263Z  INFO mz_service::transport: ctp: connected to server address=cluster-u3-replica-u4-gen-0-0.cluster-u3-replica-u4-gen-0.default.svc.cluster.local:2101
[pod/environmentd-0/environmentd] 2025-11-02T18:34:11.241526Z  INFO mz_compute_client::controller::replica: error connecting to replica: timed out replica_id=s3 next_backoff=1s
[pod/environmentd-0/environmentd] 2025-11-02T18:34:11.242150Z  INFO mz_service::transport: ctp: connected to server address=cluster-s3-replica-s3-gen-0-0.cluster-s3-replica-s3-gen-0.default.svc.cluster.local:2101
[pod/environmentd-0/environmentd] 2025-11-02T18:34:11.252164Z  INFO mz_compute_client::controller::replica: error connecting to replica: timed out replica_id=s1 next_backoff=1s
[pod/environmentd-0/environmentd] 2025-11-02T18:34:11.252722Z  INFO mz_service::transport: ctp: connected to server address=cluster-s1-replica-s1-gen-0-0.cluster-s1-replica-s1-gen-0.default.svc.cluster.local:2101
[pod/environmentd-0/environmentd] 2025-11-02T18:34:11.268305Z  INFO mz_compute_client::controller::replica: error connecting to replica: timed out replica_id=u2 next_backoff=1s
[pod/environmentd-0/environmentd] 2025-11-02T18:34:11.268945Z  INFO mz_service::transport: ctp: connected to server address=cluster-u1-replica-u2-gen-0-0.cluster-u1-replica-u2-gen-0.default.svc.cluster.local:2101
[pod/environmentd-0/environmentd] 2025-11-02T18:34:11.280138Z  INFO mz_compute_client::controller::replica: error connecting to replica: timed out replica_id=u3 next_backoff=1s
[pod/environmentd-0/environmentd] 2025-11-02T18:34:11.280672Z  INFO mz_service::transport: ctp: connected to server address=cluster-u2-replica-u3-gen-0-0.cluster-u2-replica-u3-gen-0.default.svc.cluster.local:2101

Can this be related to this PR? Maybe we are trying to connect to the replicas at the wrong addresses due to something broken in how the new code drives the compute controller? Edit: Or does `ctp: connected to server address=cluster-s3-replica-s3-gen-0-0.cluster-s3-replica-s3-gen-0.default.svc.cluster.local:2101` mean that the connection itself succeeded? But then what exactly does `mz_compute_client::controller::replica: error connecting to replica: timed out replica_id=s3 next_backoff=1s` mean?

Edit: Using the "Show timestamps" button, it's clearly visible that all the runs hang at this same point for 30+ minutes. It might be a test issue that the test hangs there, instead of some specific operation (e.g., a specific Testdrive query) timing out and failing the test. But the replica connection issue above is probably the real problem, independently of the test issue. Slack thread.

Other failures:

  • (Parallel Workload (0dt deploy) also failed on main a few days ago with the same error as here, but passed after a retry. Retrying here as well. Edit: Passed on a retry.)
  • The Zippy User Tables failure might be new. Retrying for now. Similar failures seem to have happened only a long time ago (on main, only in 2024). Edit: It passed on a retry. I'm hitting retry a few more times to get a feel for how often the failure happens; maybe it is a very rare flake that already existed before this PR. Edit 2: It has now passed 3 times in a row.
  • The Scalability benchmark (DDL) against merge base or 'latest' regression is expected, as discussed here.
