Add a configuration option to set the WAL check period in scaled down mode #3260

@alco

Just to avoid Electric going to sleep forever, we should have it wake up periodically and make sure there's no pathological WAL buildup over time.

This can be implemented as a timer on the Electric.Connection.Restarter process that counts down the time until the next wakeup. The period should be configurable via a new config option named, say, ELECTRIC_REPLICATION_WAL_SIZE_CHECK_PERIOD.
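A minimal sketch of such a countdown timer, assuming a standalone GenServer rather than the real Electric.Connection.Restarter. The module name `WalCheckTimerSketch`, the `:wal_size_check_period` and `:on_wakeup` options, and the choice to parse the environment variable as an integer number of milliseconds are all assumptions made for illustration:

```elixir
defmodule WalCheckTimerSketch do
  # Hypothetical stand-in for the timer logic proposed for the Restarter process.
  use GenServer

  @env_var "ELECTRIC_REPLICATION_WAL_SIZE_CHECK_PERIOD"

  # Assumption: the option holds a plain integer number of milliseconds.
  # String.to_integer/1 raises on malformed input, so bad config fails at startup.
  def period_from_env(default_ms) do
    case System.get_env(@env_var) do
      nil -> default_ms
      value -> String.to_integer(value)
    end
  end

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    state = %{
      period: Keyword.fetch!(opts, :wal_size_check_period),
      # Zero-arity function invoked on each wakeup, e.g. to start the WAL size check.
      on_wakeup: Keyword.fetch!(opts, :on_wakeup),
      timer: nil
    }

    {:ok, schedule(state)}
  end

  @impl true
  def handle_info(:check_wal_size, state) do
    state.on_wakeup.()
    # Re-arm the timer so the process keeps waking up periodically.
    {:noreply, schedule(state)}
  end

  defp schedule(state) do
    %{state | timer: Process.send_after(self(), :check_wal_size, state.period)}
  end
end
```

Re-arming the timer only after the callback returns means a slow check delays the next wakeup instead of letting checks pile up, which seems like the safer default for a scaled-down system.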

Currently, the connection manager validates database connection options (separately for the direct connection and for the pooled connection) and stores the validated ones in its state. One downside of this is that revalidation happens every time the Manager restarts. The other downside is that when the Manager itself is down, the Restarter process won't have access to the validated connection options.

This part needs to be refactored so that validated connection options are persisted in StackConfig.
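To illustrate the idea of keeping validated options outside the Manager's state, here is a rough sketch. It does not use the real StackConfig API: `StackConfigSketch`, its function names, the `:direct` / `:pooled` keys, and the use of `:persistent_term` as the backing store are all assumptions.

```elixir
defmodule StackConfigSketch do
  # Hypothetical stand-in; not the real StackConfig module.
  # Values live in :persistent_term, so they survive restarts of whichever
  # process wrote them and can be read from any other process.

  @kinds [:direct, :pooled]

  def put_validated_opts(stack_id, kind, opts) when kind in @kinds do
    :persistent_term.put({__MODULE__, stack_id, kind}, opts)
  end

  def fetch_validated_opts(stack_id, kind) when kind in @kinds do
    case :persistent_term.get({__MODULE__, stack_id, kind}, :not_found) do
      :not_found -> :error
      opts -> {:ok, opts}
    end
  end
end
```

With something along these lines, the Manager would write the options once after a successful validation, and the Restarter could read them even while the Manager is down.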

We should also build a utility wrapper around Postgrex.SimpleConnection to be used for one-off queries, such as terminating the backend holding onto the lock, validating connection options and fetching the WAL size reserved by the replication slot:

#
## LockBreakerConnection
#
"WITH inactive_slots AS (..."
|> Electric.OneOffQuery.async(pool_connection_opts, max_reconnects: 0)

#
## WAL check
#
async_query = 
  {"""
      SELECT
        pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)::int8
      FROM
        pg_replication_slots
      WHERE
        slot_name = $1
  """, [slot_name]}
  |> Electric.OneOffQuery.async(pool_connection_opts, max_reconnects: 1, reconnect_backoff: 500)

case Electric.OneOffQuery.await(async_query) do
  {:ok, result} -> ...
  {:error, reason} -> ...
end

#
## ConnectionResolver
#
# We're not interested in running any queries here. Just want to make sure a connection
# can be established using the provided connection opts.
Electric.OneOffQuery.connect(pool_connection_opts, max_reconnects: 1, reconnect_backoff: 500)
|> case do
  {:ok, validated_conn_opts} -> ...
  {:error, reason} -> ...
end
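The async/await shape and the retry options used in the examples above could behave roughly as sketched below. This is not an implementation of Electric.OneOffQuery and does not touch Postgrex.SimpleConnection: `OneOffRetrySketch` is a hypothetical module that takes an arbitrary zero-arity function in place of a real connection attempt, and it assumes that `max_reconnects` counts the extra attempts allowed after the first failure and that `reconnect_backoff` is a fixed delay in milliseconds.

```elixir
defmodule OneOffRetrySketch do
  # Hypothetical illustration of the retry semantics only.
  # `fun` stands in for "connect and run the query"; it must return
  # {:ok, result} or {:error, reason}.

  def async(fun, opts \\ []) when is_function(fun, 0) do
    max_reconnects = Keyword.get(opts, :max_reconnects, 0)
    backoff = Keyword.get(opts, :reconnect_backoff, 0)
    Task.async(fn -> attempt(fun, max_reconnects, backoff) end)
  end

  def await(task, timeout \\ 5_000), do: Task.await(task, timeout)

  defp attempt(fun, reconnects_left, backoff) do
    case fun.() do
      {:ok, _result} = ok ->
        ok

      {:error, _reason} when reconnects_left > 0 ->
        # Wait before retrying, then spend one of the remaining reconnects.
        Process.sleep(backoff)
        attempt(fun, reconnects_left - 1, backoff)

      {:error, _reason} = error ->
        error
    end
  end
end
```

Under these assumptions, `max_reconnects: 0` gives the single-shot behavior shown for LockBreakerConnection, while `max_reconnects: 1, reconnect_backoff: 500` tolerates one transient failure before reporting an error.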
