Skip to content

Conversation

@phacops
Copy link
Contributor

@phacops phacops commented Nov 19, 2025

We have instances where we want to allow for a bigger query size. This will add 2 settings to customize if we enforce query size for a given cluster or not:

  • ExecutionStage.disable_max_query_size_check_for_clusters: disable checking for max query size, leaving it to ClickHouse to reject the query
  • ExecutionStage.max_query_size_bytes: customize the global query size limit Snuba enforces

@phacops phacops requested a review from a team as a code owner November 19, 2025 22:26
Comment on lines 228 to 230
if (
not cluster_name or cluster_name not in disable_max_query_size_check_for_clusters
) and query_size_bytes > _max_query_size_bytes():
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Substring matching used instead of list membership for disable_max_query_size_check_for_clusters config.
Severity: CRITICAL | Confidence: High

🔍 Detailed Analysis

The in operator on line 229 performs Python substring matching instead of checking for membership in a list of cluster names. If disable_max_query_size_check_for_clusters is configured as a comma-separated string (e.g., "test_cluster,prod_cluster"), a cluster name like "test" will incorrectly match as a substring, leading to the max query size check being disabled when it should not be. This bypasses a critical safety mechanism.

💡 Suggested Fix

Split the disable_max_query_size_check_for_clusters string by comma into a list of cluster names, then check for cluster_name membership in that list.

🤖 Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: snuba/pipeline/stages/query_execution.py#L228-L230

Potential issue: The `in` operator on line 229 performs Python substring matching
instead of checking for membership in a list of cluster names. If
`disable_max_query_size_check_for_clusters` is configured as a comma-separated string
(e.g., "test_cluster,prod_cluster"), a cluster name like "test" will incorrectly match
as a substring, leading to the max query size check being disabled when it should not
be. This bypasses a critical safety mechanism.

Did we get this right? 👍 / 👎 to inform future reviews.
Reference_id: 2826178

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change should take a couple minutes and is worth making

Comment on lines +166 to +170
def _max_query_size_bytes() -> int:
return (
state.get_int_config(MAX_QUERY_SIZE_BYTES_CONFIG, MAX_QUERY_SIZE_BYTES)
or MAX_QUERY_SIZE_BYTES
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: _max_query_size_bytes() incorrectly replaces 0 config value with default due to falsy evaluation.
Severity: HIGH | Confidence: High

🔍 Detailed Analysis

In _max_query_size_bytes(), when MAX_QUERY_SIZE_BYTES_CONFIG is explicitly set to 0, state.get_int_config() correctly returns 0. However, the subsequent or MAX_QUERY_SIZE_BYTES clause evaluates 0 as falsy, causing the default MAX_QUERY_SIZE_BYTES to be used instead of the configured 0. This prevents 0 from being a valid configuration value, limiting flexibility.

💡 Suggested Fix

Replace the or operator with an explicit is not None check to correctly handle 0 as a valid configuration value. For example, return config_value if config_value is not None else MAX_QUERY_SIZE_BYTES.

🤖 Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: snuba/pipeline/stages/query_execution.py#L166-L170

Potential issue: In `_max_query_size_bytes()`, when `MAX_QUERY_SIZE_BYTES_CONFIG` is
explicitly set to `0`, `state.get_int_config()` correctly returns `0`. However, the
subsequent `or MAX_QUERY_SIZE_BYTES` clause evaluates `0` as falsy, causing the default
`MAX_QUERY_SIZE_BYTES` to be used instead of the configured `0`. This prevents `0` from
being a valid configuration value, limiting flexibility.

Did we get this right? 👍 / 👎 to inform future reviews.
Reference_id: 2826178

@codecov
Copy link

codecov bot commented Nov 19, 2025

❌ 1 Tests Failed:

Tests completed Failed Passed Skipped
1130 1 1129 8
View the top 1 failed test(s) by shortest run time
tests.pipeline.test_execution_stage::test_disable_max_query_size_check
Stack Traces | 0.648s run time
Traceback (most recent call last):
  File "/.venv/lib/python3.11........./site-packages/_pytest/runner.py", line 341, in from_call
    result: TResult | None = func()
                             ^^^^^^
  File "/.venv/lib/python3.11........./site-packages/_pytest/runner.py", line 242, in <lambda>
    lambda: runtest_hook(item=item, **kwds), when=when, reraise=reraise
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11....../site-packages/pluggy/_hooks.py", line 513, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11....../site-packages/pluggy/_manager.py", line 120, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 182, in _multicall
    return outcome.get_result()
           ^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11.../site-packages/pluggy/_result.py", line 100, in get_result
    raise exc.with_traceback(exc.__traceback__)
  File "/.venv/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 167, in _multicall
    teardown.throw(outcome._exception)
  File "/.venv/lib/python3.11....../site-packages/_pytest/threadexception.py", line 92, in pytest_runtest_call
    yield from thread_exception_runtest_hook()
  File "/.venv/lib/python3.11....../site-packages/_pytest/threadexception.py", line 68, in thread_exception_runtest_hook
    yield
  File "/.venv/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 167, in _multicall
    teardown.throw(outcome._exception)
  File "/.venv/lib/python3.11....../site-packages/_pytest/unraisableexception.py", line 95, in pytest_runtest_call
    yield from unraisable_exception_runtest_hook()
  File "/.venv/lib/python3.11....../site-packages/_pytest/unraisableexception.py", line 70, in unraisable_exception_runtest_hook
    yield
  File "/.venv/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 167, in _multicall
    teardown.throw(outcome._exception)
  File "/.venv/lib/python3.11....../site-packages/_pytest/logging.py", line 846, in pytest_runtest_call
    yield from self._runtest_for(item, "call")
  File "/.venv/lib/python3.11....../site-packages/_pytest/logging.py", line 829, in _runtest_for
    yield
  File "/.venv/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 167, in _multicall
    teardown.throw(outcome._exception)
  File "/.venv/lib/python3.11.../site-packages/_pytest/capture.py", line 880, in pytest_runtest_call
    return (yield)
            ^^^^^
  File "/.venv/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 167, in _multicall
    teardown.throw(outcome._exception)
  File "/.venv/lib/python3.11.../site-packages/_pytest/skipping.py", line 257, in pytest_runtest_call
    return (yield)
            ^^^^^
  File "/.venv/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 103, in _multicall
    res = hook_impl.function(*args)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11........./site-packages/_pytest/runner.py", line 174, in pytest_runtest_call
    item.runtest()
  File "/.venv/lib/python3.11....../site-packages/_pytest/python.py", line 1627, in runtest
    self.ihook.pytest_pyfunc_call(pyfuncitem=self)
  File "/.venv/lib/python3.11....../site-packages/pluggy/_hooks.py", line 513, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11....../site-packages/pluggy/_manager.py", line 120, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 139, in _multicall
    raise exception.with_traceback(exception.__traceback__)
  File "/.venv/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 103, in _multicall
    res = hook_impl.function(*args)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11....../site-packages/_pytest/python.py", line 159, in pytest_pyfunc_call
    result = testfunction(**testargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../tests/pipeline/test_execution_stage.py", line 279, in test_disable_max_query_size_check
    assert res.data
AssertionError: assert None
 +  where None = QueryPipelineResult(data=None, error={\n  "__type__": "SerializableException",\n  "__name__": "QueryException",\n  "__message__": "After processing, query is 177 bytes, which is too long for ClickHouse to process. Max size is 1 bytes.",\n  "__should_report__": true,\n  "__extra_data__": {\n    "extra": {\n      "stats": {\n        "clickhouse_table": "transactions_local",\n        "final": false,\n        "referrer": "blah",\n        "sample": null,\n        "cluster_name": "cluster_one_sh"\n      },\n      "sql": "SELECT (avg((duration AS _snuba_duration)) AS `_snuba_avg(duration)`) FROM transactions_local WHERE equals(transaction_name, 'dog') AND equals(project_id, 1) LIMIT 1000 OFFSET 0",\n      "experiments": {}\n    }\n  }\n}, query_settings=<snuba.query.query_settings.HTTPQuerySettings object at 0x7f3436d0aad0>, timer=<snuba.utils.metrics.timer.Timer object at 0x7f34355156d0>).data

To view more test analytics, go to the Test Analytics Dashboard
📋 Got 3 mins? Take this short survey to help us improve Test Analytics.

Copy link
Contributor

@onewland onewland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly fine but I agree with the AI that you should have some delimiter for the clusters where size checking is disabled and look for exact matches

Comment on lines 228 to 230
if (
not cluster_name or cluster_name not in disable_max_query_size_check_for_clusters
) and query_size_bytes > _max_query_size_bytes():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change should take a couple minutes and is worth making

@phacops phacops requested a review from onewland November 20, 2025 19:31
@phacops phacops enabled auto-merge (squash) November 20, 2025 19:35
@phacops phacops merged commit 9654ebf into master Nov 20, 2025
33 checks passed
@phacops phacops deleted the pierre/eap-add-runtime-config-for-query-size branch November 20, 2025 20:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants