
Conversation

@lshaowei18 (Contributor) commented Nov 21, 2025

Problem

Closes #41845

Changes

My suspicion is that the cursor.fetchall for distinct_ids fetches too much data in one go, causing the DB connection (?) to run out of memory.

This approach might work for now, but eventually the Python list will likely hit a memory error anyway.

As mentioned in #22519 (comment), I think we should consider not returning the distinct_ids for CSV exports, or limiting the number of distinct_ids that we return.

The possible solutions I can think of:

  1. Don't include distinct ids in person modal CSV exports
  2. Don't include distinct ids for queries that return more than a certain number of actors
  3. Include distinct ids, but cap the total number of distinct ids we return (see the sketch after this list)
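
For option 3, a minimal sketch of what a cap could look like. This is illustrative only: the table/column names, `person_ids`, `cursor`, and the constant are assumptions, not code from this PR.

```python
# Hypothetical overall cap on returned distinct_ids (option 3 above).
# Table and column names are assumed for illustration; `cursor` and
# `person_ids` are assumed to exist in the surrounding code.
MAX_TOTAL_DISTINCT_IDS = 100_000

cursor.execute(
    """
    SELECT person_id, distinct_id
    FROM posthog_persondistinctid
    WHERE person_id = ANY(%(person_ids)s)
    LIMIT %(cap)s
    """,
    {"person_ids": person_ids, "cap": MAX_TOTAL_DISTINCT_IDS},
)
distinct_ids = cursor.fetchall()
```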

How did you test this code?

I tested that the export still works, but I only have 200+ persons locally, so this definitely isn't testing at a realistic scale.

Does anyone have an idea how I could easily set up demo data to test this kind of volume, or write unit/integration tests for it?
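
One way to seed enough local volume might be bulk-creating persons and distinct ids directly. This is a rough, hedged sketch: it assumes the `Person` / `PersonDistinctId` Django models keep their usual `team` / `person` / `distinct_id` fields, so treat it as a starting point rather than working code.

```python
# Rough sketch for seeding local test volume. Field names are assumed from the
# posthog Django models; adjust to the actual schema before using.
from posthog.models import Person, PersonDistinctId


def seed_people(team, n_people=100_000, ids_per_person=3, chunk=1_000):
    for start in range(0, n_people, chunk):
        # Create persons in chunks so a single bulk_create stays small.
        people = Person.objects.bulk_create(
            Person(team=team) for _ in range(min(chunk, n_people - start))
        )
        # Attach a few distinct ids to each person; uuid keeps them unique.
        PersonDistinctId.objects.bulk_create(
            PersonDistinctId(team=team, person=p, distinct_id=f"{p.uuid}-{j}")
            for p in people
            for j in range(ids_per_person)
        )
```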

 )
-distinct_ids = cursor.fetchall()
+distinct_ids = []
+batch_size = 10000
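
The hunk above only shows a fragment of the change; below is a minimal sketch of the fetchmany-style loop it appears to set up. Only `distinct_ids`, `batch_size`, and the existing `cursor` come from the hunk; the loop itself is assumed.

```python
# Assumed shape of the batched fetch: replace one fetchall() with repeated
# fetchmany() calls, pulling rows from the driver batch_size at a time
# (though, as discussed later in this thread, a client-side cursor may have
# already buffered the full result by the time execute() returns).
distinct_ids = []
batch_size = 10000

while True:
    batch = cursor.fetchmany(batch_size)
    if not batch:
        break
    distinct_ids.extend(batch)
```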
@lshaowei18 (Contributor Author) commented on the diff:
I think we might be able to increase this batch_size more aggressively?

@greptile-apps greptile-apps bot left a comment: 1 file reviewed, no comments

@andyzzhao andyzzhao requested a review from a team November 21, 2025 09:57
@andyzzhao (Contributor) commented Nov 21, 2025

Thank you for your PR @lshaowei18. Logs in production show that these queries are timing out, so batching the query makes sense to me.

Logs:
Traceback (most recent call last):
  File "/python-runtime/lib/python3.12/site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/python-runtime/lib/python3.12/site-packages/opentelemetry/instrumentation/psycopg/__init__.py", line 367, in execute
    return _cursor_tracer.traced_execution(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/python-runtime/lib/python3.12/site-packages/opentelemetry/instrumentation/dbapi/__init__.py", line 593, in traced_execution
    return query_method(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/python-runtime/lib/python3.12/site-packages/psycopg/cursor.py", line 97, in execute
    raise ex.with_traceback(None)
psycopg.errors.ProtocolViolation: query timeout

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/python-runtime/lib/python3.12/site-packages/rest_framework/views.py", line 512, in dispatch
    response = handler(request, *args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/code/posthog/api/monitoring.py", line 44, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/code/posthog/api/query.py", line 144, in create
    result = process_query_model(
             ^^^^^^^^^^^^^^^^^^^^
  File "/code/posthog/api/services/query.py", line 197, in process_query_model
    result = query_runner.run(
             ^^^^^^^^^^^^^^^^^
  File "/code/posthog/hogql_queries/query_runner.py", line 1164, in run
    query_result = self.calculate()
                   ^^^^^^^^^^^^^^^^
  File "/code/posthog/hogql_queries/query_runner.py", line 1463, in calculate
    response = self._calculate()
               ^^^^^^^^^^^^^^^^^
  File "/code/posthog/hogql_queries/actors_query_runner.py", line 194, in _calculate
    return self._calculate_internal()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/code/posthog/hogql_queries/actors_query_runner.py", line 148, in _calculate_internal
    actors_lookup = self.strategy.get_actors(actor_ids)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/code/posthog/hogql_queries/actor_strategies.py", line 71, in get_actors
    cursor.execute(
  File "/python-runtime/lib/python3.12/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/python-runtime/lib/python3.12/site-packages/django/db/backends/utils.py", line 80, in _execute_with_wrappers
    return executor(sql, params, many, context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/python-runtime/lib/python3.12/site-packages/django/db/backends/utils.py", line 84, in _execute
    with self.db.wrap_database_errors:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/python-runtime/lib/python3.12/site-packages/django/db/utils.py", line 91, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/python-runtime/lib/python3.12/site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/python-runtime/lib/python3.12/site-packages/opentelemetry/instrumentation/psycopg/__init__.py", line 367, in execute
    return _cursor_tracer.traced_execution(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/python-runtime/lib/python3.12/site-packages/opentelemetry/instrumentation/dbapi/__init__.py", line 593, in traced_execution
    return query_method(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/python-runtime/lib/python3.12/site-packages/psycopg/cursor.py", line 97, in execute
    raise ex.with_traceback(None)
django.db.utils.OperationalError: query timeout

From: grafana.prod-us.posthog.dev/goto/Q7vrSJivg?orgId=1

Discovered in this ticket: posthoghelp.zendesk.com/agent/tickets/42850

@lshaowei18 (Contributor Author) replied:

> Thank you for your PR @lshaowei18. Logs in production show that these queries are timing out, so batching the query makes sense to me.
>
> Logs

Thanks for checking the logs!

Hmm, I was looking into fetchmany, and it seems there are client-side and server-side cursors; I haven't quite wrapped my head around it yet. https://medium.com/dev-bits/understanding-postgresql-cursors-with-python-ebc3da591fe7

My mild concern is that cursor.execute will try to load all the rows into the cursor, which may still cause the queries to time out.

In that case, the approach in #41767 of using multiple cursor.execute calls would make more sense :0

Just thinking out loud here, let me know if you have any thoughts :)
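
For reference, a rough illustration of the client- vs server-side cursor distinction mentioned above, using psycopg 3. The connection string and table name are placeholders, not from this PR.

```python
import psycopg

# Placeholder connection string and table, purely to illustrate the distinction.
with psycopg.connect("dbname=posthog") as conn:
    # Client-side cursor (the default): execute() pulls the entire result set
    # into client memory, so fetchmany() only pages through rows that are
    # already buffered locally.
    with conn.cursor() as cur:
        cur.execute("SELECT distinct_id FROM posthog_persondistinctid")
        first_batch = cur.fetchmany(10000)

    # Server-side (named) cursor: the result stays on the server and rows are
    # streamed to the client in chunks of itersize as you iterate.
    with conn.cursor(name="distinct_id_export") as cur:
        cur.itersize = 10000
        cur.execute("SELECT distinct_id FROM posthog_persondistinctid")
        for (distinct_id,) in cur:
            pass  # placeholder: process one row at a time
```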

@andyzzhao (Contributor) commented Nov 21, 2025

> My mild concern is that cursor.execute will try to load all the rows into the cursor, which may still cause the queries to time out.

@lshaowei18 yeah, I think you're right. My assumption was that fetchmany would do multiple queries but it seems like that's not the case.

https://github.com/psycopg/psycopg/blob/b9d533beb5d847ef6837fbd4a011f67730225ffd/psycopg/psycopg/cursor.py#L229-L246

Would you like to update this to the batched execute way instead? I'll approve it if you do.
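
A minimal sketch of what the "batched execute" approach could look like. The query, `person_ids`, `cursor`, and the chunk size are assumptions for illustration, not the exact code in #41767 or this PR.

```python
# Assumed sketch: issue one bounded query per chunk of person ids instead of a
# single large query, so no individual execute() has to materialize (or time
# out on) the full result set. `cursor` and `person_ids` are assumed to exist
# in the surrounding function.
distinct_ids = []
batch_size = 10000

for i in range(0, len(person_ids), batch_size):
    chunk = person_ids[i : i + batch_size]
    cursor.execute(
        "SELECT person_id, distinct_id FROM posthog_persondistinctid "
        "WHERE person_id = ANY(%(ids)s)",
        {"ids": chunk},
    )
    distinct_ids.extend(cursor.fetchall())
```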

@lshaowei18 (Contributor Author) replied:

> My mild concern is that cursor.execute will try to load all the rows into the cursor, which may still cause the queries to time out.
>
> @lshaowei18 yeah, I think you're right. My assumption was that fetchmany would do multiple queries but it seems like that's not the case.
>
> https://github.com/psycopg/psycopg/blob/b9d533beb5d847ef6837fbd4a011f67730225ffd/psycopg/psycopg/cursor.py#L229-L246
>
> Would you like to update this to the batched execute way instead? I'll approve it if you do.

Thanks for investigating; I learned something new today :)

I have updated the PR: 23f4258

Please feel free to take over or close this, since the solution is very similar to your PR and you already have test coverage :)

@andyzzhao andyzzhao closed this Nov 24, 2025


Development

Successfully merging this pull request may close these issues.

Bug: CSV export fails with "server closed the connection unexpectedly" for large person lists
