-
-
Notifications
You must be signed in to change notification settings - Fork 62
feat(deletes): support item attribute conditions in API and consumer #7537
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
feat(deletes): support item attribute conditions in API and consumer #7537
Conversation
…-attribute-conditions-in-consumer
❌ 1 Tests Failed:
View the top 1 failed test(s) by shortest run time
To view more test analytics, go to the Test Analytics Dashboard |
| def _execute_query( | ||
| query: Query, | ||
| storage: WritableTableStorage, | ||
| table: str, | ||
| cluster_name: Optional[str], | ||
| attribution_info: AttributionInfo, | ||
| query_settings: HTTPQuerySettings, | ||
| ) -> Result: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method (and dependencies) should probably move out of snuba/web/delete_query.py since we no longer want to support the synchronous delete from an endpoint (everything should flow through Kafka and the LW deletion consumer). But it can be follow-up work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, agree on that
| # Number of attribute buckets used in eap_items storage | ||
| # TODO: find a way to wire this in from a canonical source | ||
| NUM_ATTRIBUTE_BUCKETS = 40 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: EAPItemsFormatter uses a hardcoded NUM_ATTRIBUTE_BUCKETS, causing incorrect bucket indexing if configuration changes, leading to silent delete failures.
Severity: CRITICAL | Confidence: 1.00
🔍 Detailed Analysis
The EAPItemsFormatter uses a hardcoded NUM_ATTRIBUTE_BUCKETS = 40 for calculating attribute bucket indices. If the num_attribute_buckets configuration changes (e.g., to 64), but the hardcoded value is not updated, delete queries will incorrectly calculate bucket indices. For example, an attribute hashing to 45 would target attributes_string_5 instead of attributes_string_45, leading to silent failures where data is not deleted as intended, causing data inconsistency.
💡 Suggested Fix
Parameterize EAPItemsFormatter to receive num_attribute_buckets as a constructor argument, injecting the value from the canonical storage configuration, similar to other transformers.
🤖 Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: snuba/lw_deletions/formatters.py#L59-L61
Potential issue: The `EAPItemsFormatter` uses a hardcoded `NUM_ATTRIBUTE_BUCKETS = 40`
for calculating attribute bucket indices. If the `num_attribute_buckets` configuration
changes (e.g., to 64), but the hardcoded value is not updated, delete queries will
incorrectly calculate bucket indices. For example, an attribute hashing to 45 would
target `attributes_string_5` instead of `attributes_string_45`, leading to silent
failures where data is not deleted as intended, causing data inconsistency.
Did we get this right? 👍 / 👎 to inform future reviews.
Reference_id: 2749980
MeredithAnya
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this seems good to me, but Im wondering what happens if a request has an attr value that doesn't match the type, like {"group_id": ["hello"]} ?
Also, do we need to be updating https://github.com/getsentry/sentry-kafka-schemas/blob/main/schemas/snuba-lw-deletions-eap-items.v1.json ?
|
|
||
| # For each attribute, determine its type and bucket (if applicable) | ||
| for attr_name, attr_values in attribute_conditions.items(): | ||
| if not attr_values: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems like a case that shouldn't happen? should we be logging something if it does? or is this a valid case and im just missing something
This PR enables (gated behind a runtime-config) deletions of EAP items by attributes.
Changes:
attributes_string_x['key_name']DeleteQueryMessagehas two new fields:attribute_conditionsandattribute_conditions_item_typepermit_delete_by_attribute, defaulting to offEnd-to-end tested locally, verification walk-through: https://www.notion.so/sentry/e2e-attribute-testing-2ab8b10e4b5d80cea5cfdc255cff5305