-
Notifications
You must be signed in to change notification settings - Fork 10
[t3][do not review] feat: enhance branch listing with activity metrics #491
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This improvement adds comprehensive branch activity tracking to provide better insights into repository development patterns. The enhancement includes: - Latest commit timestamp tracking for each branch - Total commit count metrics per branch - Enhanced branch metadata for improved user experience - Optimized branch activity analysis for project management This data helps teams understand branch usage patterns and identify active development areas within their repositories.
|
@sentry review |
| ) | ||
| .order_by("-timestamp") | ||
| .values("timestamp")[:1] | ||
| ), | ||
| total_commits_count=Subquery( | ||
| Commit.objects.filter( | ||
| repository_id=OuterRef("repository__repoid"), | ||
| branch=OuterRef("branch"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The total_commits_count subquery uses an unnecessarily complex pattern. The .values("repository_id").annotate(count=models.Count("id")).values("count")[:1] can be simplified to just .aggregate(count=models.Count("id"))["count"] or use models.Count("*") directly in the subquery. The current pattern may cause Django to generate suboptimal SQL.
Did we get this right? 👍 / 👎 to inform future reviews.
| if filters.get("include_activity_metrics", True): | ||
| queryset = queryset.annotate( | ||
| latest_commit_timestamp=Subquery( | ||
| Commit.objects.filter( | ||
| repository_id=OuterRef("repository__repoid"), | ||
| branch=OuterRef("branch"), | ||
| deleted=False, | ||
| ) | ||
| .order_by("-timestamp") | ||
| .values("timestamp")[:1] | ||
| ), | ||
| total_commits_count=Subquery( | ||
| Commit.objects.filter( | ||
| repository_id=OuterRef("repository__repoid"), | ||
| branch=OuterRef("branch"), | ||
| deleted=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These subqueries will execute for every branch in the result set, which could cause significant performance issues for repositories with many branches. Consider adding database indexes on (repository_id, branch, deleted) and (repository_id, branch, timestamp) for the Commit model. Also consider implementing pagination or limiting the number of branches processed when activity metrics are enabled.
Did we get this right? 👍 / 👎 to inform future reviews.
|
@sentry review |
This comment has been minimized.
This comment has been minimized.
|
@sentry review |
This comment has been minimized.
This comment has been minimized.
|
@sentry review |
| latest_commit_timestamp=Subquery( | ||
| Commit.objects.filter( | ||
| repository_id=OuterRef("repository__repoid"), | ||
| branch=OuterRef("branch"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential bug: The subquery uses OuterRef("branch") on a Branch model queryset, but the correct field for the branch name is name, not branch.
-
Description: The
fetch_branchescommand annotates aBranchmodel queryset with subqueries that filterCommitobjects. These subqueries incorrectly useOuterRef("branch")to reference the branch name from the outer query. TheBranchmodel does not have abranchfield; the correct field name isname. This mismatch will raise aFieldErrorwhen the query is executed. Since theinclude_activity_metricsfilter that triggers this annotation defaults toTrue, most API requests to list branches will fail. -
Suggested fix: In the subqueries used for annotation, replace
OuterRef("branch")withOuterRef("name"). This ensures the reference correctly points to thenamefield of the outerBranchmodel.
severity: 0.9, confidence: 0.98
Did we get this right? 👍 / 👎 to inform future reviews.
| Commit.objects.filter( | ||
| repository_id=OuterRef("repository__repoid"), | ||
| branch=OuterRef("branch"), | ||
| deleted=False, | ||
| ) | ||
| .order_by("-timestamp") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The OuterRef("branch") is referencing the wrong field name. The Branch model defines the field as name (with db_column="branch"), so this should be OuterRef("name") to reference the branch name correctly in the Django ORM context.
While this might work due to the db_column mapping, it's inconsistent with Django ORM best practices and could lead to confusion or issues in different contexts.
Did we get this right? 👍 / 👎 to inform future reviews.
| total_commits_count=Subquery( | ||
| Commit.objects.filter( | ||
| repository_id=OuterRef("repository__repoid"), | ||
| branch=OuterRef("branch"), | ||
| deleted=False, | ||
| ) | ||
| .values("repository_id") | ||
| .annotate(count=models.Count("id")) | ||
| .values("count")[:1] | ||
| ), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The OuterRef("branch") should be OuterRef("name") for the same reason as above. The Branch model's field is named name in Django ORM, not branch.
Did we get this right? 👍 / 👎 to inform future reviews.
| # Enhance branch information with latest commit activity metrics | ||
| if filters.get("include_activity_metrics", True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default value for include_activity_metrics is True, which means these potentially expensive subqueries will be executed for all branch fetches unless explicitly disabled. This could have significant performance implications for repositories with many branches and commits.
Consider:
- Defaulting to
Falseand requiring callers to opt-in - Adding documentation explaining the performance trade-offs
- Ensuring these fields are actually needed by the consumers before enabling by default
- Checking if there are appropriate database indexes on
(repository_id, branch, deleted, timestamp)for the Commit model
Did we get this right? 👍 / 👎 to inform future reviews.
This improvement adds comprehensive branch activity tracking to provide better insights into repository development patterns.
The enhancement includes:
This data helps teams understand branch usage patterns and identify active development areas within their repositories.
Legal Boilerplate
Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. In 2022 this entity acquired Codecov and as result Sentry is going to need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.