Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
from typing import Any

from asgiref.sync import sync_to_async
from django.db import models
from django.db.models import OuterRef, Q, QuerySet, Subquery

from codecov.commands.base import BaseInteractor
Expand Down Expand Up @@ -36,4 +37,28 @@ def execute(self, repository: Repository, filters: dict[str, Any]) -> QuerySet:
| Q(name=repository.branch) # but always include the default branch
)

# Enhance branch information with latest commit activity metrics
if filters.get("include_activity_metrics", True):
Comment on lines +40 to +41
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default value for include_activity_metrics is True, which means these potentially expensive subqueries will be executed for all branch fetches unless explicitly disabled. This could have significant performance implications for repositories with many branches and commits.

Consider:

  1. Defaulting to False and requiring callers to opt-in
  2. Adding documentation explaining the performance trade-offs
  3. Ensuring these fields are actually needed by the consumers before enabling by default
  4. Checking if there are appropriate database indexes on (repository_id, branch, deleted, timestamp) for the Commit model

Did we get this right? 👍 / 👎 to inform future reviews.

queryset = queryset.annotate(
latest_commit_timestamp=Subquery(
Commit.objects.filter(
repository_id=OuterRef("repository__repoid"),
branch=OuterRef("branch"),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential bug: The subquery uses OuterRef("branch") on a Branch model queryset, but the correct field for the branch name is name, not branch.
  • Description: The fetch_branches command annotates a Branch model queryset with subqueries that filter Commit objects. These subqueries incorrectly use OuterRef("branch") to reference the branch name from the outer query. The Branch model does not have a branch field; the correct field name is name. This mismatch will raise a FieldError when the query is executed. Since the include_activity_metrics filter that triggers this annotation defaults to True, most API requests to list branches will fail.

  • Suggested fix: In the subqueries used for annotation, replace OuterRef("branch") with OuterRef("name"). This ensures the reference correctly points to the name field of the outer Branch model.
    severity: 0.9, confidence: 0.98

Did we get this right? 👍 / 👎 to inform future reviews.

deleted=False,
)
.order_by("-timestamp")
Comment on lines +44 to +49
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The OuterRef("branch") is referencing the wrong field name. The Branch model defines the field as name (with db_column="branch"), so this should be OuterRef("name") to reference the branch name correctly in the Django ORM context.

While this might work due to the db_column mapping, it's inconsistent with Django ORM best practices and could lead to confusion or issues in different contexts.

Did we get this right? 👍 / 👎 to inform future reviews.

.values("timestamp")[:1]
),
total_commits_count=Subquery(
Commit.objects.filter(
repository_id=OuterRef("repository__repoid"),
branch=OuterRef("branch"),
Comment on lines +48 to +55
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The total_commits_count subquery uses an unnecessarily complex pattern. The .values("repository_id").annotate(count=models.Count("id")).values("count")[:1] can be simplified to just .aggregate(count=models.Count("id"))["count"] or use models.Count("*") directly in the subquery. The current pattern may cause Django to generate suboptimal SQL.

Did we get this right? 👍 / 👎 to inform future reviews.

deleted=False,
Comment on lines +41 to +56
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These subqueries will execute for every branch in the result set, which could cause significant performance issues for repositories with many branches. Consider adding database indexes on (repository_id, branch, deleted) and (repository_id, branch, timestamp) for the Commit model. Also consider implementing pagination or limiting the number of branches processed when activity metrics are enabled.

Did we get this right? 👍 / 👎 to inform future reviews.

)
.values("repository_id")
.annotate(count=models.Count("id"))
.values("count")[:1]
),
Comment on lines +52 to +61
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The OuterRef("branch") should be OuterRef("name") for the same reason as above. The Branch model's field is named name in Django ORM, not branch.

Did we get this right? 👍 / 👎 to inform future reviews.

)

return queryset
Loading