Skip to content

Conversation

@GowthamShanmugam
Copy link
Contributor

@GowthamShanmugam GowthamShanmugam commented Oct 31, 2025

Description

https://issues.redhat.com/browse/RHOAIENG-37620

How Has This Been Tested?

Screenshot or short clip

Merge criteria

  • You have read the contributors guide.
  • Commit messages are meaningful - have a clear and concise summary and detailed explanation of what was changed and why.
  • Pull Request contains a description of the solution, a link to the JIRA issue, and to any dependent or related Pull Request.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work
  • The developer has run the integration test pipeline and verified that it passed successfully

E2E test suite update requirement

When bringing new changes to the operator code, such changes are by default required to be accompanied by extending and/or updating the E2E test suite accordingly.

To opt-out of this requirement:

  1. Please inspect the opt-out guidelines, to determine if the nature of the PR changes allows for skipping this requirement
  2. If opt-out is applicable, provide justification in the dedicated E2E update requirement opt-out justification section below
  3. Check the checkbox below:
  • Skip requirement to update E2E test suite for this PR
  1. Submit/save these changes to the PR description. This will automatically trigger the check.

E2E update requirement opt-out justification

gateway e2e PR: #2753

Summary by CodeRabbit

  • New Features

    • Added support for custom subdomain configuration in Gateway settings. Users can now optionally specify a subdomain when configuring a gateway to customize the gateway's domain name, with built-in validation for proper DNS naming conventions.
  • Documentation

    • Updated API documentation to reflect the new optional Subdomain configuration field in GatewayConfig.

@coderabbitai
Copy link

coderabbitai bot commented Oct 31, 2025

Walkthrough

The pull request adds an optional Subdomain field to GatewayConfigSpec with DNS validation rules. The domain resolution logic in gateway_support.go is updated to thread subdomain through functions like buildGatewayDomain, getClusterDomain, and resolveDomain. Documentation and test coverage are updated accordingly.

Changes

Cohort / File(s) Summary
API Type Definition
api/services/v1alpha1/gateway_types.go
Added optional Subdomain field to GatewayConfigSpec with kubebuilder validations (MaxLength=63, DNS-compatible pattern).
Documentation
docs/api-overview.md
Updated GatewayConfigSpec specification table to include new subdomain field documentation.
Domain Resolution Logic
internal/controller/services/gateway/gateway_support.go
Refactored domain construction functions to accept and propagate subdomain: buildGatewayDomain(subdomain, baseDomain), updated getClusterDomain(ctx, client, subdomain), and threaded subdomain through resolveDomain() and GetGatewayDomain() entry point.
Test Coverage
internal/controller/services/gateway/gateway_support_test.go
Added TestResolveDomainWithSubdomain and TestBuildGatewayDomainWithSubdomain to validate subdomain-aware domain resolution with custom, empty, and whitespace inputs.
Test Utilities
internal/controller/services/gateway/gateway_util_test.go
Introduced helper function to configure GatewayConfig with optional subdomain for test scenarios.

Sequence Diagram

sequenceDiagram
    participant GetGatewayDomain
    participant resolveDomain
    participant buildGatewayDomain
    participant getClusterDomain

    GetGatewayDomain->>GetGatewayDomain: Extract subdomain from GatewayConfig<br/>(or empty if nil)
    alt GatewayConfig exists with Domain override
        GetGatewayDomain->>resolveDomain: subdomain, gatewayConfig (has Domain)
        resolveDomain->>buildGatewayDomain: subdomain, baseDomain
        buildGatewayDomain-->>resolveDomain: hostname.baseDomain
        resolveDomain-->>GetGatewayDomain: resolved domain
    else No Domain override
        GetGatewayDomain->>resolveDomain: subdomain, gatewayConfig (no Domain)
        resolveDomain->>getClusterDomain: subdomain
        getClusterDomain->>buildGatewayDomain: subdomain, clusterBaseDomain
        buildGatewayDomain-->>getClusterDomain: hostname.clusterBaseDomain
        getClusterDomain-->>resolveDomain: cluster domain
        resolveDomain-->>GetGatewayDomain: resolved domain
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Function signature changes: buildGatewayDomain and getClusterDomain signatures updated to accept subdomain parameter; verify all callers are updated
  • Subdomain fallback logic: Confirm that empty/whitespace subdomains correctly fall back to DefaultGatewayName and that trimming behavior is applied consistently
  • Domain construction: Verify hostname.baseDomain construction is correct when subdomain is provided vs. omitted
  • Test coverage: Ensure new tests comprehensively cover edge cases (empty, whitespace, special characters, nil GatewayConfig)
  • Backward compatibility: Verify that existing domain resolution paths without explicit subdomain still function as before

Poem

🐰 A subdomain hops through functions bright,
Threading DNS names left and right,
With validation rules so tidy and neat,
This config's now perfectly petite!
Hop-hop-hop—the gateway's complete! 🌐

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Title check ⚠️ Warning The title refers to adding a 'short name' field to gatewayconfig CRD, but the actual changes implement a 'Subdomain' field, not a 'short name' field. This is a mismatch between the stated title and the actual implementation. Update the PR title to accurately reflect the changes, such as 'RHOAIENG-37620: Add subdomain field to GatewayConfig CRD' to align with the actual implementation.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Contributor

This PR can't be merged just yet 😢

Please run make generate manifests api-docs and commit the changes.

For more info: https://github.com/opendatahub-io/opendatahub-operator/actions/runs/18968934882

@GowthamShanmugam GowthamShanmugam changed the title the gatewayconfig CRD needs a "short name" field to override RHOAIENG-37620: the gatewayconfig CRD needs a "short name" field to override Oct 31, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/controller/services/gateway/gateway_support.go (1)

195-204: Propagate unexpected GatewayConfig read failures.

GetGatewayDomain now treats every client.Get error as “config absent” and proceeds with defaults. With a 403 or a transient API outage we’ll fabricate a new hostname, breaking TLS/route alignment and hiding the real problem. Only fall back on NotFound—everything else should bubble so reconciliation can retry and operators see the failure.

@@
-	err := cli.Get(ctx, client.ObjectKey{Name: serviceApi.GatewayInstanceName}, gatewayConfig)
-	if err != nil {
-		// GatewayConfig doesn't exist, use platform-specific default subdomain with cluster domain
-		return resolveDomain(ctx, cli, nil)
-	}
-
-	return resolveDomain(ctx, cli, gatewayConfig)
+	err := cli.Get(ctx, client.ObjectKey{Name: serviceApi.GatewayInstanceName}, gatewayConfig)
+	if err != nil {
+		if k8serr.IsNotFound(err) {
+			return resolveDomain(ctx, cli, nil)
+		}
+		return "", fmt.Errorf("failed to fetch GatewayConfig %q: %w", serviceApi.GatewayInstanceName, err)
+	}
+	return resolveDomain(ctx, cli, gatewayConfig)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 09ee029 and 8de8b4b.

📒 Files selected for processing (10)
  • api/services/v1alpha1/gateway_types.go (1 hunks)
  • bundle/manifests/services.platform.opendatahub.io_gatewayconfigs.yaml (1 hunks)
  • config/crd/bases/services.platform.opendatahub.io_gatewayconfigs.yaml (1 hunks)
  • docs/api-overview.md (1 hunks)
  • internal/controller/components/dashboard/dashboard_controller.go (2 hunks)
  • internal/controller/components/dashboard/dashboard_controller_actions.go (2 hunks)
  • internal/controller/components/dashboard/dashboard_support.go (3 hunks)
  • internal/controller/services/gateway/gateway_support.go (4 hunks)
  • internal/controller/services/gateway/gateway_support_test.go (4 hunks)
  • pkg/controller/actions/render/kustomize/action_render_manifests.go (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
internal/controller/components/dashboard/dashboard_controller_actions.go (4)
pkg/controller/types/types.go (1)
  • ReconciliationRequest (74-110)
api/common/types.go (1)
  • Release (159-162)
pkg/cluster/gvk/gvk.go (2)
  • ConfigMap (166-170)
  • Deployment (112-116)
pkg/metadata/labels/types.go (1)
  • ODHAppPrefix (4-4)
internal/controller/services/gateway/gateway_support.go (4)
pkg/cluster/const.go (4)
  • SelfManagedRhoai (9-9)
  • ManagedRhoai (7-7)
  • OpenDataHub (11-11)
  • DefaultNotebooksNamespaceODH (14-14)
pkg/cluster/cluster_config.go (1)
  • GetRelease (89-91)
pkg/cluster/resources.go (1)
  • ApplicationNamespace (139-148)
api/services/v1alpha1/gateway_types.go (1)
  • GatewayConfig (109-115)
internal/controller/components/dashboard/dashboard_controller.go (6)
api/services/v1alpha1/gateway_types.go (1)
  • GatewayConfig (109-115)
pkg/cluster/gvk/gvk.go (1)
  • GatewayConfig (340-344)
pkg/controller/reconciler/reconciler_support.go (2)
  • WithEventHandler (57-61)
  • WithPredicates (51-55)
pkg/controller/handlers/handlers.go (1)
  • ToNamed (58-66)
api/components/v1alpha1/dashboard_types.go (1)
  • DashboardInstanceName (28-28)
pkg/controller/actions/render/kustomize/action_render_manifests.go (1)
  • WithCustomKeyFn (78-82)
pkg/controller/actions/render/kustomize/action_render_manifests.go (3)
pkg/controller/actions/resourcecacher/resourcecacher.go (1)
  • ResourceCacher (18-22)
pkg/controller/actions/cacher/cacher.go (1)
  • CachingKeyFn (14-14)
pkg/controller/types/types.go (1)
  • Hash (186-217)
internal/controller/components/dashboard/dashboard_support.go (5)
pkg/controller/types/types.go (1)
  • ReconciliationRequest (74-110)
api/services/v1alpha1/gateway_types.go (2)
  • GatewayConfig (109-115)
  • GatewayInstanceName (30-30)
pkg/cluster/cluster_config.go (1)
  • GetRelease (89-91)
pkg/cluster/const.go (2)
  • SelfManagedRhoai (9-9)
  • ManagedRhoai (7-7)
pkg/resources/resources.go (2)
  • FormatObjectReference (401-409)
  • ToUnstructured (30-41)

Comment on lines 117 to 121
if err := rr.Client.Get(context.Background(), client.ObjectKey{Name: serviceApi.GatewayInstanceName}, gatewayConfig); err != nil {
// GatewayConfig doesn't exist yet (IsNotFound) or other error
// Use base hash - this handles the initial deployment case
return baseHash, nil //nolint:nilerr
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Stop swallowing GatewayConfig read errors in the cache key.

Here rr.Client.Get treats every error—permission denied, throttling, transient API failure—the same as a clean NotFound, so we silently fall back to the base hash and never invalidate the cache even though the ConfigMap actually changed. That regression will keep serving stale params.env long after the gateway updated. Only mask true NotFound; for everything else bubble the error so the reconcile can retry.

@@
-	if err := rr.Client.Get(context.Background(), client.ObjectKey{Name: serviceApi.GatewayInstanceName}, gatewayConfig); err != nil {
-		// GatewayConfig doesn't exist yet (IsNotFound) or other error
-		// Use base hash - this handles the initial deployment case
-		return baseHash, nil //nolint:nilerr
-	}
+	if err := rr.Client.Get(context.Background(), client.ObjectKey{Name: serviceApi.GatewayInstanceName}, gatewayConfig); err != nil {
+		if k8serr.IsNotFound(err) {
+			return baseHash, nil
+		}
+		return nil, fmt.Errorf("failed to fetch GatewayConfig: %w", err)
+	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if err := rr.Client.Get(context.Background(), client.ObjectKey{Name: serviceApi.GatewayInstanceName}, gatewayConfig); err != nil {
// GatewayConfig doesn't exist yet (IsNotFound) or other error
// Use base hash - this handles the initial deployment case
return baseHash, nil //nolint:nilerr
}
if err := rr.Client.Get(context.Background(), client.ObjectKey{Name: serviceApi.GatewayInstanceName}, gatewayConfig); err != nil {
if k8serr.IsNotFound(err) {
return baseHash, nil
}
return nil, fmt.Errorf("failed to fetch GatewayConfig: %w", err)
}
🤖 Prompt for AI Agents
In internal/controller/components/dashboard/dashboard_support.go around lines
117 to 121, the rr.Client.Get call currently treats every error as NotFound and
returns baseHash, nil; change this so only a true NotFound is masked: check the
error with apierrors.IsNotFound(err) (import k8s.io/apimachinery/pkg/api/errors
as apierrors) and return baseHash, nil only in that case; for any other error
return an empty/zero hash and propagate the error (e.g., return "", err) so the
reconcile can retry and we don't silently serve stale data.

Signed-off-by: Gowtham Shanmugasundaram <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
internal/controller/services/gateway/gateway_support.go (1)

139-157: LGTM! Domain resolution properly handles subdomain in all paths.

The logic correctly:

  • Handles nil gatewayConfig by using empty subdomain
  • Extracts subdomain from spec
  • Threads it through both user-provided and cluster domain paths

The TrimSpace on line 147 is redundant since buildGatewayDomain (line 122) also trims. Consider removing the trim here for minor efficiency:

-	subdomain := strings.TrimSpace(gatewayConfig.Spec.Subdomain)
+	subdomain := gatewayConfig.Spec.Subdomain

Apply the same change at line 172 in GetGatewayDomain.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8de8b4b and 71fc133.

📒 Files selected for processing (5)
  • api/services/v1alpha1/gateway_types.go (1 hunks)
  • docs/api-overview.md (1 hunks)
  • internal/controller/services/gateway/gateway_support.go (2 hunks)
  • internal/controller/services/gateway/gateway_support_test.go (1 hunks)
  • internal/controller/services/gateway/gateway_util_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/api-overview.md
🧰 Additional context used
🧬 Code graph analysis (2)
internal/controller/services/gateway/gateway_util_test.go (1)
api/services/v1alpha1/gateway_types.go (3)
  • GatewayConfig (111-117)
  • GatewayInstanceName (30-30)
  • GatewayConfigSpec (38-61)
internal/controller/services/gateway/gateway_support.go (2)
pkg/cluster/cluster_config.go (1)
  • GetDomain (97-114)
api/services/v1alpha1/gateway_types.go (1)
  • GatewayConfig (111-117)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build/push catalog image
  • GitHub Check: Run tests and collect coverage on internal and pkg
  • GitHub Check: golangci-lint
🔇 Additional comments (6)
internal/controller/services/gateway/gateway_util_test.go (1)

122-139: LGTM! Clean refactoring to support subdomain testing.

The delegation pattern maintains backward compatibility while enabling subdomain-aware test configurations.

api/services/v1alpha1/gateway_types.go (1)

52-56: LGTM! DNS-1123 validation is correctly applied.

The validation rules properly enforce DNS label requirements (lowercase alphanumerics, hyphens, 63-char max), addressing the concern from the previous review.

internal/controller/services/gateway/gateway_support_test.go (1)

326-467: LGTM! Comprehensive test coverage for subdomain functionality.

The test cases thoroughly validate subdomain handling including custom subdomains, empty/whitespace fallbacks, and complex domain structures across both user and cluster domain scenarios.

internal/controller/services/gateway/gateway_support.go (3)

118-128: LGTM! Subdomain handling is correct.

The function properly trims whitespace and falls back to DefaultGatewayName when subdomain is empty, ensuring valid DNS labels are always constructed.


131-137: LGTM! Cluster domain resolution now subdomain-aware.

The updated signature correctly threads subdomain through to buildGatewayDomain for cluster-based domain resolution.


162-182: LGTM! GetGatewayDomain correctly implements subdomain-aware resolution.

The function properly retrieves the GatewayConfig, extracts the subdomain, and follows the same domain resolution logic as resolveDomain, maintaining consistency across the codebase.

@GowthamShanmugam
Copy link
Contributor Author

/test opendatahub-operator-e2e-hypershift

1 similar comment
@GowthamShanmugam
Copy link
Contributor Author

/test opendatahub-operator-e2e-hypershift

@GowthamShanmugam
Copy link
Contributor Author

/override ci/prow/opendatahub-operator-e2e-hypershift

@openshift-ci
Copy link

openshift-ci bot commented Nov 6, 2025

@GowthamShanmugam: Overrode contexts on behalf of GowthamShanmugam: ci/prow/opendatahub-operator-e2e-hypershift

In response to this:

/override ci/prow/opendatahub-operator-e2e-hypershift

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@codecov
Copy link

codecov bot commented Nov 6, 2025

Codecov Report

❌ Patch coverage is 73.33333% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 49.82%. Comparing base (dd7e7bd) to head (71fc133).
⚠️ Report is 8 commits behind head on main.

Files with missing lines Patch % Lines
...nal/controller/services/gateway/gateway_support.go 73.33% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2790      +/-   ##
==========================================
- Coverage   49.88%   49.82%   -0.06%     
==========================================
  Files         145      145              
  Lines       10427    10432       +5     
==========================================
- Hits         5201     5198       -3     
- Misses       4672     4680       +8     
  Partials      554      554              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@resoluteCoder resoluteCoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small nits as to some comments that I think can be removed as the code describes what is happening but other than that LGTM! 😄

Tested on ROSA cluster with subdomain and domain changes. Used /echo demo app for subdomain.

@openshift-ci
Copy link

openshift-ci bot commented Nov 6, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: resoluteCoder

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Nov 6, 2025
@GowthamShanmugam
Copy link
Contributor Author

/retest

1 similar comment
@GowthamShanmugam
Copy link
Contributor Author

/retest

@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD f7ba276 and 2 for PR HEAD 71fc133 in total

@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD dd2e79f and 1 for PR HEAD 71fc133 in total

@openshift-merge-bot openshift-merge-bot bot merged commit 56c2f37 into opendatahub-io:main Nov 7, 2025
19 checks passed
@github-project-automation github-project-automation bot moved this from Todo to Done in ODH Platform Planning Nov 7, 2025
@GowthamShanmugam
Copy link
Contributor Author

/cherry-pick rhoai

@openshift-cherrypick-robot

@GowthamShanmugam: new pull request created: #2825

In response to this:

/cherry-pick rhoai

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

jctanner pushed a commit to jctanner-opendatahub-io/opendatahub-operator that referenced this pull request Nov 9, 2025
Signed-off-by: Gowtham Shanmugasundaram <[email protected]>
(cherry picked from commit 56c2f37)
openshift-merge-bot bot pushed a commit that referenced this pull request Nov 10, 2025
(cherry picked from commit 56c2f37)

Signed-off-by: Gowtham Shanmugasundaram <[email protected]>
Co-authored-by: Gowtham Shanmugasundaram <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants