@wassimbensalem

Add Dict Parameter Access Feature for Pipeline Parameters

Summary

This PR adds support for extracting individual values from dictionary pipeline parameters using Pythonic syntax, eliminating the need to pass entire dictionaries to components that only need specific values.

Motivation

Previously, when a pipeline had a dictionary parameter, the entire dictionary had to be passed to every component, even if the component only needed a single value or subset. This led to:

  • Unnecessary data exposure to components
  • Reduced code clarity
  • Potential security concerns with passing sensitive data unnecessarily

Changes

1. New DictSubvariable Class (sdk/python/kfp/dsl/pipeline_channel.py)

  • Introduced DictSubvariable class to represent values extracted from dict parameters
  • Added __getitem__ method to PipelineParameterChannel to enable dict-style access
  • Supports both single-level and nested dict access
  • Supports passing sub-dictionaries to components
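
The dict-style access described above can be pictured with a small standalone sketch. It does not import kfp and is not the code in pipeline_channel.py: `ParamChannelSketch` and `DictSubvariableSketch` are hypothetical stand-ins for PipelineParameterChannel and DictSubvariable, assuming that each `[...]` lookup only records a key path and never reads data.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ParamChannelSketch:
    """Hypothetical stand-in for PipelineParameterChannel."""
    name: str

    def __getitem__(self, key: str) -> "DictSubvariableSketch":
        # First-level access starts a new key path rooted at this parameter.
        return DictSubvariableSketch(root=self, keys=(key,))


@dataclass(frozen=True)
class DictSubvariableSketch:
    """Hypothetical stand-in for DictSubvariable: a root parameter plus a key path."""
    root: ParamChannelSketch
    keys: Tuple[str, ...]

    def __getitem__(self, key: str) -> "DictSubvariableSketch":
        # Nested access extends the recorded path; no dictionary is read here.
        return DictSubvariableSketch(root=self.root, keys=self.keys + (key,))


config = ParamChannelSketch(name="config")
username = config["database"]["credentials"]["username"]
sub_dict = config["database"]
```

Under this assumption, a sub-dictionary such as `config["database"]` is simply a shorter key path, which would explain why the same mechanism covers single values, nested values, and sub-dictionaries.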

2. Compiler Support (sdk/python/kfp/compiler/pipeline_spec_builder.py)

  • Extended compiler to recognize DictSubvariable instances
  • Generates appropriate CEL (Common Expression Language) parameter_expression_selector expressions
  • Supports nested access by building chained CEL expressions
  • Works in both task and group contexts
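
As a rough illustration of how a chained selector might be assembled from a key path, the sketch below produces strings in the same shape as the expressions quoted in this PR. `build_selector` is a hypothetical helper, not a function from pipeline_spec_builder.py, and it assumes that JSON string escaping is acceptable for the keys inside a CEL string literal.

```python
import json
from typing import Sequence


def build_selector(keys: Sequence[str]) -> str:
    """Build a chained selector string for a non-empty key path (hypothetical helper)."""
    if not keys:
        raise ValueError("at least one key is required")
    # json.dumps renders each key as a double-quoted, escaped string literal.
    indexers = "".join(f"[{json.dumps(key)}]" for key in keys)
    return "parseJson(string_value)" + indexers


single = build_selector(["key"])
nested = build_selector(["database", "host"])
```

Here `single` is `parseJson(string_value)["key"]` and `nested` is `parseJson(string_value)["database"]["host"]`.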

3. Type Checking (sdk/python/kfp/dsl/pipeline_task.py)

  • Modified type compatibility checking to skip strict validation for DictSubvariable
  • Actual types are resolved at runtime by the CEL evaluator
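
The relaxed check can be sketched as an early return for dict-extracted values. This is only an illustration of the idea, not the logic in pipeline_task.py: `Argument` and `is_compatible` are hypothetical names, and the exact-match comparison is a simplification of real type compatibility rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Argument:
    """Hypothetical description of a value passed to a component input."""
    declared_type: Optional[str]  # None when the type is only known at runtime
    from_dict_access: bool = False


def is_compatible(arg: Argument, expected_type: str) -> bool:
    """Return True when the argument may be wired to an input of expected_type."""
    if arg.from_dict_access:
        # The type of config['key'] is unknown at compile time, so defer to runtime.
        return True
    return arg.declared_type == expected_type


extracted = Argument(declared_type=None, from_dict_access=True)
plain_int = Argument(declared_type="Integer")
```

One consequence of this design is that a mismatch, such as a string stored under a key that feeds an integer input, would surface only when the pipeline runs.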

4. Tests

  • Added comprehensive unit tests in pipeline_channel_test.py
  • Added integration test pipeline in test_data/pipeline_files/valid/pipeline_with_dict_parameter_access.py

Usage Examples

Single-Level Access

from kfp import dsl

@dsl.pipeline
def my_pipeline(config: dict):
    # Extract a single value; component1 is assumed to be a @dsl.component defined elsewhere
    component1(db_host=config['db_host'])

Nested Access

@dsl.pipeline
def my_pipeline(config: dict):
    # Access nested values
    component2(host=config['database']['host'])
    component3(username=config['database']['credentials']['username'])

Sub-Dictionary Passing

@dsl.pipeline
def my_pipeline(config: dict):
    # Pass an entire sub-dictionary
    component4(db_config=config['database'])

Technical Details

Compile-Time vs Runtime

  • Compile-time: The SDK creates DictSubvariable objects and the compiler generates CEL expressions
  • Runtime: The backend driver evaluates CEL expressions to extract values from the actual JSON data
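
To make the runtime half concrete, the following Python approximation mimics what evaluating such a selector amounts to: parse the serialized parameter, then index into it key by key. The real evaluation is performed by the backend driver's CEL evaluator, so `resolve` is a hypothetical helper for illustration only, and the sample payload is made up.

```python
import json
from functools import reduce
from typing import Any, Sequence


def resolve(string_value: str, keys: Sequence[str]) -> Any:
    """Parse a JSON-encoded parameter and follow the key path (hypothetical helper)."""
    parsed = json.loads(string_value)
    # Equivalent in spirit to parseJson(string_value)["k1"]["k2"]...
    return reduce(lambda node, key: node[key], keys, parsed)


# Made-up payload standing in for the serialized `config` pipeline parameter.
payload = json.dumps({"database": {"host": "db.internal", "port": 5432}})
host = resolve(payload, ["database", "host"])
db_config = resolve(payload, ["database"])
```

In this sketch a missing key raises `KeyError`; how the backend reports a missing key is not stated in this PR.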

CEL Expression Generation

  • Single: parseJson(string_value)["key"]
  • Nested: parseJson(string_value)["database"]["host"]
  • Backend already supports these expressions (no backend changes required)

Benefits

  1. Improved Security: Components only receive the data they need
  2. Better Code Clarity: Clear intent about what data each component uses
  3. Reduced Coupling: Components don't depend on entire config structures
  4. Deferred Type Checking: Types of extracted values are resolved at runtime from the actual data, since strict compile-time validation is skipped for DictSubvariable
  5. Backward Compatible: Existing code continues to work unchanged

Testing

Run unit tests:

pytest -v sdk/python/kfp/dsl/pipeline_channel_test.py::DictSubvariableTest

Run integration test:

python test_data/pipeline_files/valid/pipeline_with_dict_parameter_access.py

Checklist

  • Added unit tests
  • Added integration test
  • Code follows project formatting standards (isort)
  • All tests pass
  • No breaking changes
  • Documentation (docstrings) added

Related Issues

Closes #12418

Additional Notes

This feature leverages existing backend CEL expression evaluation capabilities, so no server-side changes are required. All changes are SDK-side and applied at compile time.

@google-oss-prow

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign james-jwu for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow

Hi @wassimbensalem. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@wassimbensalem
Author

/assign

Add support for extracting individual values from dictionary pipeline
parameters using Pythonic dict-style syntax (config['key']).

This enables:
- Single-level access: config['db_host']
- Nested access: config['database']['host']
- Sub-dict passing: config['database']

Changes:
- Add DictSubvariable class to represent extracted dict values
- Add __getitem__ to PipelineParameterChannel for dict-style access
- Extend compiler to generate CEL parameter_expression_selector
- Skip strict type checking for DictSubvariable (runtime resolution)
- Add comprehensive unit and integration tests

Benefits:
- Improved security (components receive only needed data)
- Better code clarity and reduced coupling
- No backend changes required (uses existing CEL evaluation)
- Fully backward compatible

Example usage:
  @dsl.pipeline
  def my_pipeline(config: dict):
      component1(host=config['database']['host'])
      component2(db_config=config['database'])

Signed-off-by: wassimbensalem <[email protected]>
wassimbensalem force-pushed the feature/dict-parameter-access branch from d49613f to 4c41ccf on November 7, 2025 10:56
@hbelmiro
Contributor

hbelmiro commented Nov 7, 2025

/ok-to-test

Signed-off-by: wassimbensalem <[email protected]>
wassimbensalem force-pushed the feature/dict-parameter-access branch from ba16844 to 6ac511a on November 10, 2025 10:59
@hbelmiro
Contributor

/ok-to-test

wassimbensalem and others added 2 commits November 10, 2025 13:35
- Remove trailing whitespace
- Fix line length issues by breaking long lines
- Reformat function arguments to comply with yapf style

Signed-off-by: wassimbensalem <[email protected]>
@hbelmiro
Contributor

/ok-to-test

@wassimbensalem
Author

Is anything still missing here?



Development

Successfully merging this pull request may close these issues.

[sdk] Feature Request: Dict Parameter Access for Pipeline Parameters

2 participants