Commit 8b9049d

Create initial documentation (#2)
* feat(docs): Add contribution, testing, governance, and onboarding docs
  Signed-off-by: Giulio Frasca <[email protected]>
* feat(docs): Add placeholder BESTPRACTICES.md and AGENTS.md
  Signed-off-by: Giulio Frasca <[email protected]>
* chore: Correct Repo name/URL in docs
  Signed-off-by: Giulio Frasca <[email protected]>
* chore: Add link to KSC role defintion
  Signed-off-by: Giulio Frasca <[email protected]>
* chore: Move Development Workflow to ONBOARDING.md
  Signed-off-by: Giulio Frasca <[email protected]>
* docs: Reorganize CONTRIBUTING and ONBOARDING docs
  Signed-off-by: Giulio Frasca <[email protected]>
* chore: Add clarifications to docs
  Signed-off-by: Giulio Frasca <[email protected]>

---------

Signed-off-by: Giulio Frasca <[email protected]>
1 parent 00758af commit 8b9049d

5 files changed

+560
-0
lines changed

docs/AGENTS.md

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
1+
# AI Agent Context Guide
2+
3+
*This guide provides context and information for AI agents working with the Kubeflow Pipelines Components Repository.*
4+
5+
## Coming Soon
6+
7+
This document will serve as a comprehensive context source for AI agents to understand:
8+
- Repository structure and organization
9+
- Component development patterns and standards
10+
- Contribution workflows and processes
11+
- Code quality requirements and testing practices
12+
- Community guidelines and governance
13+
14+
---
15+
16+
For immediate guidance, see:
17+
- [Contributing Guide](CONTRIBUTING.md)
18+
- [Governance Guide](GOVERNANCE.md)
19+
- [Best Practices Guide](BESTPRACTICES.md)

docs/BESTPRACTICES.md

Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,21 @@
1+
# Component Development Best Practices
2+
3+
*This guide is under development. Please check back soon for comprehensive best practices for developing Kubeflow Pipelines components.*
4+
5+
## Coming Soon
6+
7+
This document will cover:
8+
- Component design patterns
9+
- Performance optimization
10+
- Security best practices
11+
- Error handling strategies
12+
- Documentation standards
13+
- Testing methodologies
14+
- Container optimization
15+
- Resource management
16+
17+
---
18+
19+
For immediate guidance, see:
20+
- [Contributing Guide](CONTRIBUTING.md) - Complete guide with testing, setup, and workflow
21+
- [Governance Guide](GOVERNANCE.md) - Repository policies and tiers

docs/CONTRIBUTING.md

Lines changed: 320 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,320 @@
1+
# Contributing to Kubeflow Pipelines Components
2+
3+
Welcome! This guide covers everything you need to know to contribute components and pipelines to this repository.
4+
5+
## Table of Contents
6+
7+
- [Prerequisites](#prerequisites)
8+
- [Quick Setup](#quick-setup)
9+
- [What We Accept](#what-we-accept)
10+
- [Component Structure](#component-structure)
11+
- [Development Workflow](#development-workflow)
12+
- [Testing and Quality](#testing-and-quality)
13+
- [Submitting Your Contribution](#submitting-your-contribution)
14+
- [Getting Help](#getting-help)
15+
16+
## Prerequisites
17+
18+
Before contributing, ensure you have the following tools installed:
19+
20+
- **Python 3.10+** for component development
21+
- **uv** ([installation guide](https://docs.astral.sh/uv/getting-started/installation)) to manage Python dependencies, including the `kfp` and `kfp-kubernetes` packages
22+
- **pre-commit** ([installation guide](https://pre-commit.com/#installation)) for automated code quality checks
23+
- **Docker or Podman** to build container images for custom components
24+
- **kubectl** ([installation guide](https://kubernetes.io/docs/tasks/tools/)) for Kubernetes operations
25+
26+
All contributors must follow the [Kubeflow Community Code of Conduct](https://github.com/kubeflow/community/blob/master/CODE_OF_CONDUCT.md).
27+
28+
## Quick Setup
29+
30+
Get your development environment ready with these commands:
31+
32+
```bash
# Fork and clone the repository
git clone https://github.com/YOUR_USERNAME/pipelines-components.git
cd pipelines-components
git remote add upstream https://github.com/kubeflow/pipelines-components.git

# Set up Python environment
uv venv
source .venv/bin/activate
uv pip install -r requirements-dev.txt

# Install pre-commit hooks for automatic code quality checks
pre-commit install

# Verify your setup works
pytest
```
49+
50+
## What We Accept
51+
52+
We welcome contributions of production-ready ML components and reusable pipelines:
53+
54+
- **Components** are individual ML tasks (data processing, training, evaluation, deployment) with usage examples
55+
- **Pipelines** are complete multi-step workflows that can be nested within other pipelines
56+
- **Bug fixes** improve existing components or fix documentation issues
57+
58+
## Component Structure
59+
60+
Components must be organized by category under `components/<category>/` (Core tier) or `third_party/components/<category>/` (Third-Party tier).
61+
62+
Pipelines must be organized by category under `pipelines/<category>/` (Core tier) or `third_party/pipelines/<category>/` (Third-Party tier).
63+
64+
### Naming Conventions
65+
66+
- **Components and pipelines** use `snake_case` (e.g., `data_preprocessing`, `model_trainer`)
67+
- **Commit messages** follow [Conventional Commits](https://conventionalcommits.org/) format with type prefix (feat, fix, docs, etc.)
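
As a rough illustration of the two conventions above, the following sketch checks a name for `snake_case` and a commit header for a Conventional Commits prefix. The regular expressions, the list of accepted types, and the function names `is_snake_case` and `is_conventional_header` are assumptions made for this example; they are not part of this repository's tooling, and the Conventional Commits specification allows types beyond the ones listed here.

```python
import re

# Assumed pattern: starts with a lowercase letter, then lowercase letters or
# digits, with single underscores separating the parts.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")

# Assumed subset of commit types; adjust to whatever the project accepts.
COMMIT_TYPES = ("feat", "fix", "docs", "chore", "refactor", "test", "ci", "build", "perf")

# Header shape: type, optional (scope), optional "!", then ": " and a subject.
COMMIT_HEADER = re.compile(r"^(?P<type>[a-z]+)(\((?P<scope>[^()]+)\))?!?: (?P<subject>\S.*)$")


def is_snake_case(name: str) -> bool:
    """Return True if name looks like snake_case (hypothetical helper)."""
    return SNAKE_CASE.fullmatch(name) is not None


def is_conventional_header(header: str) -> bool:
    """Return True if the first commit line has a known type prefix."""
    match = COMMIT_HEADER.match(header)
    return match is not None and match.group("type") in COMMIT_TYPES
```

For example, `is_snake_case("model_trainer")` passes while `is_snake_case("my-component")` does not, and a header such as `feat(training): add trainer` passes while a header without a type prefix does not.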
68+
69+
### Required Files
70+
71+
Every component must include these files in its directory:
72+
73+
```
components/<category>/<component_name>/
├── __init__.py            # Exposes the component function for imports
├── component.py           # Main implementation
├── metadata.yaml          # Complete specification (see schema below)
├── README.md              # Overview, inputs/outputs, usage examples, development instructions
├── OWNERS                 # Maintainers (at least one Kubeflow SIG owner for Core tier)
├── Containerfile          # Container definition (required only for Core tier custom images)
├── example_pipelines.py   # Working usage examples
├── tests/
│   └── test_component.py  # Unit tests
└── <supporting_files>
```
86+
87+
Similarly, every pipeline must include these files:
88+
```
pipelines/<category>/<pipeline_name>/
├── __init__.py            # Exposes the pipeline function for imports
├── pipeline.py            # Main implementation
├── metadata.yaml          # Complete specification (see schema below)
├── README.md              # Overview, inputs/outputs, usage examples, development instructions
├── OWNERS                 # Maintainers (at least one Kubeflow SIG owner for Core tier)
├── example_pipelines.py   # Working usage examples
├── tests/
│   └── test_pipeline.py   # Unit tests
└── <supporting_files>
```
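
Before opening a pull request, it can help to confirm that a directory matches the listings above. The sketch below is one way to do that with the standard library only. The helper name `missing_files` and the `REQUIRED_FILES` table are hypothetical, and treating `Containerfile` and supporting files as optional is an assumption of this example rather than a rule enforced by the repository's CI.

```python
from pathlib import Path

# File names come from the directory listings above. Containerfile is left out
# because it is only needed for Core tier custom images (assumption: optional here).
REQUIRED_FILES = {
    "component": [
        "__init__.py",
        "component.py",
        "metadata.yaml",
        "README.md",
        "OWNERS",
        "example_pipelines.py",
        "tests/test_component.py",
    ],
    "pipeline": [
        "__init__.py",
        "pipeline.py",
        "metadata.yaml",
        "README.md",
        "OWNERS",
        "example_pipelines.py",
        "tests/test_pipeline.py",
    ],
}


def missing_files(directory: Path, kind: str) -> list[str]:
    """Return the required files absent from directory, in listing order."""
    return [name for name in REQUIRED_FILES[kind] if not (directory / name).is_file()]
```

Calling `missing_files(Path("components/<category>/<component_name>"), "component")` with a real path returns an empty list when every required file is present.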
100+
101+
### metadata.yaml Schema
102+
103+
Your `metadata.yaml` must include these fields:
104+
105+
```yaml
name: my_component
tier: core  # or 'third_party'
stability: stable  # 'alpha', 'beta', or 'stable'
dependencies:
  kubeflow:
    - name: Pipelines
      version: '>=2.5'
  external_services:  # Optional list of external dependencies
    - name: Argo Workflows
      version: "3.6"
tags:  # Optional keywords for discoverability
  - training
  - evaluation
lastVerified: 2025-11-18T00:00:00Z  # Updated annually; components are removed after 12 months without update
ci:
  compile_check: true  # Validates component compiles with kfp.compiler
  skip_dependency_probe: false  # Optional. Set true only with justification
  pytest: optional  # Set to 'required' for Core tier
links:  # Optional, can use custom key-value (not limited to documentation, issue_tracker)
  documentation: https://kubeflow.org/components/my_component
  issue_tracker: https://github.com/kubeflow/pipelines-components/issues
```
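
To make the schema rules concrete, here is a minimal sketch of the checks they imply. It is not the repository's `scripts/validate_metadata.py`, whose behavior is not shown here. The sketch assumes the YAML has already been loaded into a `dict`, that `lastVerified` is a timezone-aware `datetime`, that the fields not marked optional above are required, and that "12 months" can be approximated as 365 days. The function name `validate_metadata` and the error strings are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: every field not marked "Optional" in the schema above is required.
REQUIRED_FIELDS = ("name", "tier", "stability", "dependencies", "lastVerified", "ci")
TIERS = {"core", "third_party"}
STABILITY_LEVELS = {"alpha", "beta", "stable"}
MAX_AGE = timedelta(days=365)  # approximation of the 12-month window


def validate_metadata(meta: dict, now: datetime) -> list[str]:
    """Return human-readable problems found in meta; empty means no problems."""
    errors = [f"missing field: {field}" for field in REQUIRED_FIELDS if field not in meta]
    if "tier" in meta and meta["tier"] not in TIERS:
        errors.append(f"invalid tier: {meta['tier']}")
    if "stability" in meta and meta["stability"] not in STABILITY_LEVELS:
        errors.append(f"invalid stability: {meta['stability']}")
    verified = meta.get("lastVerified")
    # Both datetimes must be timezone-aware for the subtraction to work.
    if isinstance(verified, datetime) and now - verified > MAX_AGE:
        errors.append("lastVerified is more than 12 months old")
    # The schema comment says pytest should be 'required' for Core tier.
    if meta.get("tier") == "core" and meta.get("ci", {}).get("pytest") != "required":
        errors.append("core tier requires ci.pytest: required")
    return errors
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the staleness check deterministic in unit tests.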
128+
129+
### OWNERS File
130+
131+
The OWNERS file enables component owners to self-service maintenance tasks including approvals, metadata updates, and lifecycle management:
132+
133+
```yaml
approvers:
  - maintainer1  # At least one must be a Kubeflow SIG owner/team member for Core tier
  - maintainer2
reviewers:
  - reviewer1
```
140+
141+
The `OWNERS` file enables code review automation by leveraging Prow commands:
142+
- **Reviewers** (as well as **Approvers**), upon reviewing a PR and finding it good to merge, can comment `/lgtm`, which applies the `lgtm` label to the PR
143+
- **Approvers** (but not **Reviewers**) can comment `/approve`, which signifies the PR is approved for automation to merge into the repo.
144+
- If a PR has been labeled with both `lgtm` and `approved`, and all required CI checks are passing, Prow will merge the PR into the destination branch.
145+
146+
See [full Prow documentation](https://docs.prow.k8s.io/docs/components/plugins/approve/approvers/#lgtm-label) for usage details.
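
The review rules above can be summarized as a small decision model. The sketch below is a simplification written for illustration and is not how Prow is implemented. The `Owners` class, the function names, and the label strings `lgtm` and `approved` are assumptions of this example, and real Prow applies further rules (for instance around PR authors) that are not modeled here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Owners:
    """In-memory stand-in for an OWNERS file (hypothetical type)."""

    approvers: set[str] = field(default_factory=set)
    reviewers: set[str] = field(default_factory=set)


def label_for_command(owners: Owners, user: str, command: str) -> Optional[str]:
    """Return the label a comment command would apply, or None if not permitted."""
    # Reviewers and approvers may both signal that a PR looks good to merge.
    if command == "/lgtm" and user in owners.approvers | owners.reviewers:
        return "lgtm"
    # Only approvers may approve a PR for automated merging.
    if command == "/approve" and user in owners.approvers:
        return "approved"
    return None


def can_merge(labels: set[str], ci_passing: bool) -> bool:
    """A PR merges only with both labels present and required CI checks green."""
    return {"lgtm", "approved"} <= labels and ci_passing
```

With `Owners(approvers={"maintainer1"}, reviewers={"reviewer1"})`, a `/approve` comment from `reviewer1` yields no label, which mirrors the approver-only rule described above.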
147+
148+
149+
150+
## Development Workflow
151+
152+
### 1. Create Your Feature Branch
153+
154+
Start by syncing with upstream and creating a feature branch:
155+
156+
```bash
git fetch upstream
git checkout main
git merge upstream/main
git checkout -b component/my-component
```
162+
163+
### 2. Implement Your Component
164+
165+
Create your component following the structure above. Here's a basic template:
166+
167+
```python
# component.py
from kfp import dsl


@dsl.component(base_image="python:3.10")
def hello_world(name: str = "World") -> str:
    """A simple hello world component.

    Args:
        name: The name to greet. Defaults to "World".

    Returns:
        A greeting message.
    """
    message = f"Hello, {name}!"
    print(message)
    return message
```
185+
186+
Write comprehensive tests for your component:
187+
188+
```python
# tests/test_component.py
from ..component import hello_world


def test_hello_world_default():
    """Test hello_world with default parameter."""
    # Access the underlying Python function from the component
    result = hello_world.python_func()
    assert result == "Hello, World!"


def test_hello_world_custom_name():
    """Test hello_world with custom name."""
    result = hello_world.python_func(name="Kubeflow")
    assert result == "Hello, Kubeflow!"
```
204+
205+
### 3. Document Your Component
206+
207+
This repository requires a standardized `README.md` for every component and pipeline. To help with this, a README generation utility is provided in the `scripts` directory.
208+
209+
Read more in the [README Generator Script Documentation](./scripts/generate_readme/README.md).
210+
211+
## Testing and Quality
212+
213+
### Running Tests Locally
214+
215+
Run these commands from your component/pipeline directory before submitting your contribution:
216+
217+
```bash
# Run all unit tests with coverage reporting for the current directory
pytest --cov=. --cov-report=html

# Run specific test files when debugging
pytest tests/test_component.py -v
```
224+
225+
### Code Quality Checks
226+
227+
Ensure your code meets quality standards:
228+
229+
```bash
# Format checking (120 character line length)
black --check --line-length 120 .

# Docstring validation (Google convention)
pydocstyle --convention=google .

# Validate metadata schema
python scripts/validate_metadata.py

# Run all pre-commit hooks
pre-commit run --all-files
```
242+
243+
### Building Custom Container Images
244+
245+
If your component uses a custom image, test the container build:
246+
247+
```bash
# Build your component image (docker needs -f because the file is named Containerfile, not Dockerfile)
docker build -f components/<category>/my_component/Containerfile -t my-component:test components/<category>/my_component/

# Test the container runs correctly
docker run --rm my-component:test --help
```
254+
255+
### CI Pipeline
256+
257+
GitHub Actions automatically runs these checks on every pull request:
258+
259+
- Code formatting (Black), linting (Flake8), docstring validation (pydocstyle), type checking (MyPy)
260+
- Unit and integration tests with coverage reporting
261+
- Container image builds for components with Containerfiles
262+
- Security vulnerability scans
263+
- Metadata schema validation
264+
- Standardized README content and formatting conformance
265+
266+
## Submitting Your Contribution
267+
268+
### Commit Your Changes
269+
270+
Use descriptive commit messages following the [Conventional Commits](https://conventionalcommits.org/) format:
271+
272+
```bash
git add .
git status  # Review what you're committing
git diff --cached  # Check the actual changes

git commit -m "feat(training): add <my_component> training component

- Implements <my_component> Core-Tier component
- Includes comprehensive unit tests with 95% coverage
- Provides working pipeline examples
- Resolves #123"
```
284+
285+
### Push and Create Pull Request
286+
287+
Push your changes and create a pull request on GitHub:
288+
289+
```bash
git push origin component/my-component
```
292+
293+
On GitHub, click "Compare & pull request" and fill out the provided PR template with the appropriate details.
294+
295+
All PRs must pass:
296+
- Automated checks (linting, tests, builds)
297+
- Code review by maintainers and community members
298+
- Documentation review
299+
300+
### Review Process
301+
302+
All pull requests must complete the following:
303+
- Pass all automated CI checks
304+
- Undergo code review, in which reviewers verify that:
305+
  - The component works as described
306+
  - The code is clean and well-documented
307+
  - The included tests provide good coverage
308+
- Receive approval from component OWNERS (for updates to existing components) or repository maintainers (for new components)
309+
310+
## Getting Help
311+
312+
- **Governance questions**: See [GOVERNANCE.md](GOVERNANCE.md) for tier requirements and processes
313+
- **Community discussion**: Join the `#kubeflow-pipelines` channel on the [CNCF Slack](https://www.kubeflow.org/docs/about/community/#kubeflow-slack-channels)
314+
- **Bug reports and feature requests**: Open an issue at [GitHub Issues](https://github.com/kubeflow/pipelines-components/issues)
315+
316+
---
317+
318+
This repository was established through [KEP-913: Components Repository](https://github.com/kubeflow/community/tree/master/proposals/913-components-repo).
319+
320+
Thanks for contributing to Kubeflow Pipelines! 🚀
