[DNM][ci] Break down test-pipeline.yaml into test areas #29343
Open: khluu wants to merge 25 commits into main from khluu/refactor_ci (+1,521 −0)
25 commits:
a7f11ca p (khluu)
a0d641a Merge branch 'main' into khluu/refactor_ci (khluu)
27a893b move primerl (khluu)
7701154 test (khluu)
0a7642e update pipeline yaml (khluu)
265bf9f key (khluu)
0784707 build files (khluu)
c8707ff permission (khluu)
b9a6433 key change (khluu)
ac82065 depends_on for build job (khluu)
8b886aa Revert "[CI] fix url-encoding behavior in nightly metadata generation… (khluu)
95b4cdf Revert "[CI] Renovation of nightly wheel build & generation (#29690)" (khluu)
f3fb422 Merge branch 'main' into khluu/refactor_ci (khluu)
d4d268c sync (khluu)
1ad5b4d sync (khluu)
c1629aa fix long command (khluu)
234c89e Merge branch 'main' into khluu/refactor_ci (khluu)
950643d debug (khluu)
98a38d1 slashes (khluu)
a020a18 slashes (khluu)
f2e32c9 2node test (khluu)
e35d711 switch pushd to cd (khluu)
54cb602 remove old file (khluu)
cde1d84 build (khluu)
89a0c29 run all patterns (khluu)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,24 @@ | ||
| name: vllm_ci | ||
| job_dirs: | ||
| - ".buildkite/test_areas" | ||
| - ".buildkite/image_build" | ||
| run_all_patterns: | ||
| - "docker/Dockerfile" | ||
| - "CMakeLists.txt" | ||
| - "requirements/common.txt" | ||
| - "requirements/cuda.txt" | ||
| - "requirements/build.txt" | ||
| - "requirements/test.txt" | ||
| - "setup.py" | ||
| - "csrc/" | ||
| - "cmake/" | ||
| run_all_exclude_patterns: | ||
| - "docker/Dockerfile." | ||
| - "csrc/cpu/" | ||
| - "csrc/rocm/" | ||
| - "cmake/hipify.py" | ||
| - "cmake/cpu_extension.cmake" | ||
| registries: public.ecr.aws/q9t5s3a7 | ||
| repositories: | ||
| main: "vllm-ci-postmerge-repo" | ||
| premerge: "vllm-ci-test-repo" |
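The config above does not say how `run_all_patterns` and `run_all_exclude_patterns` are evaluated. A minimal sketch, assuming plain path-prefix matching where an exclude match overrides an include match; `matches_run_all` is a hypothetical helper for illustration and is not part of this PR:

```shell
#!/bin/bash
# Hypothetical helper: returns 0 if a changed path should trigger all tests.
# Assumption: patterns are path prefixes; the real pipeline generator may differ.
RUN_ALL_PATTERNS=("docker/Dockerfile" "CMakeLists.txt" "requirements/common.txt"
  "requirements/cuda.txt" "requirements/build.txt" "requirements/test.txt"
  "setup.py" "csrc/" "cmake/")
RUN_ALL_EXCLUDE_PATTERNS=("docker/Dockerfile." "csrc/cpu/" "csrc/rocm/"
  "cmake/hipify.py" "cmake/cpu_extension.cmake")

matches_run_all() {
  local path="$1" pattern
  # An exclude prefix wins over any include prefix.
  for pattern in "${RUN_ALL_EXCLUDE_PATTERNS[@]}"; do
    [[ "$path" == "$pattern"* ]] && return 1
  done
  for pattern in "${RUN_ALL_PATTERNS[@]}"; do
    [[ "$path" == "$pattern"* ]] && return 0
  done
  return 1
}
```

Under this reading, `docker/Dockerfile` triggers a full run, while `docker/Dockerfile.cpu` does not because it matches the `docker/Dockerfile.` exclude prefix.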
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,56 @@ | ||
| #!/bin/bash | ||
| set -e | ||
| | ||
| if [[ $# -lt 8 ]]; then | ||
| echo "Usage: $0 <registry> <repo> <commit> <branch> <vllm_use_precompiled> <vllm_merge_base_commit> <cache_from> <cache_to>" | ||
| exit 1 | ||
| fi | ||
| | ||
| REGISTRY=$1 | ||
| REPO=$2 | ||
| BUILDKITE_COMMIT=$3 | ||
| BRANCH=$4 | ||
| VLLM_USE_PRECOMPILED=$5 | ||
| VLLM_MERGE_BASE_COMMIT=$6 | ||
| CACHE_FROM=$7 | ||
| CACHE_TO=$8 | ||
| | ||
| # authenticate with AWS ECR | ||
| aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin $REGISTRY | ||
| aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 936637512419.dkr.ecr.us-east-1.amazonaws.com | ||
| | ||
| # docker buildx | ||
| docker buildx create --name vllm-builder --driver docker-container --use | ||
| docker buildx inspect --bootstrap | ||
| docker buildx ls | ||
| | ||
| # skip build if image already exists | ||
| if [[ -z $(docker manifest inspect $REGISTRY/$REPO:$BUILDKITE_COMMIT) ]]; then | ||
| echo "Image not found, proceeding with build..." | ||
| else | ||
| echo "Image found" | ||
| exit 0 | ||
| fi | ||
| | ||
| if [[ "${VLLM_USE_PRECOMPILED:-0}" == "1" ]]; then | ||
| merge_base_commit_build_args="--build-arg VLLM_MERGE_BASE_COMMIT=${VLLM_MERGE_BASE_COMMIT}" | ||
| else | ||
| merge_base_commit_build_args="" | ||
| fi | ||
| | ||
| # build | ||
| docker buildx build --file docker/Dockerfile \ | ||
| --build-arg max_jobs=16 \ | ||
| --build-arg buildkite_commit=$BUILDKITE_COMMIT \ | ||
| --build-arg USE_SCCACHE=1 \ | ||
| --build-arg TORCH_CUDA_ARCH_LIST="8.0 8.9 9.0 10.0" \ | ||
| --build-arg FI_TORCH_CUDA_ARCH_LIST="8.0 8.9 9.0a 10.0a" \ | ||
| --build-arg VLLM_USE_PRECOMPILED="${VLLM_USE_PRECOMPILED:-0}" \ | ||
| ${merge_base_commit_build_args} \ | ||
| --cache-from type=registry,ref=${CACHE_FROM},mode=max \ | ||
| --cache-to type=registry,ref=${CACHE_TO},mode=max \ | ||
| --tag ${REGISTRY}/${REPO}:${BUILDKITE_COMMIT} \ | ||
| $( [[ "${BRANCH}" == "main" ]] && echo "--tag ${REGISTRY}/${REPO}:latest" ) \ | ||
| --push \ | ||
| --target test \ | ||
| --progress plain . |
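The build command above adds its optional flags through unquoted expansions (`${merge_base_commit_build_args}` and the inline `$( ... && echo ... )`). A sketch of the same optional-flag logic collected into a bash array, which avoids relying on word splitting; `build_optional_args` and `OPTIONAL_ARGS` are hypothetical names and not part of this PR:

```shell
#!/bin/bash
# Hypothetical helper: fills the global array OPTIONAL_ARGS with the optional
# docker buildx flags, mirroring the conditions used in the script above.
build_optional_args() {
  local registry="$1" repo="$2" branch="$3" use_precompiled="$4" merge_base="$5"
  OPTIONAL_ARGS=()
  # Pass the merge-base commit only when precompiled wheels are requested.
  if [[ "${use_precompiled:-0}" == "1" ]]; then
    OPTIONAL_ARGS+=(--build-arg "VLLM_MERGE_BASE_COMMIT=${merge_base}")
  fi
  # Tag the image as latest only for builds of the main branch.
  if [[ "${branch}" == "main" ]]; then
    OPTIONAL_ARGS+=(--tag "${registry}/${repo}:latest")
  fi
}

# Example values only; "abc123" is a placeholder commit.
build_optional_args "public.ecr.aws/q9t5s3a7" "vllm-ci-postmerge-repo" "main" "1" "abc123"
printf '%s\n' "${OPTIONAL_ARGS[@]}"
```

The array could then be passed to the build as `docker buildx build ... "${OPTIONAL_ARGS[@]}" ...`, so each flag and value stays a single argument.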
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,59 @@ | ||
| group: Abuild | ||
| steps: | ||
| - label: ":docker: Build image" | ||
| key: image-build | ||
| depends_on: [] | ||
| commands: | ||
| - .buildkite/image_build/image_build.sh $REGISTRY $REPO $BUILDKITE_COMMIT $BRANCH $VLLM_USE_PRECOMPILED $VLLM_MERGE_BASE_COMMIT $CACHE_FROM $CACHE_TO | ||
| env: | ||
| DOCKER_BUILDKIT: "1" | ||
| Review comment (Contributor): This can be removed for the default docker build since we switched to buildx. | ||
| retry: | ||
| automatic: | ||
| - exit_status: -1 # Agent was lost | ||
| limit: 2 | ||
| - exit_status: -10 # Agent was lost | ||
| limit: 2 | ||
| | ||
| - label: ":docker: Build CPU image" | ||
| key: image-build-cpu | ||
| depends_on: [] | ||
| commands: | ||
| - .buildkite/image_build/image_build_cpu.sh $REGISTRY $REPO $BUILDKITE_COMMIT | ||
| env: | ||
| DOCKER_BUILDKIT: "1" | ||
| retry: | ||
| automatic: | ||
| - exit_status: -1 # Agent was lost | ||
| limit: 2 | ||
| - exit_status: -10 # Agent was lost | ||
| limit: 2 | ||
| | ||
| - label: ":docker: Build HPU image" | ||
| soft_fail: true | ||
| depends_on: [] | ||
| key: image-build-hpu | ||
| commands: | ||
| - .buildkite/image_build/image_build_hpu.sh $REGISTRY $REPO $BUILDKITE_COMMIT | ||
| env: | ||
| DOCKER_BUILDKIT: "1" | ||
| retry: | ||
| automatic: | ||
| - exit_status: -1 # Agent was lost | ||
| limit: 2 | ||
| - exit_status: -10 # Agent was lost | ||
| limit: 2 | ||
| | ||
| - label: ":docker: Build CPU arm64 image" | ||
| key: cpu-arm64-image-build | ||
| depends_on: [] | ||
| optional: true | ||
| commands: | ||
| - .buildkite/image_build/image_build_cpu_arm64.sh $REGISTRY $REPO $BUILDKITE_COMMIT | ||
| env: | ||
| DOCKER_BUILDKIT: "1" | ||
| retry: | ||
| automatic: | ||
| - exit_status: -1 # Agent was lost | ||
| limit: 2 | ||
| - exit_status: -10 # Agent was lost | ||
| limit: 2 | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,36 @@ | ||
| #!/bin/bash | ||
| set -e | ||
| | ||
| if [[ $# -lt 3 ]]; then | ||
| echo "Usage: $0 <registry> <repo> <commit>" | ||
| exit 1 | ||
| fi | ||
| | ||
| REGISTRY=$1 | ||
| REPO=$2 | ||
| BUILDKITE_COMMIT=$3 | ||
| | ||
| # authenticate with AWS ECR | ||
| aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin $REGISTRY | ||
| | ||
| # skip build if image already exists | ||
| if [[ -z $(docker manifest inspect $REGISTRY/$REPO:$BUILDKITE_COMMIT-cpu) ]]; then | ||
| echo "Image not found, proceeding with build..." | ||
| else | ||
| echo "Image found" | ||
| exit 0 | ||
| fi | ||
| | ||
| # build | ||
| docker build --file docker/Dockerfile.cpu \ | ||
| --build-arg max_jobs=16 \ | ||
| --build-arg buildkite_commit=$BUILDKITE_COMMIT \ | ||
| --build-arg VLLM_CPU_AVX512BF16=true \ | ||
| --build-arg VLLM_CPU_AVX512VNNI=true \ | ||
| --build-arg VLLM_CPU_AMXBF16=true \ | ||
| --tag $REGISTRY/$REPO:$BUILDKITE_COMMIT-cpu \ | ||
| --target vllm-test \ | ||
| --progress plain . | ||
| | ||
| # push | ||
| docker push $REGISTRY/$REPO:$BUILDKITE_COMMIT-cpu |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,33 @@ | ||
| #!/bin/bash | ||
| set -e | ||
| | ||
| if [[ $# -lt 3 ]]; then | ||
| echo "Usage: $0 <registry> <repo> <commit>" | ||
| exit 1 | ||
| fi | ||
| | ||
| REGISTRY=$1 | ||
| REPO=$2 | ||
| BUILDKITE_COMMIT=$3 | ||
| | ||
| # authenticate with AWS ECR | ||
| aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin $REGISTRY | ||
| | ||
| # skip build if image already exists | ||
| if [[ -z $(docker manifest inspect $REGISTRY/$REPO:$BUILDKITE_COMMIT-cpu) ]]; then | ||
| echo "Image not found, proceeding with build..." | ||
| else | ||
| echo "Image found" | ||
| exit 0 | ||
| fi | ||
| | ||
| # build | ||
| docker build --file docker/Dockerfile.cpu \ | ||
| --build-arg max_jobs=16 \ | ||
| --build-arg buildkite_commit=$BUILDKITE_COMMIT \ | ||
| --tag $REGISTRY/$REPO:$BUILDKITE_COMMIT-cpu \ | ||
| --target vllm-test \ | ||
| --progress plain . | ||
| | ||
| # push | ||
| docker push $REGISTRY/$REPO:$BUILDKITE_COMMIT-cpu |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,34 @@ | ||
| #!/bin/bash | ||
| set -e | ||
| | ||
| if [[ $# -lt 3 ]]; then | ||
| echo "Usage: $0 <registry> <repo> <commit>" | ||
| exit 1 | ||
| fi | ||
| | ||
| REGISTRY=$1 | ||
| REPO=$2 | ||
| BUILDKITE_COMMIT=$3 | ||
| | ||
| # authenticate with AWS ECR | ||
| aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin $REGISTRY | ||
|
| | ||
| if [[ -z $(docker manifest inspect $REGISTRY/$REPO:$BUILDKITE_COMMIT-hpu) ]]; then | ||
| echo "Image not found, proceeding with build..." | ||
| else | ||
| echo "Image found" | ||
| exit 0 | ||
| fi | ||
| | ||
| # build | ||
| docker build \ | ||
| --file tests/pytorch_ci_hud_benchmark/Dockerfile.hpu \ | ||
| --build-arg max_jobs=16 \ | ||
| --build-arg buildkite_commit=$BUILDKITE_COMMIT \ | ||
| --tag $REGISTRY/$REPO:$BUILDKITE_COMMIT-hpu \ | ||
| --progress plain \ | ||
| https://github.com/vllm-project/vllm-gaudi.git | ||
| | ||
| # push | ||
| docker push $REGISTRY/$REPO:$BUILDKITE_COMMIT-hpu |
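The CPU and HPU build scripts above repeat the same "skip build if image already exists" check, and each one tests whether `docker manifest inspect` printed anything. A sketch of a shared helper that checks the exit status instead, assuming `docker manifest inspect` returns a non-zero status when the tag is missing; `image_exists` and the overridable `INSPECT_CMD` variable are hypothetical and not part of this PR:

```shell
#!/bin/bash
# Hypothetical shared helper for the image_build_*.sh scripts.
# INSPECT_CMD can be overridden so the logic can be exercised without Docker.
INSPECT_CMD=${INSPECT_CMD:-"docker manifest inspect"}

image_exists() {
  local image="$1"
  # Rely on the exit status rather than on the command printing nothing.
  # INSPECT_CMD is intentionally unquoted so it splits into command + args.
  $INSPECT_CMD "$image" >/dev/null 2>&1
}

# Intended use inside a build script (not executed here):
#   if image_exists "$REGISTRY/$REPO:$BUILDKITE_COMMIT-cpu"; then
#     echo "Image found"
#     exit 0
#   fi
#   echo "Image not found, proceeding with build..."
```

Keeping the check in one function would let the per-platform scripts differ only in their tag suffix and build arguments.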
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| group: Attention | ||
| depends_on: | ||
| - image-build | ||
| steps: | ||
| - label: V1 attention (H100) | ||
| timeout_in_minutes: 30 | ||
| gpu: h100 | ||
| source_file_dependencies: | ||
| - vllm/v1/attention | ||
| - tests/v1/attention | ||
| commands: | ||
| - pytest -v -s v1/attention | ||
| | ||
| - label: V1 attention (B200) | ||
| timeout_in_minutes: 30 | ||
| gpu: b200 | ||
| source_file_dependencies: | ||
| - vllm/v1/attention | ||
| - tests/v1/attention | ||
| commands: | ||
| - VLLM_DISABLE_FLASHINFER_PREFILL=1 pytest -v -s v1/attention # TODO: FI prefill is bugged and causes incorrectness, fix this |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,16 @@ | ||
| group: Basic Correctness | ||
| depends_on: | ||
| - image-build | ||
| steps: | ||
| - label: Basic Correctness | ||
| timeout_in_minutes: 30 | ||
| source_file_dependencies: | ||
| - vllm/ | ||
| - tests/basic_correctness/test_basic_correctness | ||
| [khluu marked this conversation as resolved.] | ||
| - tests/basic_correctness/test_cpu_offload | ||
| - tests/basic_correctness/test_cumem.py | ||
| commands: | ||
| - export VLLM_WORKER_MULTIPROC_METHOD=spawn | ||
| - pytest -v -s basic_correctness/test_cumem.py | ||
| - pytest -v -s basic_correctness/test_basic_correctness.py | ||
| - pytest -v -s basic_correctness/test_cpu_offload.py | ||
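Entries such as `tests/basic_correctness/test_basic_correctness` (without a `.py` suffix) suggest that `source_file_dependencies` are treated as path prefixes, although the diff does not state this. A sketch of per-step selection under that assumption; `step_should_run` is a hypothetical helper and not part of this PR:

```shell
#!/bin/bash
# Hypothetical helper: returns 0 if any changed file falls under any of a
# step's source_file_dependencies. Assumption: each entry is a path prefix.
# Usage: step_should_run "dep1,dep2,..." changed_file [changed_file ...]
step_should_run() {
  local deps_csv="$1"
  shift
  local -a deps
  IFS=',' read -r -a deps <<< "$deps_csv"
  local changed dep
  for changed in "$@"; do
    for dep in "${deps[@]}"; do
      [[ "$changed" == "$dep"* ]] && return 0
    done
  done
  return 1
}
```

In practice the changed files might come from something like `git diff --name-only` against the merge base, but how the pipeline generator obtains them is not shown in this PR.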
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,19 @@ | ||
| group: Benchmarks | ||
| depends_on: | ||
| - image-build | ||
| steps: | ||
| - label: Benchmarks | ||
| timeout_in_minutes: 20 | ||
| working_dir: "/vllm-workspace/.buildkite" | ||
| source_file_dependencies: | ||
| - benchmarks/ | ||
| commands: | ||
| - bash scripts/run-benchmarks.sh | ||
| | ||
| - label: Benchmarks CLI Test | ||
| timeout_in_minutes: 20 | ||
| source_file_dependencies: | ||
| - vllm/ | ||
| - tests/benchmarks/ | ||
| commands: | ||
| - pytest -v -s benchmarks/ |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,57 @@ | ||
| group: Compile | ||
| depends_on: | ||
| - image-build | ||
| steps: | ||
| - label: Fusion and Compile Tests (B200) | ||
| timeout_in_minutes: 40 | ||
| working_dir: "/vllm-workspace/" | ||
| gpu: b200 | ||
| source_file_dependencies: | ||
| - csrc/quantization/fp4/ | ||
| - vllm/model_executor/layers/quantization/utils/flashinfer_utils.py | ||
| - vllm/v1/attention/backends/flashinfer.py | ||
| - vllm/v1/worker/ | ||
| - vllm/v1/cudagraph_dispatcher.py | ||
| - vllm/compilation/ | ||
| # can affect pattern matching | ||
| - vllm/model_executor/layers/layernorm.py | ||
| - vllm/model_executor/layers/activation.py | ||
| - vllm/model_executor/layers/quantization/input_quant_fp8.py | ||
| - tests/compile/test_fusion_attn.py | ||
| - tests/compile/test_silu_mul_quant_fusion.py | ||
| - tests/compile/distributed/test_fusion_all_reduce.py | ||
| - tests/compile/distributed/test_fusions_e2e.py | ||
| - tests/compile/fullgraph/test_full_graph.py | ||
| commands: | ||
| - nvidia-smi | ||
| - pytest -v -s tests/compile/test_fusion_attn.py | ||
| - pytest -v -s tests/compile/test_silu_mul_quant_fusion.py | ||
| # this runner has 2 GPUs available even though num_gpus=2 is not set | ||
| - pytest -v -s tests/compile/distributed/test_fusion_all_reduce.py | ||
| # Limit to Inductor partition, no custom ops, and allreduce & attn fusion to reduce running time | ||
| # Wrap with quotes to escape yaml | ||
| - "pytest -v -s tests/compile/distributed/test_fusions_e2e.py::test_tp2_attn_quant_allreduce_rmsnorm -k 'True and not +quant_fp8 and not +rms_norm'" | ||
| # test_fp8_kv_scale_compile requires FlashAttention (not supported on default L4/L40) | ||
| - pytest -v -s tests/compile/fullgraph/test_full_graph.py::test_fp8_kv_scale_compile | ||
| | ||
| - label: Fusion E2E (2 GPUs)(B200) | ||
| timeout_in_minutes: 40 | ||
| working_dir: "/vllm-workspace/" | ||
| gpu: b200 | ||
| optional: true | ||
| num_gpus: 2 | ||
| source_file_dependencies: | ||
| - csrc/quantization/fp4/ | ||
| - vllm/model_executor/layers/quantization/utils/flashinfer_utils.py | ||
| - vllm/v1/attention/backends/flashinfer.py | ||
| - vllm/compilation/ | ||
| # can affect pattern matching | ||
| - vllm/model_executor/layers/layernorm.py | ||
| - vllm/model_executor/layers/activation.py | ||
| - vllm/model_executor/layers/quantization/input_quant_fp8.py | ||
| - tests/compile/distributed/test_fusions_e2e.py | ||
| commands: | ||
| - nvidia-smi | ||
| # Run all e2e fusion tests | ||
| - pytest -v -s tests/compile/distributed/test_fusions_e2e.py | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| group: CUDA | ||
| depends_on: | ||
| - image-build | ||
| steps: | ||
| - label: Platform Tests (CUDA) | ||
| timeout_in_minutes: 15 | ||
| source_file_dependencies: | ||
| - vllm/ | ||
| - tests/cuda | ||
| commands: | ||
| - pytest -v -s cuda/test_cuda_context.py | ||
| | ||
| - label: Cudagraph | ||
| timeout_in_minutes: 20 | ||
| source_file_dependencies: | ||
| - tests/v1/cudagraph | ||
| - vllm/v1/cudagraph_dispatcher.py | ||
| - vllm/config/compilation.py | ||
| - vllm/compilation | ||
| commands: | ||
| - pytest -v -s v1/cudagraph/test_cudagraph_dispatch.py | ||
| - pytest -v -s v1/cudagraph/test_cudagraph_mode.py |
typo?
It's intentional, so that this group shows up at the top. I will fix it with a proper `priority: 1` or `order: 1` key in the future.
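The reply about the `Abuild` group name can be illustrated with a small sort. This sketch assumes the pipeline generator orders groups alphabetically by name, which the thread implies but does not spell out; `sorted_groups` is a hypothetical helper using group names taken from this PR:

```shell
#!/bin/bash
# Sorts some group names from this PR bytewise to show why "Abuild" lands
# ahead of "Attention". Assumption: groups are ordered alphabetically by name.
sorted_groups() {
  printf '%s\n' "Attention" "Basic Correctness" "Benchmarks" "Compile" "CUDA" "Abuild" \
    | LC_ALL=C sort
}

sorted_groups
```

With a plain `group: build`, the image build group would sort after `Benchmarks`, which is presumably why the author plans to replace the naming trick with an explicit ordering key.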