11 changes: 11 additions & 0 deletions examples/aws/eks-private/README.md
@@ -34,6 +34,17 @@ terraform apply
If you are using a `tfvars` file, you will need to update the above commands accordingly.
Note the output from Terraform which includes an example cloud registration command you will use below.

#### Using Additional GPU Instance Types

The default configuration creates node groups for T4 GPUs only. To use additional GPU types (A10G, L4, etc.), pass the provided example file, which defines additional GPU configurations:

```shell
terraform plan -var-file="gpu_instances.tfvars.example"
terraform apply -var-file="gpu_instances.tfvars.example"
```

Node groups will be created automatically for each GPU type defined in `gpu_instance_types`.

### Install the Kubernetes Requirements

The Anyscale Operator requires the following components:
21 changes: 6 additions & 15 deletions examples/aws/eks-private/eks.tf
@@ -27,16 +27,7 @@ locals {
)

# Map of GPU types to their product names and instance types
-gpu_types = {
-"T4" = {
-product_name = "Tesla-T4"
-instance_types = ["g4dn.4xlarge"]
-}
-"A10G" = {
-product_name = "NVIDIA-A10G"
-instance_types = ["g5.4xlarge"]
-}
-}
+gpu_types = var.gpu_instance_types

# Base configuration for GPU node groups
gpu_node_group_base = {
@@ -76,9 +67,9 @@ locals {
}
])

-# Create a map of GPU node groups based on node_group_gpu_types
+# Create a map of GPU node groups based on gpu_instance_types
gpu_node_groups = {
-for gpu_type in var.node_group_gpu_types : gpu_type => {
+for gpu_type in keys(var.gpu_instance_types) : gpu_type => {
ondemand = merge(
local.gpu_node_group_base,
{
@@ -222,12 +213,12 @@ module "eks" {
iam_role_additional_policies = local.anyscale_iam
}
},
-# Merge in GPU node groups based on node_group_gpu_types
+# Merge in GPU node groups based on gpu_instance_types
{
-for gpu_type in var.node_group_gpu_types : "ondemand_gpu_${lower(gpu_type)}" => local.gpu_node_groups[gpu_type].ondemand
+for gpu_type in keys(var.gpu_instance_types) : "ondemand_gpu_${lower(gpu_type)}" => local.gpu_node_groups[gpu_type].ondemand
},
{
-for gpu_type in var.node_group_gpu_types : "spot_gpu_${lower(gpu_type)}" => local.gpu_node_groups[gpu_type].spot
+for gpu_type in keys(var.gpu_instance_types) : "spot_gpu_${lower(gpu_type)}" => local.gpu_node_groups[gpu_type].spot
}
}
)
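
The naming scheme in the `for` expressions above can be illustrated in isolation. The following is a sketch, not the module's actual configuration: the local name `ondemand_group_names` and the output are hypothetical, and each node group is reduced to its `instance_types` list because the real node group bodies are not shown in this diff.

```hcl
variable "gpu_instance_types" {
  type = map(object({
    product_name   = string
    instance_types = list(string)
  }))
  default = {
    "T4" = {
      product_name   = "Tesla-T4"
      instance_types = ["g4dn.4xlarge"]
    }
    "L4-4x" = {
      product_name   = "NVIDIA-L4"
      instance_types = ["g6.24xlarge"]
    }
  }
}

locals {
  # One entry per map key; the key is lowercased to build the group name.
  # The value here is a stand-in for the real node group definition.
  ondemand_group_names = {
    for gpu_type in keys(var.gpu_instance_types) :
    "ondemand_gpu_${lower(gpu_type)}" => var.gpu_instance_types[gpu_type].instance_types
  }
}

output "ondemand_group_names" {
  # With the defaults above, the keys are "ondemand_gpu_l4-4x" and "ondemand_gpu_t4".
  value = local.ondemand_group_names
}
```

Because the key is passed through `lower()`, two entries whose names differ only in case (for example `"T4"` and `"t4"`) would map to the same node group name, which Terraform rejects as a duplicate key in the resulting map.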

31 changes: 31 additions & 0 deletions examples/aws/eks-private/gpu_instances.tfvars.example
@@ -0,0 +1,31 @@
# GPU Instance Types for EKS
#
# This file contains additional GPU instance configurations beyond the default (T4).
# Use this file directly or copy and modify as needed:
#
# terraform plan -var-file="gpu_instances.tfvars.example"
# terraform apply -var-file="gpu_instances.tfvars.example"

# GPU types configuration - node groups will be created for each entry
gpu_instance_types = {
"T4" = {
product_name = "Tesla-T4"
instance_types = ["g4dn.xlarge", "g4dn.2xlarge", "g4dn.4xlarge"]
}
"T4-4x" = {
product_name = "Tesla-T4"
instance_types = ["g4dn.12xlarge"]
}
"A10G" = {
product_name = "NVIDIA-A10G"
instance_types = ["g5.4xlarge"]
}
"L4" = {
product_name = "NVIDIA-L4"
instance_types = ["g6.2xlarge", "g6.4xlarge"]
}
"L4-4x" = {
product_name = "NVIDIA-L4"
instance_types = ["g6.24xlarge"]
}
}
32 changes: 27 additions & 5 deletions examples/aws/eks-private/variables.tf
@@ -78,13 +78,35 @@ variable "eks_cluster_version" {
default = "1.32"
}

-variable "node_group_gpu_types" {
+variable "gpu_instance_types" {
description = <<-EOT
-(Optional) The GPU types of the EKS nodes.
-Possible values: ["T4", "A10G"]
+(Optional) GPU types configuration for the EKS cluster.
+See gpu_instances.tfvars.example for additional GPU types.
+
+ex:
+```
+gpu_instance_types = {
+"T4" = {
+product_name = "Tesla-T4"
+instance_types = ["g4dn.xlarge", "g4dn.2xlarge", "g4dn.4xlarge"]
+}
+"A10G" = {
+product_name = "NVIDIA-A10G"
+instance_types = ["g5.4xlarge"]
+}
+}
+```
EOT
-type = list(string)
-default = ["T4"]
+type = map(object({
+product_name = string
+instance_types = list(string)
+}))
+default = {
+"T4" = {
+product_name = "Tesla-T4"
+instance_types = ["g4dn.4xlarge"]
+}
+}
}
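
Since the variable is renamed and its type changes from `list(string)` to a map of objects, an existing var file that sets `node_group_gpu_types` would presumably need to be rewritten. A minimal migration sketch, assuming the old selection was `["T4", "A10G"]` and that the instance types previously hardcoded in `eks.tf` are still wanted:

```hcl
# Before (old variable, a list of GPU type names):
# node_group_gpu_types = ["T4", "A10G"]

# After (new variable, each entry carries its own settings):
gpu_instance_types = {
  "T4" = {
    product_name   = "Tesla-T4"
    instance_types = ["g4dn.4xlarge"]
  }
  "A10G" = {
    product_name   = "NVIDIA-A10G"
    instance_types = ["g5.4xlarge"]
  }
}
```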

variable "enable_efs" {
11 changes: 11 additions & 0 deletions examples/aws/eks-public/README.md
@@ -33,6 +33,17 @@ terraform apply
If you are using a `tfvars` file, you will need to update the above commands accordingly.
Note the output from Terraform which includes an example cloud registration command you will use below.

#### Using Additional GPU Instance Types

The default configuration creates node groups for T4 GPUs only. To use additional GPU types (A10G, L4, etc.), pass the provided example file, which defines additional GPU configurations:

```shell
terraform plan -var-file="gpu_instances.tfvars.example"
terraform apply -var-file="gpu_instances.tfvars.example"
```

Node groups will be created automatically for each GPU type defined in `gpu_instance_types`.

### Install the Kubernetes Requirements

The Anyscale Operator requires the following components:
21 changes: 6 additions & 15 deletions examples/aws/eks-public/eks.tf
@@ -27,16 +27,7 @@ locals {
)

# Map of GPU types to their product names and instance types
-gpu_types = {
-"T4" = {
-product_name = "Tesla-T4"
-instance_types = ["g4dn.4xlarge"]
-}
-"A10G" = {
-product_name = "NVIDIA-A10G"
-instance_types = ["g5.4xlarge"]
-}
-}
+gpu_types = var.gpu_instance_types

# Base configuration for GPU node groups
gpu_node_group_base = {
@@ -76,9 +67,9 @@ locals {
}
])

-# Create a map of GPU node groups based on node_group_gpu_types
+# Create a map of GPU node groups based on gpu_instance_types
gpu_node_groups = {
-for gpu_type in var.node_group_gpu_types : gpu_type => {
+for gpu_type in keys(var.gpu_instance_types) : gpu_type => {
ondemand = merge(
local.gpu_node_group_base,
{
@@ -222,12 +213,12 @@ module "eks" {
iam_role_additional_policies = local.anyscale_iam
}
},
-# Merge in GPU node groups based on node_group_gpu_types
+# Merge in GPU node groups based on gpu_instance_types
{
-for gpu_type in var.node_group_gpu_types : "ondemand_gpu_${lower(gpu_type)}" => local.gpu_node_groups[gpu_type].ondemand
+for gpu_type in keys(var.gpu_instance_types) : "ondemand_gpu_${lower(gpu_type)}" => local.gpu_node_groups[gpu_type].ondemand
},
{
-for gpu_type in var.node_group_gpu_types : "spot_gpu_${lower(gpu_type)}" => local.gpu_node_groups[gpu_type].spot
+for gpu_type in keys(var.gpu_instance_types) : "spot_gpu_${lower(gpu_type)}" => local.gpu_node_groups[gpu_type].spot
}
}
)

31 changes: 31 additions & 0 deletions examples/aws/eks-public/gpu_instances.tfvars.example
@@ -0,0 +1,31 @@
# GPU Instance Types for EKS
#
# This file contains additional GPU instance configurations beyond the default (T4).
# Use this file directly or copy and modify as needed:
#
# terraform plan -var-file="gpu_instances.tfvars.example"
# terraform apply -var-file="gpu_instances.tfvars.example"

# GPU types configuration - node groups will be created for each entry
gpu_instance_types = {
"T4" = {
product_name = "Tesla-T4"
instance_types = ["g4dn.xlarge", "g4dn.2xlarge", "g4dn.4xlarge"]
}
"T4-4x" = {
product_name = "Tesla-T4"
instance_types = ["g4dn.12xlarge"]
}
"A10G" = {
product_name = "NVIDIA-A10G"
instance_types = ["g5.4xlarge"]
}
"L4" = {
product_name = "NVIDIA-L4"
instance_types = ["g6.2xlarge", "g6.4xlarge"]
}
"L4-4x" = {
product_name = "NVIDIA-L4"
instance_types = ["g6.24xlarge"]
}
}
32 changes: 27 additions & 5 deletions examples/aws/eks-public/variables.tf
@@ -78,13 +78,35 @@ variable "eks_cluster_version" {
default = "1.32"
}

-variable "node_group_gpu_types" {
+variable "gpu_instance_types" {
description = <<-EOT
-(Optional) The GPU types of the EKS nodes.
-Possible values: ["T4", "A10G"]
+(Optional) GPU types configuration for the EKS cluster.
+See gpu_instances.tfvars.example for additional GPU types.
+
+ex:
+```
+gpu_instance_types = {
+"T4" = {
+product_name = "Tesla-T4"
+instance_types = ["g4dn.xlarge", "g4dn.2xlarge", "g4dn.4xlarge"]
+}
+"A10G" = {
+product_name = "NVIDIA-A10G"
+instance_types = ["g5.4xlarge"]
+}
+}
+```
EOT
-type = list(string)
-default = ["T4"]
+type = map(object({
+product_name = string
+instance_types = list(string)
+}))
+default = {
+"T4" = {
+product_name = "Tesla-T4"
+instance_types = ["g4dn.4xlarge"]
+}
+}
}

variable "enable_efs" {
11 changes: 11 additions & 0 deletions examples/gcp/gke-new_cluster/README.md
@@ -47,6 +47,17 @@ Steps for deploying Anyscale resources via Terraform:
If you are using a `tfvars` file, you will need to update the above commands accordingly.
Note the output from Terraform which includes an example cloud registration command you will use below.

#### Using Additional GPU Instance Types

The default configuration creates node pools for T4 GPUs only. To use additional GPU types (L4, A100, H100, etc.), pass the provided example file, which defines additional GPU configurations:

```shell
terraform plan -var-file="gpu_instances.tfvars.example"
terraform apply -var-file="gpu_instances.tfvars.example"
```

Node pools will be created automatically for each GPU type defined in `gpu_instance_configs`.

### Install the Kubernetes Requirements

The Anyscale Operator requires the following components: