Add support for configurable routers and gateways #12

@anfredette

Description

The current deployment architecture assumes a specific routing/gateway setup. We should support multiple router/gateway options to match different production environments.

Acceptance Criteria

  • Identify common router/gateway options (e.g., Istio, NGINX, Envoy, KServe predictor)
  • Define configuration schema for router selection
  • Create deployment templates for each router type
  • Add UI option to select router/gateway during deployment
  • Document routing architecture for each option
  • Test with KV cache-aware routing (see work item #11, "Add llm-d as deployment target alongside KServe/vLLM")
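
As a starting point for the schema and template criteria above, here is a minimal sketch of what a router selection config could look like. Everything in it is an assumption for discussion: the field names (`type`, `replicas`, `kv_cache_aware`, `options`), the `RouterType` values, and the template paths are hypothetical and do not reflect anything that exists in the repo today.

```python
from dataclasses import dataclass, field
from enum import Enum


class RouterType(str, Enum):
    """Router/gateway options named in the acceptance criteria (identifiers are hypothetical)."""
    ISTIO = "istio"
    NGINX = "nginx"
    ENVOY = "envoy"
    KSERVE_PREDICTOR = "kserve-predictor"


@dataclass
class RouterConfig:
    """Hypothetical schema for the router section of a deployment config."""
    router_type: RouterType
    replicas: int = 1
    kv_cache_aware: bool = False
    options: dict = field(default_factory=dict)  # router-specific settings, passed through as-is

    def __post_init__(self):
        if self.replicas < 1:
            raise ValueError("replicas must be >= 1")


# Hypothetical template locations, one per router type.
TEMPLATES = {
    RouterType.ISTIO: "templates/router/istio.yaml",
    RouterType.NGINX: "templates/router/nginx.yaml",
    RouterType.ENVOY: "templates/router/envoy.yaml",
    RouterType.KSERVE_PREDICTOR: "templates/router/kserve-predictor.yaml",
}


def parse_router_config(raw: dict) -> RouterConfig:
    """Validate a raw dict (e.g. loaded from YAML/JSON) and build a RouterConfig."""
    if "type" not in raw:
        raise ValueError("router config is missing required key 'type'")
    try:
        router_type = RouterType(raw["type"])
    except ValueError:
        supported = ", ".join(t.value for t in RouterType)
        raise ValueError(
            f"unsupported router type {raw['type']!r}; expected one of: {supported}"
        ) from None
    return RouterConfig(
        router_type=router_type,
        replicas=int(raw.get("replicas", 1)),
        kv_cache_aware=bool(raw.get("kv_cache_aware", False)),
        options=dict(raw.get("options", {})),
    )


def template_for(config: RouterConfig) -> str:
    """Return the deployment template path for the selected router."""
    return TEMPLATES[config.router_type]
```

With this shape, the UI option would only need to emit a `type` string plus optional fields, and an unknown router fails validation before any template is rendered.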

Notes

  • Routing impacts latency, throughput, and KV cache efficiency
  • Multi-replica deployments should support KV cache-aware routing (an llm-d feature)
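
To illustrate why routing affects KV cache efficiency, here is a rough sketch of prefix-affinity routing: requests that share a prompt prefix are sent to the same replica, so that replica is more likely to have the prefix's KV cache warm. This is not how llm-d implements it. It is a simplified stand-in using rendezvous hashing, and it ignores replica load and actual cache state. The function name, the `prefix_chars` parameter, and the replica names are all hypothetical.

```python
import hashlib


def pick_replica(prompt: str, replicas: list[str], prefix_chars: int = 64) -> str:
    """Pick a replica by rendezvous hashing on the first `prefix_chars` characters of the prompt.

    Prompts with the same prefix always map to the same replica, and removing
    one replica only remaps the prompts that were assigned to it.
    """
    if not replicas:
        raise ValueError("at least one replica is required")
    prefix = prompt[:prefix_chars]

    def score(replica: str) -> int:
        # Hash the (replica, prefix) pair; the replica with the highest score wins.
        digest = hashlib.sha256(f"{replica}|{prefix}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    return max(replicas, key=score)


# Usage: two requests sharing a long system prompt land on the same replica.
system_prompt = "You are a helpful assistant for the ACME support desk. " * 2
replicas = ["vllm-0", "vllm-1", "vllm-2"]
first = pick_replica(system_prompt + "How do I reset my password?", replicas)
second = pick_replica(system_prompt + "Where is my invoice?", replicas)
```

A plain round-robin gateway would spread these two requests across replicas and recompute the shared prefix on each one, which is the latency and throughput cost the notes above refer to.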

Metadata

Assignees

No one assigned

    Labels

    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
