
Add GPU memory requirements to model catalog #8

@anfredette

Description

The model catalog should include GPU memory requirements for each model to support capacity planning and feasibility checks.

Acceptance Criteria

  • Calculate or document GPU memory requirements for each model in the catalog
  • Account for tensor parallelism configurations (e.g., memory per GPU for TP=2)
  • Add gpu_memory_gb field to data/model_catalog.json
  • Update the Recommendation Engine to filter out infeasible GPU types based on memory constraints (a feasibility sketch follows this list)
  • Validate memory estimates against real deployments
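A minimal sketch of how the gpu_memory_gb field and the feasibility filter could fit together. Only gpu_memory_gb is called for by this issue; the surrounding field names, the per-TP keying, the memory values, and the GPU capacity map are illustrative assumptions, not the final schema.

```python
# Hypothetical catalog entry shape; field names other than gpu_memory_gb
# and all numeric values are illustrative assumptions.
catalog_entry = {
    "name": "llama-3.1-70b-instruct",
    "parameters_b": 70,
    "quantization": "fp16",
    "gpu_memory_gb": {   # assumed: required memory per GPU, keyed by TP degree
        "1": 157.0,
        "2": 82.0,
        "4": 44.0,
    },
}

# Assumed GPU memory capacities in GB for the filter.
GPU_MEMORY_GB = {"L4": 24, "A100-80G": 80, "H100": 80}

def feasible_gpus(entry: dict, tp: int) -> list[str]:
    """Return GPU types whose memory can hold the model at the given TP degree."""
    required = entry["gpu_memory_gb"].get(str(tp))
    if required is None:
        return []
    return [gpu for gpu, capacity in GPU_MEMORY_GB.items() if capacity >= required]

# With the illustrative numbers above, only the 80 GB GPUs pass at TP=4.
print(feasible_gpus(catalog_entry, tp=4))
```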

Notes

  • Memory requirements vary by model size, quantization, and tensor parallelism
  • This prevents recommending invalid configurations (e.g., a 70B model on a single L4); a rough sizing sketch follows this list
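A rough sizing sketch showing how model size, quantization, and tensor parallelism combine. The 20% overhead factor standing in for KV cache, activations, and framework buffers is an assumption, not a measured value; per the acceptance criteria, real estimates should be validated against deployments.

```python
# Bytes per parameter for common quantization levels.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def estimate_gpu_memory_gb(params_b: float, quantization: str, tp: int = 1,
                           overhead: float = 0.2) -> float:
    """Estimate GPU memory needed per GPU: weights plus a fixed overhead, split across TP."""
    weights_gb = params_b * BYTES_PER_PARAM[quantization]  # ~1 GB per billion params per byte
    return weights_gb * (1 + overhead) / tp

# A 70B fp16 model needs roughly 168 GB at TP=1 -- far beyond a 24 GB L4,
# which is exactly the invalid configuration this issue is meant to rule out.
print(round(estimate_gpu_memory_gb(70, "fp16", tp=1)))  # ~168
print(round(estimate_gpu_memory_gb(70, "fp16", tp=4)))  # ~42
```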
