[bug] Torch does not find GPU on pytorch-training:1.10.0-gpu-py38 container  #1738

@sergii-ivakhno-kidsloop

Description

Concise Description:

torch.cuda.is_available() returns False on a GPU instance inside the official SageMaker PyTorch training container.

DLC image/dockerfile:

sudo docker pull 763104351884.dkr.ecr.eu-west-2.amazonaws.com/pytorch-training:1.10.0-gpu-py38-cu113-ubuntu20.04-sagemaker

Current behavior:

sudo docker pull 763104351884.dkr.ecr.eu-west-2.amazonaws.com/pytorch-training:1.10.0-gpu-py38-cu113-ubuntu20.04-sagemaker
sudo docker run -it --entrypoint /bin/bash 709fa9395949
python -c "import torch; print(torch.cuda.is_available())" -> False

Expected behavior:

python -c "import torch; print(torch.cuda.is_available())" -> True

Additional context:

The same outcome is seen on a SageMaker Notebook instance (ml.p3.2xlarge, docker pull from console) and on an EC2 p3.2xlarge instance.
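
To help narrow down a report like this, a minimal diagnostic sketch that could be run inside the container is shown below. It is not part of the original report. The helper name collect_gpu_diagnostics is hypothetical, and the checks for nvidia-smi on PATH and for /dev/nvidia* device nodes assume the usual NVIDIA container runtime layout.

```python
import glob
import importlib
import shutil


def collect_gpu_diagnostics():
    """Gather basic facts about GPU visibility in the current environment.

    Hypothetical helper for illustration; it only reads state and
    tolerates torch being absent.
    """
    info = {
        # Assumption: nvidia-smi is on PATH when the driver is exposed.
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        # Assumption: GPUs show up as /dev/nvidia* device nodes.
        "nvidia_device_nodes": sorted(glob.glob("/dev/nvidia*")),
        "torch_version": None,
        "torch_cuda_build": None,
        "cuda_available": None,
        "device_count": None,
    }
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # torch is not installed; return only the system-level checks.
        return info

    info["torch_version"] = torch.__version__
    # CUDA version torch was built against (None for CPU-only builds).
    info["torch_cuda_build"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    info["device_count"] = torch.cuda.device_count()
    return info


if __name__ == "__main__":
    for key, value in collect_gpu_diagnostics().items():
        print(f"{key}: {value}")
```

If the script reports a CUDA build for torch but no device nodes, one thing worth ruling out is that the container was started without GPU access. With Docker 19.03 or later and the NVIDIA Container Toolkit installed on the host, GPU access is normally requested with `docker run --gpus all ...`, which the `docker run` command above does not pass.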
