
Training Script Stuck at 0% with High GPU Utilization #68

@EkSulfur

Description

While running train.py on a specific dataset, the training progress bar stays stuck at 0%. After stepping through with the Python debugger, I found that the script runs extremely slowly and hangs at the following line in mip-splatting/train.py:

ema_loss_for_log = 0.4 * loss.item() + 0.6 * ema_loss_for_log # Line 143

The GPU is consistently at full load during this time, but no progress is made. I’m unable to determine the root cause of this issue and would appreciate any guidance or suggestions.
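I understand that PyTorch launches CUDA kernels asynchronously and that loss.item() forces a GPU-to-CPU synchronization, so the hang reported on this line may actually come from an earlier kernel that is still running. Below is a minimal sketch of how I could time individual steps to localize the real bottleneck; the render_fn/loss_fn/camera/gaussians names are placeholders for illustration, not the actual mip-splatting API:

    import time
    import torch

    def timed(label, fn, *args, **kwargs):
        # Sync before and after so the measured wall-clock time reflects
        # the GPU work of this call alone, not previously queued kernels.
        torch.cuda.synchronize()
        start = time.time()
        out = fn(*args, **kwargs)
        torch.cuda.synchronize()
        print(f"{label}: {time.time() - start:.3f}s")
        return out

    # Placeholder usage inside the training loop:
    # image = timed("render", render_fn, camera, gaussians)
    # loss = timed("loss", loss_fn, image, gt_image)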

Environment Details:

OS: Ubuntu 22.04 (WSL)

CUDA Version: 11.8 (Cuda compilation tools, release 11.8, V11.8.89)

Thank you in advance for your help!
