
[bug] Detectron2 errors when installing on PyTorch DLC #1782

@austinmw

Description

Checklist

Concise Description:

Detectron2 fails at import time when installed on top of the pytorch-training container. The failure appears to be related to smdebug.

How to reproduce:

> nvidia-docker run -it 763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-training:1.10.2-gpu-py38-cu113-ubuntu20.04-sagemaker
> pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
> python -c "from detectron2 import model_zoo"

DLC image/dockerfile:

763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-training:1.10.2-gpu-py38-cu113-ubuntu20.04-sagemaker

Current behavior:

Traceback:

root@fe0954d71a8e:/# python -c "from detectron2 import model_zoo"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/conda/lib/python3.8/site-packages/detectron2/model_zoo/__init__.py", line 8, in <module>
    from .model_zoo import get, get_config_file, get_checkpoint_url, get_config
  File "/opt/conda/lib/python3.8/site-packages/detectron2/model_zoo/model_zoo.py", line 9, in <module>
    from detectron2.modeling import build_model
  File "/opt/conda/lib/python3.8/site-packages/detectron2/modeling/__init__.py", line 2, in <module>
    from detectron2.layers import ShapeSpec
  File "/opt/conda/lib/python3.8/site-packages/detectron2/layers/__init__.py", line 2, in <module>
    from .batch_norm import FrozenBatchNorm2d, get_norm, NaiveSyncBatchNorm, CycleBatchNormList
  File "/opt/conda/lib/python3.8/site-packages/detectron2/layers/batch_norm.py", line 4, in <module>
    from fvcore.nn.distributed import differentiable_all_reduce
  File "/opt/conda/lib/python3.8/site-packages/fvcore/nn/__init__.py", line 4, in <module>
    from .focal_loss import (
  File "/opt/conda/lib/python3.8/site-packages/fvcore/nn/focal_loss.py", line 52, in <module>
    sigmoid_focal_loss_jit: "torch.jit.ScriptModule" = torch.jit.script(sigmoid_focal_loss)
  File "/opt/conda/lib/python3.8/site-packages/torch/jit/_script.py", line 1310, in script
    fn = torch._C._jit_script_compile(
  File "/opt/conda/lib/python3.8/site-packages/torch/jit/_recursive.py", line 838, in try_compile_fn
    return torch.jit.script(fn, _rcb=rcb)
  File "/opt/conda/lib/python3.8/site-packages/torch/jit/_script.py", line 1310, in script
    fn = torch._C._jit_script_compile(
RuntimeError:
undefined value has_torch_function_variadic:
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/smdebug.py", line 2962
        >>> loss.backward()
        """
        if has_torch_function_variadic(input, target, weight, pos_weight):
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
            return handle_torch_function(
                binary_cross_entropy_with_logits,
'binary_cross_entropy_with_logits' is being compiled since it was called from 'sigmoid_focal_loss'
  File "/opt/conda/lib/python3.8/site-packages/fvcore/nn/focal_loss.py", line 36
        targets = targets.float()
        p = torch.sigmoid(inputs)
        ce_loss = F.binary_cross_entropy_with_logits(inputs, targets, reduction="none")
                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
        p_t = p * targets + (1 - p) * (1 - targets)
        loss = ce_loss * ((1 - p_t) ** gamma)
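The traceback above looks like an instance of a general TorchScript failure mode: `torch.jit.script` raises `undefined value ...` when a scripted function references a global name the compiler cannot resolve. A minimal sketch (my own illustration, not code from this image — it assumes only that torch is installed):

```python
import torch

# Hypothetical illustration: scripting a function that calls a name the
# TorchScript compiler cannot resolve fails the same way as the smdebug-patched
# binary_cross_entropy_with_logits does with has_torch_function_variadic.
def calls_unknown(x):
    return some_missing_helper(x)  # deliberately undefined

try:
    torch.jit.script(calls_unknown)
except RuntimeError as e:
    # The compiler reports the unresolved global, e.g.
    # "undefined value some_missing_helper"
    print("undefined value" in str(e))
```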

Expected behavior:

No error on import
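One possible workaround — an assumption based on the traceback (that the conflict comes from smdebug's patched copies of torch functions), not verified on this DLC image:

```shell
# Hypothetical workaround: remove smdebug so its torch patches are not
# applied, then retry the import. Not verified on this image.
pip uninstall -y smdebug
python -c "from detectron2 import model_zoo"
```

Note that uninstalling smdebug disables SageMaker Debugger hooks for the container, so this trades debugger functionality for a working detectron2 import.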
