Neither the paper nor the code mentions the dropout probability or the SGD momentum value used when fine-tuning on ImageNet-1k. Are they the same as in https://arxiv.org/pdf/2010.11929.pdf? Also, could you provide the fine-tuning code for ImageNet-1k? I would like to see how the images are normalized at that stage.