Use of pretrained_window_size

I am confused about the use of `pretrained_window_size` in the `WindowAttention` class.

I got the sense that in SwinV2 the window size can be chosen freely. So if I pre-train my model with an image size of 256 and a window size of 8, and then fine-tune on a downstream task with an image size of 512 and a window size of 32, there should be no need for any special handling of the continuous relative position bias, such as the interpolation we do in ViT.

But we are still doing this:

```
if pretrained_window_size[0] > 0:
    relative_coords_table[:, :, :, 0] /= (pretrained_window_size[0] - 1)
    relative_coords_table[:, :, :, 1] /= (pretrained_window_size[1] - 1)
else:
    relative_coords_table[:, :, :, 0] /= (self.window_size[0] - 1)
    relative_coords_table[:, :, :, 1] /= (self.window_size[1] - 1)
```
Also, this normalization by `pretrained_window_size` is not done in torchvision's implementation of SwinV2; see [here](https://github.com/pytorch/vision/blob/main/torchvision/models/swin_transformer.py#L351-L365). There is no `pretrained_window_size` parameter there either.
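
To make the question concrete, here is a minimal 1-D sketch of what I understand the normalization to do. The function name `log_cpb_coords` is hypothetical, and the steps after the division (scaling by 8 and the sign-preserving `log2` compression) are my assumption about the surrounding code, since they are not part of the snippet above. It uses plain Python instead of tensors so it can be run on its own.

```python
import math


def log_cpb_coords(window_size, pretrained_window_size=0):
    """Return the 1-D log-spaced relative coordinates for one axis (sketch)."""
    # Same branch as the snippet above: choose the normalizer.
    if pretrained_window_size > 0:
        denom = pretrained_window_size - 1
    else:
        denom = window_size - 1

    coords = []
    # Relative offsets along one axis: -(W - 1), ..., 0, ..., W - 1.
    for delta in range(-(window_size - 1), window_size):
        x = delta / denom * 8  # assumption: scale the normalized offset by 8
        # assumption: sign-preserving log2 compression, divided by log2(8)
        y = math.copysign(math.log2(abs(x) + 1.0), x) / math.log2(8)
        coords.append(y)
    return coords


pretrain = log_cpb_coords(8)                              # window 8
finetune_rescaled = log_cpb_coords(32)                    # window 32, own normalizer
finetune_kept = log_cpb_coords(32, pretrained_window_size=8)

print(max(pretrain), max(finetune_rescaled), max(finetune_kept))
```

If this sketch is right, then without `pretrained_window_size` the window-32 coordinates are squeezed into the same range as the window-8 ones, so a given pixel offset maps to a different input than it did during pre-training. With `pretrained_window_size=8`, an offset of 7 maps to the same value in both cases and larger offsets extend beyond the pre-training range. That seems to be the only effect of the parameter, which is why I am unsure whether it is required.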
