Use of pretrained_window_size #385

@abhiagwl4262

I am confused about the use of pretrained_window_size in the WindowAttention class.

My understanding is that in SwinV2 the window size can be chosen freely. So if I pre-train my model with an image size of 256 and a window size of 8, and then fine-tune on a downstream task with an image size of 512 and a window size of 32, there should be no need for special handling of the continuous relative position bias, such as the interpolation done in ViT.

But the code still does this:

        if pretrained_window_size[0] > 0:
            relative_coords_table[:, :, :, 0] /= (pretrained_window_size[0] - 1)
            relative_coords_table[:, :, :, 1] /= (pretrained_window_size[1] - 1)
        else:
            relative_coords_table[:, :, :, 0] /= (self.window_size[0] - 1)
            relative_coords_table[:, :, :, 1] /= (self.window_size[1] - 1)

Torchvision's implementation of SwinV2 does not do this (see https://github.com/pytorch/vision/blob/main/torchvision/models/swin_transformer.py#L351-L365), and it has no pretrained_window_size parameter at all.
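
To make the question concrete, here is a minimal 1-D sketch of what the quoted branch appears to do to the coordinate table. It uses plain Python, not the repository's tensors. The function name normalized_offsets is hypothetical, and the sketch covers only the division in the snippet, not any later transform applied to the table.

```python
def normalized_offsets(window_size, pretrained_window_size=0):
    """1-D analogue of the normalization branch quoted above (hypothetical helper)."""
    # Relative offsets between two positions in a window: -(W-1) .. (W-1).
    offsets = range(-(window_size - 1), window_size)
    # Divide by the pretraining window extent if one is given,
    # otherwise by the current window extent.
    if pretrained_window_size > 0:
        denom = pretrained_window_size - 1
    else:
        denom = window_size - 1
    return {d: d / denom for d in offsets}


# Pre-training with window size 8 (pretrained_window_size left at 0).
pretrain = normalized_offsets(8)
# Fine-tuning with window size 32, telling the layer it was pre-trained with 8.
finetune_kept = normalized_offsets(32, pretrained_window_size=8)
# Fine-tuning with window size 32, without passing the pre-training size.
finetune_reset = normalized_offsets(32)

# Value assigned to a 7-position offset in each setting.
print(pretrain[7], finetune_kept[7], finetune_reset[7])
# Largest normalized value in each fine-tuning setting.
print(max(finetune_kept.values()), max(finetune_reset.values()))
```

In this sketch, passing the pre-training window size keeps a given offset (for example 7 positions) at the same normalized value it had during pre-training, at the cost of values outside [-1, 1] for larger offsets. Without it, the table stays in [-1, 1] but the same offset maps to a different value. Whether that difference matters in practice is exactly what I am asking.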
