Problems encountered while reproducing GPT in eager mode #198

@dangkai4u

A subtle inconsistency in the gelu function

>>> import math
>>> import numpy as np
>>> import torch
>>> import oneflow as flow

>>> def pt_gelu(x):
...     """
...     Implementation of the GELU activation function currently in Google BERT repo (identical to OpenAI GPT). Also see
...     the Gaussian Error Linear Units paper: https://arxiv.org/abs/1606.08415
...     """
...     return 0.5 * x * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * torch.pow(x, 3.0))))

>>> gelu = flow.nn.GELU()
>>> x = np.array([-5, 10, 105]).astype(np.float32)

>>> pt_input = torch.from_numpy(x)
>>> of_input = flow.Tensor(x)

>>> print(pt_gelu(pt_input))
tensor([-2.9802e-07,  1.0000e+01,  1.0500e+02])

>>> print(gelu(of_input))
tensor([-1.4901e-06,  1.0000e+01,  1.0500e+02], dtype=oneflow.float32)

>>> print(torch.nn.GELU()(pt_input))
tensor([-1.4333e-06,  1.0000e+01,  1.0500e+02])
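
One possible reading of the numbers above: pt_gelu uses the tanh approximation of GELU, while torch.nn.GELU by default evaluates the exact erf-based form x * Phi(x), so the two are not expected to agree in the far negative tail. The sketch below is a float64 reference in plain Python that can help judge which printed value at x = -5 is closest to each formula. The helper names gelu_tanh and gelu_erf are illustrative only and are not part of PyTorch or OneFlow, and the sketch does not reproduce either library's float32 kernel.

```python
import math


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU (the same formula as pt_gelu above)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def gelu_erf(x: float) -> float:
    """Exact GELU, x * Phi(x), written with the error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


# Same inputs as in the report, evaluated in Python floats (float64).
for x in (-5.0, 10.0, 105.0):
    print(f"x={x:6.1f}  tanh={gelu_tanh(x): .4e}  erf={gelu_erf(x): .4e}")
```

In float64 the erf form at x = -5 comes out near -1.4333e-06, which matches the torch.nn.GELU line, and the tanh form lands around -2.3e-07. Assuming flow.nn.GELU also implements the erf form, the remaining gap between -1.4901e-06 and -1.4333e-06 would be consistent with float32 rounding when 1 + erf(x / sqrt(2)) is formed near zero. That is a guess from the printed values, not something confirmed from the OneFlow source.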
