Please @showgood880702, I need this dataset as a text2text .json file to understand the structure correctly #941

@antonious-emad

{

"input": "###Instruction: ....\n\n###human: ....\n\n###chatbot: ....\n\n###human: ....\n\n###chatbot: ....\n\n###human: .....\n\n###chatbot:",

"output": ".....###"

}

Thank you very much for the explanation.

I am still a little confused about the training data structure for a chatbot. For example, I have a multi-round conversation that I want to use as training data. Should I feed it to the model as a single instance, as I showed before, with the end_mark at the end?

{"input": "###Instruction: ....\n\n###human: ....\n\n###chatbot:", "output": ".....###"}

{"input": "###Instruction: ....\n\n###human: ....\n\n###chatbot:", "output": ".....###"}

{"input": "###Instruction: ....\n\n###human: ....\n\n###chatbot:", "output": ".....###"}

or should I split them into pairs of human and chatbot turns, as separate instances, each starting with the instruction?

Originally posted by @showgood880702 in #357
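
To make the two layouts in the question concrete, here is a minimal Python sketch that turns one multi-round conversation into input/output instances, one per chatbot reply, with all earlier turns carried in the input. This is only one possible reading of the structure and is not confirmed by the maintainers. The function names, the sample turns, and the top-level "type"/"instances" wrapper are assumptions for illustration; only the "###Instruction:", "###human:", "###chatbot:" prefixes and the trailing "###" come from the examples above.

```python
import json

END_MARK = "###"  # end marker, as used in the "output" fields quoted above


def render_prompt(instruction, history, human_turn):
    """Build the "input" string: instruction, earlier turns, then the open chatbot slot."""
    parts = [f"###Instruction: {instruction}"]
    for human, chatbot in history:
        parts.append(f"###human: {human}")
        parts.append(f"###chatbot: {chatbot}")
    parts.append(f"###human: {human_turn}")
    parts.append("###chatbot:")
    return "\n\n".join(parts)


def conversation_to_instances(instruction, turns):
    """Emit one instance per chatbot reply; each input repeats the instruction and prior turns."""
    instances = []
    for i, (human, chatbot) in enumerate(turns):
        instances.append({
            "input": render_prompt(instruction, turns[:i], human),
            "output": f"{chatbot}{END_MARK}",
        })
    return instances


# Hypothetical conversation, for illustration only.
turns = [("Hi", "Hello! How can I help?"), ("What is 2+2?", "4")]

# Assumed wrapper: a "type" tag plus a list of instances.
dataset = {
    "type": "text2text",
    "instances": conversation_to_instances("Be helpful.", turns),
}
print(json.dumps(dataset, indent=2))
```

Keeping only the last element of the returned list gives the single-instance layout (whole conversation in one input), while keeping every element gives the split layout where each instance starts with the instruction.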
