There are no examples of fine-tuning with a pretokenized dataset. The only thing mentioned in the docs is: "Columns in Dataset must be exactly input_ids, attention_mask, labels." But that raises these questions: Should the values be pre-padded? What should be the ignore index for the labels? Ad...
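For what it's worth, here is a minimal sketch of how the three columns could be built from pretokenized examples. It rests on assumptions the docs quoted above do not confirm: that the values should be right-padded to a common length, that the labels are a copy of input_ids (as in causal language modeling), and that padded label positions should use -100, the default ignore_index of PyTorch's CrossEntropyLoss. The helper names and the pad_token_id of 0 are made up for illustration.

```python
# Assumption: -100 is the ignore index. It is the default ignore_index of
# torch.nn.CrossEntropyLoss; the library in question may expect something else.
IGNORE_INDEX = -100


def pad_example(input_ids, max_length, pad_token_id=0):
    """Right-pad one pretokenized example and build its mask and labels."""
    if len(input_ids) > max_length:
        raise ValueError("example is longer than max_length; truncate it first")
    n_pad = max_length - len(input_ids)
    return {
        "input_ids": input_ids + [pad_token_id] * n_pad,
        # 1 marks real tokens, 0 marks padding.
        "attention_mask": [1] * len(input_ids) + [0] * n_pad,
        # Assumption: labels mirror input_ids; padding is masked out of the loss.
        "labels": input_ids + [IGNORE_INDEX] * n_pad,
    }


def build_columns(examples, pad_token_id=0):
    """Pad every example to the longest one and return column-wise lists."""
    max_length = max(len(ids) for ids in examples)
    rows = [pad_example(list(ids), max_length, pad_token_id) for ids in examples]
    keys = ("input_ids", "attention_mask", "labels")
    return {key: [row[key] for row in rows] for key in keys}


columns = build_columns([[5, 6, 7], [8, 9]])
print(columns)
```

If the training code pads batches on the fly instead, the padding step here would be unnecessary; only the column names are stated in the docs.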
ValueError: text input must of type str (single example), List[str] (batch or single pretokenized example) or List[List[str]] (batch of pretokenized examples). (Issue #39, open; opened by Ushanjay on Jul 21, 2022; 1 comment.) Ushanjay commented on Jul 21, 2022: import soundfile ...
ValueError: text input must of type str (single example), List[str] (batch or single pretokenized example) or List[List[str]] (batch of pretokenized examples).

My dataset looks like this:

source_text | target_text
paraphrase: Im Oktober 1560 traf er sich heiml... | Im Oktober 1560 traf er sich in ...
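One possible cause, which the excerpt alone cannot confirm, is that the value handed to the tokenizer is not a plain str or list of str: for example a pandas Series, or a column that contains None or NaN in some rows. A small check like the sketch below can locate the offending row before tokenizing. The helper name and the sample strings are hypothetical, not taken from the dataset above.

```python
def to_text_batch(values):
    """Return a plain list of str, or raise naming the first offending row."""
    batch = list(values)  # turns a Series or dataset column into a plain list
    for i, value in enumerate(batch):
        if not isinstance(value, str):
            raise TypeError(f"row {i}: expected str, got {type(value).__name__}")
    return batch


# Hypothetical sample values standing in for the source_text column.
source_texts = to_text_batch(["paraphrase: first sentence", "paraphrase: second sentence"])
print(source_texts)
```

If this raises, dropping or filling the reported rows before calling the tokenizer is worth trying.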