Can custom datasets only use pkl? I remember using csv before, but recently I get a prompt telling me to use pkl #68

Open
hdyzhuxun opened this issue May 19, 2021 · 1 comment

hdyzhuxun commented May 19, 2021

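Where do I change the code so that I can train on a csv-format dataset? I have been looking for a long time and cannot find where to change it.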
def read_data(cls, input_file, quotechar=None):
    """Reads a tab separated value file."""
    if 'pkl' in str(input_file):  # change pkl to csv??
        lines = load_pickle(input_file)
    else:
        lines = input_file
    return lines
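
To make the behavior of this branch concrete, here is a minimal standalone sketch. It is not the project's code: `load_pickle` below is a stand-in written for this example (assumed to simply unpickle a file), and `read_data` is rewritten as a plain function with the same branching. Under those assumptions, it suggests that a path without 'pkl' in it is returned unchanged and is never parsed as csv.

```python
import pickle
import tempfile
from pathlib import Path


def load_pickle(input_file):
    # Stand-in for the project's helper; assumed to just unpickle the file.
    with open(str(input_file), "rb") as f:
        return pickle.load(f)


def read_data(input_file):
    # Same branching as the method quoted above, written as a plain function.
    if "pkl" in str(input_file):
        lines = load_pickle(input_file)
    else:
        lines = input_file
    return lines


# Write a tiny pickle file so the 'pkl' branch has something to load.
tmp_dir = Path(tempfile.mkdtemp())
pkl_path = tmp_dir / "demo.train.pkl"
with open(pkl_path, "wb") as f:
    pickle.dump([("some text", [0, 1])], f)

# This csv file does not need to exist: the else branch never opens it.
csv_path = Path("demo.train.csv")

from_pkl = read_data(pkl_path)  # the unpickled list of rows
from_csv = read_data(csv_path)  # the Path object itself, not parsed rows
```

If the real method behaves the same way, simply swapping the file extension in the caller would hand a path, not a list of rows, to whatever consumes `lines`.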

In run_bert.py:
`def run_train(args):
    # --------- data
    processor = BertProcessor(vocab_path=config['bert_vocab_path'], do_lower_case=args.do_lower_case)
    label_list = processor.get_labels()
    label2id = {label: i for i, label in enumerate(label_list)}
    id2label = {i: label for i, label in enumerate(label_list)}

    train_data = processor.get_train(config['data_dir'] / f"{args.data_name}.train.csv")
    train_examples = processor.create_examples(lines=train_data,
                                               example_type='train',
                                               cached_examples_file=config['data_dir'] / f"cached_train_examples_{args.arch}")`

Could someone help clear this up?

@0ddAstronaut

I guess that if you run the command `python run_bert.py --do_data`, your .csv files will be automatically converted to .pkl files? You can refer to the code in task_data.py.
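
For illustration, a csv-to-pkl conversion step of that general kind could look like the sketch below. This is not the code from task_data.py: the function name `csv_to_pickle`, the `text` and `labels` column names, and the space-separated 0/1 label format are all assumptions made up for this example.

```python
import csv
import pickle
import tempfile
from pathlib import Path


def csv_to_pickle(csv_path, pkl_path):
    """Read a csv with hypothetical 'text' and 'labels' columns and pickle the rows."""
    rows = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for record in csv.DictReader(f):
            # Assumed format: labels stored as space-separated 0/1 flags.
            labels = [int(x) for x in record["labels"].split()]
            rows.append((record["text"], labels))
    with open(pkl_path, "wb") as f:
        pickle.dump(rows, f)
    return rows


# Small demonstration with a throwaway csv file.
tmp_dir = Path(tempfile.mkdtemp())
csv_file = tmp_dir / "demo.train.csv"
csv_file.write_text("text,labels\nhello world,1 0 0\nfoo bar,0 1 1\n", encoding="utf-8")

pkl_file = tmp_dir / "demo.train.pkl"
csv_to_pickle(csv_file, pkl_file)

# Load the pickle back to confirm the rows round-trip.
with open(pkl_file, "rb") as f:
    loaded = pickle.load(f)
```

The row layout the training code actually expects may differ, so check it against task_data.py before relying on a conversion like this.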
