
When I use the new version, there are some problems. #68

Open
AIikai opened this issue Sep 15, 2020 · 11 comments
AIikai commented Sep 15, 2020

```
WARNING:root:bert_config not exists. will load model from huggingface checkpoint.
Traceback (most recent call last):
  File "run_weibo_ner_cws.py", line 31, in <module>
    train_bert_multitask(problem='weibo_ner&weibo_cws', params=params, problem_type_dict=problem_type_dict,
  File "/data/home/likai/.conda/envs/lkai_tf2/lib/python3.8/site-packages/bert_multitask_learning/run_bert_multitask.py", line 113, in train_bert_multitask
    params.assign_problem(problem, gpu=int(num_gpus),
  File "/data/home/likai/.conda/envs/lkai_tf2/lib/python3.8/site-packages/bert_multitask_learning/params.py", line 221, in assign_problem
    self.prepare_dir(base_dir, dir_name, self.problem_list)
  File "/data/home/likai/.conda/envs/lkai_tf2/lib/python3.8/site-packages/bert_multitask_learning/params.py", line 491, in prepare_dir
    tokenizer = load_transformer_tokenizer(
  File "/data/home/likai/.conda/envs/lkai_tf2/lib/python3.8/site-packages/bert_multitask_learning/utils.py", line 278, in load_transformer_tokenizer
    tok = getattr(transformers, load_module_name).from_pretrained(
  File "/data/home/likai/.conda/envs/lkai_tf2/lib/python3.8/site-packages/transformers/tokenization_auto.py", line 188, in from_pretrained
    config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
  File "/data/home/likai/.conda/envs/lkai_tf2/lib/python3.8/site-packages/transformers/configuration_auto.py", line 289, in from_pretrained
    raise ValueError(
ValueError: Unrecognized model in models/weibo_cws_weibo_ner_ckpt/tokenizer. Should have a model_type key in its config.json, or contain one of the following strings in its name: retribert, t5, mobilebert, distilbert, albert, bert-generation, camembert, xlm-roberta, pegasus, marian, mbart, bart, reformer, longformer, roberta, flaubert, bert, openai-gpt, gpt2, transfo-xl, xlnet, xlm, ctrl, electra, encoder-decoder, funnel, lxmert
```
The file 'config.json' under the path 'models/weibo_cws_weibo_ner_ckpt/tokenizer' is regenerated each time I run the program, and it has no model_type key. Do you know what the problem is? Looking forward to a response, thank you.
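As a side note, one possible workaround (my own sketch, not something from this thread) is to patch the exported `config.json` so it carries the `model_type` key that `AutoConfig` looks for. This assumes the underlying checkpoint is BERT-based; the helper name `add_model_type` is hypothetical.

```python
import json

def add_model_type(config_path, model_type='bert'):
    """Insert a model_type key into a tokenizer config.json if it is missing."""
    with open(config_path) as f:
        config = json.load(f)
    if 'model_type' not in config:
        config['model_type'] = model_type
        with open(config_path, 'w') as f:
            json.dump(config, f, indent=2)
    return config

# e.g. add_model_type('models/weibo_cws_weibo_ner_ckpt/tokenizer/config.json')
```

Since the file is regenerated on each run, the patch would have to be re-applied every time, which is why the tokenizer-class workaround below is more practical.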

JayYip commented Sep 15, 2020

This is a known issue with huggingface AutoTokenizer. You have to specify a specific tokenizer loading module:

```python
from bert_multitask_learning import DynamicBatchSizeParams

params = DynamicBatchSizeParams()
params.transformer_tokenizer_loading = 'BertTokenizer'

# then pass the params object to train_bert_multitask
```

This is not an ideal solution, but it does the job for now.

AIikai commented Sep 16, 2020

Thank you. But when I ran prediction, another problem appeared.
```
Traceback (most recent call last):
  File "run_ner_sim_new.py", line 105, in <module>
    pred_prob = predict_bert_multitask(
  File "/data/home/likai/.conda/envs/lkai_tf2/lib/python3.8/site-packages/bert_multitask_learning/run_bert_multitask.py", line 247, in predict_bert_multitask
    pred_dataset = predict_input_fn(inputs, params)
  File "/data/home/likai/.conda/envs/lkai_tf2/lib/python3.8/site-packages/bert_multitask_learning/input_fn.py", line 128, in predict_input_fn
    first_dict = next(part_fn(example_list=tmp_iter))
StopIteration
...
WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/guide/checkpoint#loading_mechanics for details.
```
My input here is the single-element list ['李白有哪些作品[SEP]诗词作品'], and the problem string is 'msra_ner|sp_cls'. It contains a self-defined sentence-pair classification problem.

JayYip commented Sep 16, 2020

Oh, this is a bug that is triggered when the prediction input list has length 1. Please try repeating the input list, e.g. ['李白有哪些作品[SEP]诗词作品'] * 5, to bypass it. I'll fix it later.
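The suggested workaround can be wrapped in a small helper (a sketch of my own; the wrapper name `predict_single` is hypothetical, while `predict_bert_multitask` is the library call from the traceback):

```python
def predict_single(predict_fn, text, repeat=5):
    """Work around the length-1 StopIteration bug by repeating the input.

    predict_fn: a callable taking a list of texts and returning a list of
    predictions, e.g. a partial of predict_bert_multitask.
    """
    preds = predict_fn([text] * repeat)
    return preds[0]  # all copies are identical inputs; keep the first result
```

For example, `predict_single(my_predict_fn, '李白有哪些作品[SEP]诗词作品')` would return one prediction while feeding the library a batch of five.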

AIikai commented Sep 17, 2020

> Oh, this is a bug that will be triggered when the length of prediction input list is 1. Please try repeating the input list like ['李白有哪些作品[SEP]诗词作品']*5 to bypass it. I'll fix it later.

Thank you for the help. I will try again.

AIikai commented Sep 21, 2020

Another question:
```
Epoch 1/5
278/278 [==============================] - 104s 376ms/step - sp_cls_acc: 0.8215 - kg_ner_loss: 0.4776 - sp_cls_loss: 0.0793 - val_loss: 0.0456 - val_sp_cls_acc: 0.8332
...
Epoch 5/5
278/278 [==============================] - ETA: 0s - sp_cls_acc: 0.8307 - kg_ner_loss: 0.1234 - sp_cls_loss: 0.0127 - val_loss: 0.0433 - val_sp_cls_acc: 0.8326
```

It is a joint model of NER and sentence-pair classification, but during training it didn't show NER accuracy, and judging from some examples, the NER part didn't perform very well.

JayYip commented Sep 22, 2020

Currently, only the cls problem type supports calculating accuracy, but implementing accuracy for seq_tag shouldn't be too difficult. Contributions are welcome. You can start here.

As for the performance issue, I'll take some time to investigate. I wonder whether it's related to huggingface transformers, since the top layer is relatively simple.
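For reference, the core of a seq_tag accuracy metric is just token-level accuracy that ignores padding positions. A minimal pure-Python sketch of the idea (my own illustration, not the implementation that was later merged into the library):

```python
def seq_tag_accuracy(y_true, y_pred, pad_id=0):
    """Token-level accuracy over non-padding positions.

    y_true, y_pred: nested lists of label ids with shape (batch, seq_len).
    Positions where y_true equals pad_id are excluded from the average.
    """
    correct = total = 0
    for true_seq, pred_seq in zip(y_true, y_pred):
        for t, p in zip(true_seq, pred_seq):
            if t != pad_id:
                total += 1
                correct += int(t == p)
    return correct / total if total else 0.0
```

A Keras version would implement the same masking inside a custom `tf.keras.metrics.Metric`, accumulating `correct` and `total` across batches.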

AIikai commented Sep 22, 2020

> Currently, only cls problem supports calculating accuracy. But implementing accuracy for seq_tag shouldn't be too difficult. Contribution is welcomed. You can start here.
>
> As for the performance issue, I'll take some time to investigate. I wonder if it is related to huggingface transformers since the top layer is relatively simple.

OK, thank you.

JayYip commented Sep 23, 2020

I removed a piece of padding logic that could potentially cause a performance drop for sequence labeling. Please install the latest code and give it another shot.

JayYip commented Sep 23, 2020

I have added the acc metric to the seq_tag problem. BTW, I trained this NER problem for 3 epochs, and judging from the results, the model is learning.
[image: screenshot of training metrics]

AIikai commented Sep 23, 2020

Bravo, the problem has been solved. Thank you.

AIikai commented Sep 23, 2020

Excuse me, I have a question about a sentence-pair classification problem. When I trained it as a single task, it reached high accuracy, but when I trained it as part of a multi-task setup, its accuracy was low, and the validation accuracy was always 0.8332. I'm puzzled by this result. Do you know the reason? Looking forward to a response, thank you.

---- Multi-task ----
```
Epoch 1/10
759/759 [==============================] - 266s 351ms/step - sp_cls_acc: 0.8065 - sp_cls_loss: 0.0703 - msra_ner_acc: 0.8607 - msra_ner_loss: 0.3831 - val_sp_cls_acc: 0.8332 - val_msra_ner_acc: 0.9244 - val_loss: 0.1577
...
Epoch 10/10
759/759 [==============================] - 367s 484ms/step - sp_cls_acc: 0.8329 - sp_cls_loss: 0.0045 - msra_ner_acc: 0.9530 - msra_ner_loss: 0.0903 - val_sp_cls_acc: 0.8332 - val_msra_ner_acc: 0.9523 - val_loss: 0.0793
```
---- Single-task ----
```
Epoch 1/5
104/104 [==============================] - 44s 422ms/step - sp_cls_acc: 0.8311 - sp_cls_loss: 0.4513 - val_sp_cls_acc: 0.9150 - val_loss: 0.2207
...
Epoch 5/5
104/104 [==============================] - 60s 580ms/step - sp_cls_acc: 0.9705 - sp_cls_loss: 0.0928 - val_sp_cls_acc: 0.0522
```
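One quick sanity check for a frozen validation accuracy (my own suggestion, not part of this thread): a classifier stuck at a constant accuracy is often just predicting the majority class, in which case the stuck figure equals the majority-class frequency of the validation labels. A minimal sketch:

```python
from collections import Counter

def majority_baseline(labels):
    """Accuracy achieved by always predicting the most frequent label."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)
```

If `majority_baseline(val_labels)` comes out near 0.8332, the sp_cls head has likely collapsed to the majority class under multi-task training, which would point at loss weighting or sampling between the two tasks rather than the model itself.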
