CLUE Named Entity Recognition

zhezhaoa edited this page Aug 15, 2023 · 7 revisions

This page briefly introduces solutions for the CLUE named entity recognition task.

CLUENER2020

Example of fine-tuning google_zh_model.bin on the CLUENER2020 dataset and then running inference with the fine-tuned model:

python3 finetune/run_ner.py --pretrained_model_path models/google_zh_model.bin \
                            --vocab_path models/google_zh_vocab.txt \
                            --config_path models/bert/base_config.json \
                            --train_path datasets/cluener2020/train.tsv \
                            --dev_path datasets/cluener2020/dev.tsv \
                            --label2id_path datasets/cluener2020/label2id.json \
                            --output_model_path models/ner_model.bin \
                            --epochs_num 5 --batch_size 16

python3 inference/run_ner_infer.py --load_model_path models/ner_model.bin \
                                   --vocab_path models/google_zh_vocab.txt \
                                   --config_path models/bert/base_config.json \
                                   --test_path datasets/cluener2020/test_nolabel.tsv \
                                   --prediction_path datasets/cluener2020/prediction.tsv \
                                   --label2id_path datasets/cluener2020/label2id.json
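
Before fine-tuning, it can help to confirm that the training file and label2id.json agree with each other. The sketch below is not part of the repository. It assumes that each TSV file starts with a header row containing `text_a` and `label` columns, that tokens and labels within a row are separated by single spaces, and that label2id.json is a flat JSON object mapping label strings to integer IDs. The helper names `find_label_problems` and `check_files` are hypothetical, and the sample rows and labels are made up for illustration.

```python
import json
from pathlib import Path


def find_label_problems(tsv_lines, label2id):
    """Return (line_number, message) pairs for rows that look inconsistent.

    Assumed layout (not confirmed by this page): a header row with
    "text_a" and "label" columns, then one sentence per row with
    space-separated tokens and space-separated labels.
    """
    header = tsv_lines[0].rstrip("\r\n").split("\t")
    text_col = header.index("text_a")
    label_col = header.index("label")

    problems = []
    for line_no, line in enumerate(tsv_lines[1:], start=2):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.rstrip("\r\n").split("\t")
        tokens = fields[text_col].split(" ")
        labels = fields[label_col].split(" ")
        # Every token needs exactly one label.
        if len(tokens) != len(labels):
            problems.append(
                (line_no, f"{len(tokens)} tokens but {len(labels)} labels")
            )
        # Every label must have an ID in the mapping.
        unknown = sorted(set(labels) - set(label2id))
        if unknown:
            problems.append((line_no, f"labels missing from label2id: {unknown}"))
    return problems


def check_files(tsv_path, label2id_path):
    """Load a TSV file and a label2id JSON file from disk and check them."""
    lines = Path(tsv_path).read_text(encoding="utf-8").splitlines()
    label2id = json.loads(Path(label2id_path).read_text(encoding="utf-8"))
    return find_label_problems(lines, label2id)


if __name__ == "__main__":
    # Illustrative in-memory sample; the labels here are made up.
    sample_rows = [
        "text_a\tlabel",
        "北 京 很 大\tB-address I-address O O",
        "腾 讯\tB-company",
    ]
    sample_label2id = {"O": 0, "B-address": 1, "I-address": 2}
    for line_no, message in find_label_problems(sample_rows, sample_label2id):
        print(f"line {line_no}: {message}")
```

To check the real files, one could call `check_files("datasets/cluener2020/train.tsv", "datasets/cluener2020/label2id.json")`. An empty list means no mismatch was found under the assumptions above.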

Example of fine-tuning cluecorpussmall_roberta_wwm_large_seq512_model.bin on the CLUENER2020 dataset and then running inference with the fine-tuned model:

python3 finetune/run_ner.py --pretrained_model_path models/cluecorpussmall_roberta_wwm_large_seq512_model.bin \
                            --vocab_path models/google_zh_vocab.txt \
                            --config_path models/bert/large_config.json \
                            --train_path datasets/cluener2020/train.tsv \
                            --dev_path datasets/cluener2020/dev.tsv \
                            --output_model_path models/ner_model.bin \
                            --label2id_path datasets/cluener2020/label2id.json \
                            --epochs_num 5 --batch_size 16

python3 inference/run_ner_infer.py --load_model_path models/ner_model.bin \
                                   --vocab_path models/google_zh_vocab.txt \
                                   --config_path models/bert/large_config.json \
                                   --test_path datasets/cluener2020/test_nolabel.tsv \
                                   --prediction_path datasets/cluener2020/prediction.tsv \
                                   --label2id_path datasets/cluener2020/label2id.json