BERT-based NER model giving inconsistent predictions when deserialized


Problem Description

I am trying to train an NER model using the HuggingFace transformers library on Colab cloud GPUs, pickle it, and load the model on my own CPU to make predictions.

Code

The model is defined as follows:

from transformers import BertForTokenClassification

# Token-classification head on top of bert-base-cased
model = BertForTokenClassification.from_pretrained(
    "bert-base-cased",
    num_labels=NUM_LABELS,
    output_attentions=False,
    output_hidden_states=False
)

I am using this snippet to save the model on Colab:

import torch

# Save only the model weights (state_dict), not the config
torch.save(model.state_dict(), FILENAME)

and then reload it on my machine with:

# Instantiate a fresh model of the same type
model_reload = BertForTokenClassification.from_pretrained(
    "bert-base-cased",
    num_labels=len(tag2idx),
    output_attentions=False,
    output_hidden_states=False
)

# Load the saved weights, mapping them onto the CPU
model_reload.load_state_dict(torch.load(FILENAME, map_location='cpu'))
model_reload.eval()

The code snippet used to tokenize the text and make the actual predictions is identical in the Colab GPU notebook instance and my CPU notebook instance.
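
That snippet is not shown in the question; a minimal sketch of what it might look like (the tokenizer choice, the tag_values index-to-tag list, and the printing format are assumptions):

import torch
from transformers import BertTokenizer

# Hypothetical reconstruction -- the shared snippet is not included in the question
tokenizer = BertTokenizer.from_pretrained("bert-base-cased")

sentence = ("Good morning, my name is John Kennedy and I am "
            "working at Apple in the headquarters of Cupertino")
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    output = model_reload(**inputs)

# Pick the highest-scoring label index for each token
pred_ids = output.logits.argmax(dim=-1).squeeze().tolist()
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"].squeeze().tolist())

# tag_values: index -> tag string list built at training time (assumed)
for token, idx in zip(tokens, pred_ids):
    print(f"{tag_values[idx]}\t{token}")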

Expected Behavior

The GPU-trained model behaves correctly and classifies the following tokens perfectly:

O       [CLS]
O       Good
O       morning
O       ,
O       my
O       name
O       is
B-per   John
I-per   Kennedy
O       and
O       I
O       am
O       working
O       at
B-org   Apple
O       in
O       the
O       headquarters
O       of
B-geo   Cupertino
O       [SEP]

Actual Behavior

When loading the model and using it to make predictions on my CPU, the predictions are completely wrong:

I-eve   [CLS]
I-eve   Good
I-eve   morning
I-eve   ,
I-eve   my
I-eve   name
I-eve   is
I-geo   John
B-eve   Kennedy
I-eve   and
I-eve   I
I-eve   am
I-eve   working
I-eve   at
I-gpe   Apple
I-eve   in
I-eve   the
I-eve   headquarters
I-eve   of
B-org   Cupertino
I-eve   [SEP]

Does anyone have an idea why this doesn't work? Did I miss something?

Solution

I fixed it; there were two problems:

  1. The index-to-label mapping for the tokens was wrong; for some reason, the list() function behaved differently on the Colab GPU instance than on my CPU (??) (see the first sketch below).

  2. The snippet used to save the model was not correct: for models based on the huggingface-transformers library, you can't save with torch.save(model.state_dict(), ...) and load the weights later; you need to use the save_pretrained() method of your model class and load it later with from_pretrained() (see the second sketch below).
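
For the first problem, the fix is to build the mapping deterministically and persist it alongside the model, rather than relying on the iteration order of a set: list(set(tags)) can yield a different order on different machines because Python randomizes string hashing per process. A minimal sketch (all_tags and the JSON file name are assumptions; tag2idx mirrors the question):

import json

# Build the mapping from a sorted list so the order is reproducible
# across machines, unlike list(set(all_tags)) whose order can vary
tag_values = sorted(set(all_tags))   # all_tags: every tag seen during training
tag2idx = {tag: idx for idx, tag in enumerate(tag_values)}

# Persist the mapping next to the model weights and reload it on the CPU machine
with open("tag2idx.json", "w") as f:
    json.dump(tag2idx, f)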
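
For the second problem, a minimal sketch of the corrected save/load flow (the directory name is an assumption):

# On Colab: save the weights and the config together
model.save_pretrained("ner_model")

# On the CPU machine: reload everything from the same directory
from transformers import BertForTokenClassification

model_reload = BertForTokenClassification.from_pretrained("ner_model")
model_reload.eval()

Passing id2label and label2id when creating the model also bakes the label mapping into the saved config, so it travels with the model.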
