Using a pre-trained Transformer with Keras


Problem description

I want to use this pre-trained model: Hate-speech-CNERG/dehatebert-mono-arabic

I use this code to build a model with Keras (the library I generally use):

import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
import transformers

def build_model(transformer, max_len=512):
    """
    function for training the model
    """
    input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
    sequence_output = transformer(input_word_ids)[0]
    cls_token = sequence_output[:, 0, :]
    out = Dense(1, activation='sigmoid')(cls_token)

    model = Model(inputs=input_word_ids, outputs=out)
    model.compile(Adam(lr=3e-5),  # changed from 1e-5 to 3e-5
                  loss='binary_crossentropy',
                  metrics=[tf.keras.metrics.AUC()])
    return model

with strategy.scope():
    model_name = "Hate-speech-CNERG/dehatebert-mono-arabic"
    transformer_layer = (
        transformers.AutoModel.from_pretrained(model_name)
    )
    model = build_model(transformer_layer, max_len=MAX_LEN)

The following error occurs:

AttributeError                            Traceback (most recent call last)
<ipython-input-19-26bbcd63ea51> in <module>()
      5         # .TFAutoModel.from_pretrained('jplu/tf-xlm-roberta-large')
      6     )
----> 7     model = build_model(transformer_layer, max_len=MAX_LEN)

2 frames
/usr/local/lib/python3.7/dist-packages/transformers/models/bert/modeling_bert.py in forward(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, encoder_hidden_states, encoder_attention_mask, past_key_values, use_cache, output_attentions, output_hidden_states, return_dict)
    922             raise ValueError("You cannot specify both input_ids and inputs_embeds at the same time")
    923         elif input_ids is not None:
--> 924             input_shape = input_ids.size()
    925             batch_size, seq_length = input_shape
    926         elif inputs_embeds is not None:
AttributeError: 'KerasTensor' object has no attribute 'size'

Answer

The models from Hugging Face can be used out of the box with the transformers library, and they can be used with different backends (TensorFlow, PyTorch).
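The error in the question comes from that distinction: AutoModel.from_pretrained returns a PyTorch module, which cannot consume the symbolic KerasTensor that tf.keras.layers.Input produces. As a minimal sketch (assuming this checkpoint only ships PyTorch weights, so from_pt=True is needed for the conversion), the same checkpoint can be loaded for either backend:

from transformers import AutoModel, TFAutoModel

model_name = "Hate-speech-CNERG/dehatebert-mono-arabic"

# PyTorch module (what the question's code loads)
pt_model = AutoModel.from_pretrained(model_name)

# TensorFlow/Keras-compatible model; from_pt=True converts the PyTorch
# weights on the fly (assumption: no native TF checkpoint is published)
tf_model = TFAutoModel.from_pretrained(model_name, from_pt=True)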

Using Hugging Face transformers with Keras is shown here (in the "create_model" function); a rough sketch of the idea follows.
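Since that linked code is not reproduced here, the sketch below only illustrates the idea, adapted to the build_model function from the question: loading the TensorFlow variant with TFAutoModel (from_pt=True assumed, as above) lets the transformer be called on Keras inputs, so the 'KerasTensor' object has no attribute 'size' error no longer applies. The function name create_model merely mirrors the linked example and is not the original code.

import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from transformers import TFAutoModel

def create_model(model_name, max_len=512):
    # load the TF variant of the checkpoint (converted from PyTorch weights)
    transformer = TFAutoModel.from_pretrained(model_name, from_pt=True)
    input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
    sequence_output = transformer(input_word_ids)[0]   # last hidden states
    cls_token = sequence_output[:, 0, :]               # [CLS] embedding
    out = Dense(1, activation='sigmoid')(cls_token)

    model = Model(inputs=input_word_ids, outputs=out)
    model.compile(tf.keras.optimizers.Adam(3e-5),
                  loss='binary_crossentropy',
                  metrics=[tf.keras.metrics.AUC()])
    return model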

Generally speaking, you can load a Hugging Face transformer using the example code from the model card (the "Use in Transformers" button):

from transformers import AutoTokenizer, AutoModelForSequenceClassification
  
tokenizer = AutoTokenizer.from_pretrained("Hate-speech-CNERG/dehatebert-mono-arabic")

model = AutoModelForSequenceClassification.from_pretrained("Hate-speech-CNERG/dehatebert-mono-arabic")
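Note that AutoModelForSequenceClassification loads the fine-tuned classification head that ships with this checkpoint, whereas the plain AutoModel used in the question only returns hidden states; for getting hate-speech predictions directly, the classification variant is the one to use.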

Then, to run inference, the documentation shows how to get the output from a loaded transformer model:

import torch

inputs = tokenizer("صباح الخير", return_tensors="pt")
# We're not interested in labels here, just the model's inference
# labels = torch.tensor([1]).unsqueeze(0)  # Batch size 1
outputs = model(**inputs)  # , labels=labels)

# The model returns logits, but we want probabilities, so we apply softmax
probs = torch.softmax(outputs.logits, dim=1)
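As a small follow-up sketch (assuming the checkpoint's config carries the usual id2label mapping of a sequence-classification model), the predicted class can then be read off the probabilities:

pred = torch.argmax(probs, dim=1).item()   # index of the most likely class
print(probs)                               # per-class probabilities
print(model.config.id2label[pred])         # human-readable label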
