Predicting Missing Words in a Sentence - Natural Language Processing Model

Views: 149 Published: 2020/5/4 9:02:38 Tags: machine-learning neural-network nlp predict


Question

I have the sentence below:

I want to ____ the car because it is cheap.

I want to predict the missing word using an NLP model. Which NLP model should I use? Thanks.

Answer

TL;DR

Try this out: https://github.com/huggingface/pytorch-pretrained-BERT

First, install the package:

pip install -U pytorch-pretrained-bert

Then you can use the "masked language model" from the BERT algorithm, e.g.

import torch
from pytorch_pretrained_bert import BertTokenizer, BertForMaskedLM

# OPTIONAL: if you want to have more information on what's happening, activate the logger as follows
import logging
logging.basicConfig(level=logging.INFO)

# Load pre-trained model tokenizer (vocabulary)
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

text = '[CLS] I want to [MASK] the car because it is cheap . [SEP]'
tokenized_text = tokenizer.tokenize(text)
indexed_tokens = tokenizer.convert_tokens_to_ids(tokenized_text)

# Create the segments tensors.
segments_ids = [0] * len(tokenized_text)

# Convert inputs to PyTorch tensors
tokens_tensor = torch.tensor([indexed_tokens])
segments_tensors = torch.tensor([segments_ids])

# Load pre-trained model (weights)
model = BertForMaskedLM.from_pretrained('bert-base-uncased')
model.eval()

# Predict all tokens
with torch.no_grad():
    predictions = model(tokens_tensor, segments_tensors)

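# Find the position of [MASK], then pick the highest-scoring vocabulary entry there
masked_index = tokenized_text.index('[MASK]')
predicted_index = torch.argmax(predictions[0, masked_index]).item()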
predicted_token = tokenizer.convert_ids_to_tokens([predicted_index])[0]

print(predicted_token)

[out]:

buy
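
The input string in the snippet above is written by hand. As a minimal sketch, assuming the blank is marked with `____` as in the question, the same kind of input can be built with a small helper. `to_bert_input` is a hypothetical name for illustration, not part of pytorch-pretrained-bert.

```python
def to_bert_input(sentence, blank="____"):
    """Wrap a sentence containing one blank in the special tokens BERT expects."""
    if sentence.count(blank) != 1:
        raise ValueError("expected exactly one blank in the sentence")
    # Replace the blank with [MASK], then add [CLS] in front and [SEP] at the end.
    return "[CLS] " + sentence.replace(blank, "[MASK]") + " [SEP]"


text = to_bert_input("I want to ____ the car because it is cheap.")
print(text)
```

The resulting string can be passed to `tokenizer.tokenize` in place of the hard-coded `text`.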

In Long

To truly understand why you need the [CLS], [MASK] and segment tensors, please do read the paper carefully, https://arxiv.org/abs/1810.04805

And if you're lazy, you can read this nice blog post from Lilian Weng, https://lilianweng.github.io/lil-log/2019/01/31/generalized-language-models.html
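
For a rough picture of what the segment tensor encodes: in the BERT paper, the tokens of the first sentence (including [CLS] and the first [SEP]) belong to segment 0, and the tokens of an optional second sentence belong to segment 1. The sketch below builds those ids from a token list. `segment_ids_for` is a hypothetical helper for illustration only; with a single sentence it gives the same result as the `[0] * len(tokenized_text)` line in the snippet above.

```python
def segment_ids_for(tokens):
    """Return 0 for tokens up to and including the first [SEP], 1 afterwards."""
    ids = []
    segment = 0
    for token in tokens:
        ids.append(segment)
        if token == "[SEP]":
            # Everything after the first [SEP] belongs to the second sentence.
            segment = 1
    return ids


single = ["[CLS]", "i", "want", "to", "[MASK]", "the", "car", "[SEP]"]
pair = ["[CLS]", "the", "car", "is", "cheap", "[SEP]", "i", "[MASK]", "it", "[SEP]"]

print(segment_ids_for(single))
print(segment_ids_for(pair))
```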

Other than BERT, there are a lot of other models that can perform the task of filling in the blank. Do look at the other models in the pytorch-pretrained-BERT repository, but more importantly dive deeper into the task of "Language Modeling", i.e. the task of predicting the next word given a history.
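
To make "predicting the next word given a history" concrete, here is a toy sketch of a count-based bigram model. It is far simpler than BERT and only looks at the last word of the history; the corpus and the function names `train_bigram_counts` and `predict_next` are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_counts(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, history):
    """Return the most frequent follower of the last word, or None if unseen."""
    last = history.lower().split()[-1]
    if last not in counts:
        return None
    return counts[last].most_common(1)[0][0]


# Tiny made-up corpus, only to show the mechanics.
corpus = [
    "i want to buy the car",
    "i want to buy the house",
    "i want to sell the car",
]
counts = train_bigram_counts(corpus)
print(predict_next(counts, "I want to"))
```

Note the difference from the masked setup: a model like this sees only the words to the left of the gap, whereas BERT's masked language model uses context on both sides of [MASK].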
