NLTK: corpus-level BLEU vs sentence-level BLEU score
Question
I have imported nltk in Python to calculate the BLEU score on Ubuntu. I understand how the sentence-level BLEU score works, but I don't understand how the corpus-level BLEU score works.
Below is my code for the corpus-level BLEU score:
import nltk
hypothesis = ['This', 'is', 'cat']
reference = ['This', 'is', 'a', 'cat']
BLEUscore = nltk.translate.bleu_score.corpus_bleu([reference], [hypothesis], weights = [1])
print(BLEUscore)
For some reason, the BLEU score is 0 for the above code. I was expecting a corpus-level BLEU score of at least 0.5.
Here is my code for the sentence-level BLEU score:
import nltk
hypothesis = ['This', 'is', 'cat']
reference = ['This', 'is', 'a', 'cat']
BLEUscore = nltk.translate.bleu_score.sentence_bleu([reference], hypothesis, weights = [1])
print(BLEUscore)
Here the sentence-level BLEU score is 0.71, which I expect, taking into account the brevity penalty and the missing word "a". However, I don't understand how the corpus-level BLEU score works.
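(That 0.71 can be checked by hand: with unigram weights only, the modified unigram precision is 3/3 and the brevity penalty is exp(1 - 4/3). A minimal sketch of that arithmetic, using plain Python without NLTK:)

```python
import math

hypothesis = ['This', 'is', 'cat']       # length 3
reference = ['This', 'is', 'a', 'cat']   # length 4

# Modified unigram precision: all 3 hypothesis tokens occur in the reference.
p1 = 3 / 3

# Brevity penalty, since the hypothesis is shorter than the reference.
bp = math.exp(1 - len(reference) / len(hypothesis))

bleu1 = bp * p1
print(round(bleu1, 4))  # → 0.7165
```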
Any help would be appreciated.
Answer
TL;DR:
>>> import nltk
>>> hypothesis = ['This', 'is', 'cat']
>>> reference = ['This', 'is', 'a', 'cat']
>>> references = [reference] # list of references for 1 sentence.
>>> list_of_references = [references] # list of references for all sentences in corpus.
>>> list_of_hypotheses = [hypothesis] # list of hypotheses that corresponds to list of references.
>>> nltk.translate.bleu_score.corpus_bleu(list_of_references, list_of_hypotheses)
0.6025286104785453
>>> nltk.translate.bleu_score.sentence_bleu(references, hypothesis)
0.6025286104785453
(Note: You have to pull the latest version of NLTK on the develop branch in order to get a stable version of the BLEU score implementation.)
In detail:
Actually, if there's only one reference and one hypothesis in your whole corpus, both corpus_bleu() and sentence_bleu() should return the same value, as shown in the example above.
In the code, we see that sentence_bleu is actually a duck-type of corpus_bleu:
def sentence_bleu(references, hypothesis, weights=(0.25, 0.25, 0.25, 0.25),
smoothing_function=None):
return corpus_bleu([references], [hypothesis], weights, smoothing_function)
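A quick way to convince yourself of this wrapping (a hedged sketch; weights=(1,) is used here just to stay with unigrams and avoid zero higher-order n-gram counts on such a short hypothesis):

```python
import nltk

hypothesis = ['This', 'is', 'cat']
references = [['This', 'is', 'a', 'cat']]  # list of references for one sentence

# sentence_bleu wraps each argument in one more list and delegates to
# corpus_bleu, so for a one-sentence "corpus" the two calls agree exactly.
s = nltk.translate.bleu_score.sentence_bleu(references, hypothesis, weights=(1,))
c = nltk.translate.bleu_score.corpus_bleu([references], [hypothesis], weights=(1,))
print(s == c)  # → True
```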
If we look at the parameters of sentence_bleu:
def sentence_bleu(references, hypothesis, weights=(0.25, 0.25, 0.25, 0.25),
smoothing_function=None):
""""
:param references: reference sentences
:type references: list(list(str))
:param hypothesis: a hypothesis sentence
:type hypothesis: list(str)
:param weights: weights for unigrams, bigrams, trigrams and so on
:type weights: list(float)
:return: The sentence-level BLEU score.
:rtype: float
"""
The references input for sentence_bleu is a list(list(str)).
So if you have a sentence string, e.g. "This is a cat", you have to tokenize it to get a list of strings, ["This", "is", "a", "cat"], and since it allows for multiple references, it has to be a list of lists of strings; e.g. if you have a second reference, "This is a feline", your input to sentence_bleu() would be:
references = [ ["This", "is", "a", "cat"], ["This", "is", "a", "feline"] ]
hypothesis = ["This", "is", "cat"]
sentence_bleu(references, hypothesis)
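With multiple references, NLTK counts a hypothesis n-gram as matched if it appears in any reference, and the brevity penalty uses the reference length closest to the hypothesis length. A hedged sketch with unigram weights, reusing the two references above:

```python
from nltk.translate.bleu_score import sentence_bleu

references = [['This', 'is', 'a', 'cat'], ['This', 'is', 'a', 'feline']]
hypothesis = ['This', 'is', 'cat']

# All 3 hypothesis tokens appear in some reference, so p1 = 3/3 = 1;
# both references have length 4, so BP = exp(1 - 4/3) ≈ 0.7165.
score = sentence_bleu(references, hypothesis, weights=(1,))
print(round(score, 4))  # → 0.7165
```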
When it comes to the corpus_bleu() list_of_references parameter, it's basically a list of whatever sentence_bleu() takes as references:
def corpus_bleu(list_of_references, hypotheses, weights=(0.25, 0.25, 0.25, 0.25),
smoothing_function=None):
"""
:param list_of_references: a corpus of lists of reference sentences, w.r.t. hypotheses
:type list_of_references: list(list(list(str)))
:param hypotheses: a list of hypothesis sentences
:type hypotheses: list(list(str))
:param weights: weights for unigrams, bigrams, trigrams and so on
:type weights: list(float)
:return: The corpus-level BLEU score.
:rtype: float
"""
Other than looking at the doctests within nltk/translate/bleu_score.py, you can also take a look at the unit tests in nltk/test/unit/translate/test_bleu_score.py to see how to use each of the components within bleu_score.py.
By the way, since sentence_bleu is imported as bleu in nltk.translate.__init__.py (https://github.com/nltk/nltk/blob/develop/nltk/translate/init.py#L21), using
from nltk.translate import bleu
will be the same as:
from nltk.translate.bleu_score import sentence_bleu
and in code:
>>> from nltk.translate import bleu
>>> from nltk.translate.bleu_score import sentence_bleu
>>> from nltk.translate.bleu_score import corpus_bleu
>>> bleu == sentence_bleu
True
>>> bleu == corpus_bleu
False