How to generate bi/tri-grams using spacy/nltk

Views: 169 | Published: 2020/5/18 1:14:13 | Tags: python nlp nltk n-gram spacy


Question

The input text is always a list of dish names, where each name has 1 to 3 adjectives and a noun.

Input:

thai iced tea
spicy fried chicken
sweet chili pork
thai chicken curry

Output:

thai tea, iced tea
spicy chicken, fried chicken
sweet pork, chili pork
thai chicken, chicken curry, thai curry

Basically, I want to parse the sentence tree and generate bi-grams by pairing each adjective with the noun.

I would like to achieve this with spacy or nltk.
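For contrast, plain contiguous n-grams do not give the output above. A minimal sketch in pure Python (the helper name `contiguous_ngrams` is hypothetical and not part of spaCy or NLTK) shows that a sliding window over `thai iced tea` yields `thai iced`, which is not wanted, and misses `thai tea`. That gap is why the pairing needs part-of-speech information:

```python
def contiguous_ngrams(phrase, n):
    """Return all runs of n adjacent whitespace-separated words in phrase."""
    words = phrase.split()
    # zip over n staggered views of the word list to form a sliding window
    return [" ".join(gram) for gram in zip(*(words[i:] for i in range(n)))]


bigrams = contiguous_ngrams("thai iced tea", 2)
trigrams = contiguous_ngrams("thai iced tea", 3)
print(bigrams)   # ['thai iced', 'iced tea']
print(trigrams)  # ['thai iced tea']
```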

Recommended answer

I used spaCy 2.0 with the English model. The idea is to split each input phrase into nouns and "not-nouns", then join each not-noun with the noun to build the desired output.

Your input:

s = ["thai iced tea",
"spicy fried chicken",
"sweet chili pork",
"thai chicken curry",]

spaCy solution:

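import spacy

# 'en' is the spaCy 2.x shortcut link; spaCy 3+ needs the full package name, e.g. 'en_core_web_sm'
nlp = spacy.load('en')  # load the English model

def noun_notnoun(phrase):
    doc = nlp(phrase)  # tokenize and tag the phrase
    token_not_noun = []
    notnoun_noun_list = []
    noun = None  # stays None if the tagger finds no noun

    for item in doc:
        if item.pos_ == "NOUN":
            noun = item.text  # if several nouns occur, the last one wins
        else:
            token_not_noun.append(item.text)

    if noun is None:
        return notnoun_noun_list  # nothing to pair with

    for notnoun in token_not_noun:
        notnoun_noun_list.append(notnoun + " " + noun)

    return notnoun_noun_list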

Call the function:

for phrase in s:
    print(noun_notnoun(phrase))

Result:

['thai tea', 'iced tea']
['spicy chicken', 'fried chicken']
['sweet pork', 'chili pork']
['thai chicken', 'curry chicken']
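
The last line does not match the requested `thai chicken, chicken curry, thai curry`. Because the function keeps a single noun, that output implies the tagger labeled `chicken` as a noun and `curry` as something else. One possible generalization, sketched below, pairs every token with each noun that comes after it. The tags are written by hand here as an assumption (in practice they would come from `item.pos_`), and `pair_with_later_nouns` is a hypothetical helper name, not part of the original answer.

```python
from itertools import combinations


def pair_with_later_nouns(tagged):
    """tagged: list of (text, pos) tuples in phrase order.

    Returns 'left right' for every pair where right is a NOUN that
    appears after left in the phrase.
    """
    return [
        left + " " + right
        for (left, _), (right, right_pos) in combinations(tagged, 2)
        if right_pos == "NOUN"
    ]


# Hand-written tags (assumption): a real tagger may label these differently.
curry = [("thai", "ADJ"), ("chicken", "NOUN"), ("curry", "NOUN")]
tea = [("thai", "ADJ"), ("iced", "ADJ"), ("tea", "NOUN")]

curry_pairs = pair_with_later_nouns(curry)
tea_pairs = pair_with_later_nouns(tea)
print(curry_pairs)  # ['thai chicken', 'thai curry', 'chicken curry']
print(tea_pairs)    # ['thai tea', 'iced tea']
```

With these tags the sketch yields the same three pairs as the requested output, in a different order. The result still depends on the tagger: if `chili` were tagged as a noun, `sweet chili` would be emitted as well.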
