Python: Chunking phrases other than noun phrases (e.g. prepositional phrases) using spaCy, etc.


Question

Since I was told that spaCy is such a powerful Python module for natural language processing, I am now looking for a way to group words into phrases other than noun phrases, most importantly prepositional phrases. I doubt there is a built-in spaCy function for this, but that would probably be the easiest route, since spaCy is already imported in my project. That said, I'm open to any approach to phrase recognition or chunking.

Recommended answer

Here's a solution for extracting prepositional phrases (PPs). In general, you can get a phrase by taking a token's subtree in the dependency parse.

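def get_pps(doc):
    """Return the prepositional phrases found in a parsed spaCy Doc."""
    pps = []
    for token in doc:
        # 'ADP' marks adpositions (prepositions and postpositions).
        # Try other parts of speech here to collect different subtrees.
        if token.pos_ == 'ADP':
            # token.subtree yields the token and all of its descendants in document order.
            pp = ' '.join(tok.text for tok in token.subtree)
            pps.append(pp)
    return pps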

Usage:

import spacy

nlp = spacy.load('en_core_web_sm')
ex = 'A short man in blue jeans is working in the kitchen.'
doc = nlp(ex)

print(get_pps(doc))

Output:

['in blue jeans', 'in the kitchen']
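
As a variation on the answer above, the same subtree idea can return spaCy Span objects instead of joined strings, which keeps character offsets and token attributes available. The following is a minimal sketch, not part of the original answer: `get_phrase_spans` and `build_example_doc` are hypothetical helper names, and the example parse is annotated by hand (an assumption) so the snippet runs without downloading a trained model. It relies on `Token.left_edge` and `Token.right_edge`, which point to the first and last tokens of a token's subtree.

```python
from spacy.tokens import Doc
from spacy.vocab import Vocab


def get_phrase_spans(doc, pos="ADP"):
    """Return one Span per token tagged `pos`, covering that token's subtree."""
    spans = []
    for token in doc:
        if token.pos_ == pos:
            # Slice from the leftmost to the rightmost token of the subtree.
            spans.append(doc[token.left_edge.i : token.right_edge.i + 1])
    return spans


def build_example_doc():
    """Build a Doc with a hand-written parse (illustrative, not model output)."""
    words = ["A", "man", "in", "blue", "jeans", "works", "in", "the", "kitchen", "."]
    pos = ["DET", "NOUN", "ADP", "ADJ", "NOUN", "VERB", "ADP", "DET", "NOUN", "PUNCT"]
    # Each entry is the absolute index of the token's head; the root points to itself.
    heads = [1, 5, 1, 4, 2, 5, 5, 8, 6, 5]
    deps = ["det", "nsubj", "prep", "amod", "pobj", "ROOT", "prep", "det", "pobj", "punct"]
    return Doc(Vocab(), words=words, pos=pos, heads=heads, deps=deps)


if __name__ == "__main__":
    doc = build_example_doc()
    for span in get_phrase_spans(doc):
        print(span.start, span.end, span.text)
```

With a real pipeline, you would pass `nlp(text)` instead of the hand-built Doc. Note that for non-projective parses a subtree is not guaranteed to be contiguous, so a left-edge-to-right-edge slice may include tokens outside the subtree; joining `token.subtree` as in `get_pps` avoids that at the cost of losing the Span.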
