Given a word, can we get all possible lemmas for it using spaCy?

Question

The input word is standalone, not part of a sentence. I would like to get all of its possible lemmas, as if the word had appeared in different sentences under every possible POS tag. I would also like to get the lookup version of the word's lemma.

Why do I want to do this?

I have extracted lemmas from all of my documents and counted the number of dependency links between lemmas, using en_core_web_sm for both steps. Now, given an input word, I would like to return the lemmas that are linked most frequently to any of the possible lemmas of that input word.

In short, I would like to replicate the behaviour of token.lemma_ for the input word under all possible POS tags, so that the results stay consistent with the lemma links I have already counted.

Recommended answer

I found it difficult to get lemmas and inflections directly out of spaCy without first constructing an example sentence to give the word context. That wasn't ideal, so I looked further and found that LemmInflect handles this well: it accepts a standalone word and returns its lemmas or inflections grouped by POS tag.

> from lemminflect import getAllLemmas, getAllInflections

> getAllLemmas('watches')
{'NOUN': ('watch',), 'VERB': ('watch',)}

> getAllInflections('watch')
{'NN': ('watch',), 'NNS': ('watches', 'watch'), 'VB': ('watch',), 'VBD': ('watched',), 'VBG': ('watching',), 'VBZ': ('watches',),  'VBP': ('watch',)}
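
To connect this back to the question, here is a minimal sketch of how the per-POS output could be collapsed into a set of candidate lemmas and then used to rank linked lemmas. The helper names `flatten_lemmas` and `top_linked`, and the shape of `link_counts` (a dict mapping a pair of lemmas to a count), are assumptions made for illustration; they are not part of spaCy or LemmInflect. The mapping passed to `flatten_lemmas` is copied from the `getAllLemmas('watches')` output shown above, so the sketch runs without either library installed.

```python
from collections import Counter
from typing import Iterable, List, Mapping, Set, Tuple


def flatten_lemmas(lemmas_by_pos: Mapping[str, Iterable[str]]) -> Set[str]:
    """Collapse a {POS: (lemma, ...)} mapping into a set of distinct lemmas."""
    return {lemma for lemmas in lemmas_by_pos.values() for lemma in lemmas}


def top_linked(
    candidates: Set[str],
    link_counts: Mapping[Tuple[str, str], int],
    n: int = 5,
) -> List[Tuple[str, int]]:
    """Sum link counts over every candidate lemma and return the n
    most frequently linked lemmas.

    link_counts is a hypothetical structure: keys are (lemma_a, lemma_b)
    pairs and values are how often the two were linked by a dependency.
    """
    totals: Counter = Counter()
    for (a, b), count in link_counts.items():
        if a in candidates and b not in candidates:
            totals[b] += count
        elif b in candidates and a not in candidates:
            totals[a] += count
    return totals.most_common(n)


# Mapping copied from the getAllLemmas('watches') output above.
lemmas_by_pos = {"NOUN": ("watch",), "VERB": ("watch",)}
candidates = flatten_lemmas(lemmas_by_pos)

# Made-up counts, purely for illustration.
link_counts = {
    ("watch", "movie"): 4,
    ("wear", "watch"): 2,
    ("watch", "wear"): 1,
    ("movie", "wear"): 7,
}
ranked = top_linked(candidates, link_counts)
print(candidates)
print(ranked)
```

With LemmInflect installed, the hard-coded mapping could be replaced by `getAllLemmas(word)`. One caveat: the link counts in the question were built from en_core_web_sm lemmas, and LemmInflect's lemmas may not match spaCy's in every case, so it is worth spot-checking a sample of words before relying on the combined result.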
