spaCy: Word in vocabulary

Views: 10  Published: 2022/5/15 18:48:08  Tags: spacy vocabulary


Problem description

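I am trying to do typo correction with spaCy, and for that I need to know whether a word exists in the vocabulary. If it does not, the idea is to split the word in two until all the segments do exist. For example, "ofthe" does not exist, whereas "of" and "the" do. So I first need to know whether a word exists in the vocabulary, and that is where the problems start. I tried: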

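import spacy

# Assumption: the question does not say which pipeline was loaded;
# a small English model is used here for illustration.
nlp = spacy.load("en_core_web_sm")

for token in nlp("apple"):
    print(token.lemma_, token.lemma, token.is_oov, "apple" in nlp.vocab)
# Output: apple 8566208034543834098 True True

for token in nlp("andshy"):
    print(token.lemma_, token.lemma, token.is_oov, "andshy" in nlp.vocab)
# Output: andshy 4682930577439079723 True True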

Clearly this makes no sense: in both cases is_oov is True, and yet the word is reported as being in the vocabulary. I am looking for something simple like

"andshy" in nlp.vocab = False, "andshy".is_oov = True
"apple" in nlp.vocab = True, "apple".is_oov = False

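As a next step I would also like some word correction method. I could use a spellchecker library, but that would not be consistent with the spaCy vocabulary.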
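One way to keep correction consistent with the word-existence check is to draw candidate corrections from the same fixed word list. The sketch below is only an illustration under stated assumptions: KNOWN_WORDS, suggest and the cutoff value are hypothetical names and settings that do not come from the question, and the matching uses difflib from the Python standard library rather than anything in spaCy.

```python
import difflib

# Hypothetical word list, for illustration only. In practice this would be
# whatever fixed vocabulary is also used for the existence check.
KNOWN_WORDS = ["of", "the", "and", "shy", "apple"]


def suggest(word, known=KNOWN_WORDS, cutoff=0.8):
    """Return the closest known word, or None if nothing is similar enough."""
    if word in known:
        return word
    # get_close_matches ranks candidates by SequenceMatcher similarity and
    # drops those scoring below `cutoff`.
    matches = difflib.get_close_matches(word, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(suggest("apple"))  # apple
print(suggest("aple"))   # apple
print(suggest("zzzz"))   # None
```

Because every suggestion is taken from the same list, a corrected word is guaranteed to pass the existence check afterwards. The cutoff is a tuning choice and would need adjusting for real data.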

This seems to be a common problem, so any suggestions (code) are welcome.

Thanks,

Ahe
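The splitting idea described in the question (break a word in two until every piece is a known word) can be sketched independently of how the vocabulary check is implemented. The following is a minimal sketch, assuming a fixed set of known words: KNOWN_WORDS and split_known are hypothetical names, and the tiny word set is made up for illustration. Using a fixed set means the membership test does not depend on which texts have already been processed.

```python
from functools import lru_cache

# Hypothetical word set, for illustration only (not from the question).
KNOWN_WORDS = frozenset({"of", "the", "and", "shy", "apple"})


def split_known(word, known=KNOWN_WORDS):
    """Return a list of known words that concatenate to `word`, or None."""

    @lru_cache(maxsize=None)
    def solve(s):
        # The whole string is a known word: nothing to split.
        if s in known:
            return (s,)
        # Otherwise try every two-way split and recurse on both halves.
        for i in range(1, len(s)):
            left = solve(s[:i])
            if left is None:
                continue
            right = solve(s[i:])
            if right is not None:
                return left + right
        return None

    result = solve(word.lower())
    return list(result) if result is not None else None


print(split_known("apple"))   # ['apple']
print(split_known("ofthe"))   # ['of', 'the']
print(split_known("andshy"))  # ['and', 'shy']
print(split_known("xqz"))     # None
```

The cache keeps repeated substrings from being re-solved. The function returns the first segmentation it finds, so with a large word list that contains many short words it may pick an unintended split; preferring longer pieces or scoring candidates would be a possible refinement.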
