Stanford Entity Recognizer (caseless) in Python NLTK

Views: 120 | Published: 2020/5/18 1:24:55 | Tags: python, nlp, nltk


Question

I am trying to figure out how to use the caseless version of the Stanford entity recognizer from NLTK. I downloaded http://nlp.stanford.edu/software/stanford-ner-2015-04-20.zip and unpacked it into Python's site-packages folder. Then I downloaded http://nlp.stanford.edu/software/stanford-corenlp-caseless-2015-04-20-models.jar and placed it in the same folder. Then I ran this code in NLTK:

from nltk.tag.stanford import NERTagger
english_nertagger = NERTagger('/home/anaconda/lib/python2.7/site-packages/stanford-ner-2015-04-20/classifiers/english.conll.4class.distsim.crf.ser.gz', '/home/anaconda/lib/python2.7/site-packages/stanford-ner-2015-04-20/stanford-corenlp-caseless-2015-04-20-models.jar')

But when I run this:

english_nertagger.tag('Rami Eid is studying at stony brook university in NY'.split())

I get an error:

Error: Could not find or load main class edu.stanford.nlp.ie.crf.CRFClassifier

Any help from someone with experience would be appreciated!

P.S. I can get the regular (case-sensitive) version working fine, but when analysing search queries I find that users hardly ever capitalize words, and the case-sensitive version appears to miss entities completely when they are not capitalized.

Recommended answer

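The second parameter of StanfordNERTagger is the path to the Stanford NER jar file (stanford-ner.jar), not the path to the models jar. The models jar appears to contain only model files, so Java cannot find edu.stanford.nlp.ie.crf.CRFClassifier on the classpath, which would explain the error. So change the second argument to stanford-ner.jar (and make sure the jar is actually at that location, of course).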
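The "Could not find or load main class" error can be checked without starting Java, because a jar is an ordinary zip archive. The following is a minimal sketch, not part of the original answer: the helper name `jar_has_class` is hypothetical, and it assumes the class is stored in the jar under its usual package path.

```python
import zipfile

# Main class named in the error message from the question.
MAIN_CLASS = "edu.stanford.nlp.ie.crf.CRFClassifier"


def jar_has_class(jar_path, class_name=MAIN_CLASS):
    """Return True if the jar at jar_path contains the compiled class.

    Hypothetical helper: a dotted class name such as a.b.C is looked up
    as the archive entry a/b/C.class.
    """
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

If this returns False for the jar passed as the second argument, that jar is probably a models-only jar and stanford-ner.jar should be passed instead.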

Also, it seems that you should use english.conll.4class.caseless.distsim.crf.ser.gz (from stanford-corenlp-caseless-2015-04-20-models.jar) as the model instead of english.conll.4class.distsim.crf.ser.gz.
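Since the caseless model ships inside the models jar, it has to be copied out to a regular file before its path can be passed to the tagger. Below is a rough sketch of one way to do that with the standard library. The function name `extract_model` is hypothetical, and because the directory layout inside the jar is not stated in the answer, the sketch searches the archive by file name rather than assuming a path.

```python
import os
import shutil
import zipfile

# Model file named in the answer.
MODEL_NAME = "english.conll.4class.caseless.distsim.crf.ser.gz"


def extract_model(models_jar, dest_dir, model_name=MODEL_NAME):
    """Copy the model file out of the jar into dest_dir and return its path.

    Hypothetical helper: the first archive entry whose base name equals
    model_name is used, whatever directory it sits in inside the jar.
    """
    with zipfile.ZipFile(models_jar) as jar:
        for entry in jar.namelist():
            if entry.rsplit("/", 1)[-1] == model_name:
                target = os.path.join(dest_dir, model_name)
                with jar.open(entry) as src, open(target, "wb") as dst:
                    shutil.copyfileobj(src, dst)
                return target
    raise FileNotFoundError(model_name + " not found in " + models_jar)
```

The returned path could then be passed as the first argument to the tagger, for example after extracting into the classifiers folder of the stanford-ner directory.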

So try the following:

from nltk.tag.stanford import StanfordNERTagger
english_nertagger = StanfordNERTagger('/home/anaconda/lib/python2.7/site-packages/stanford-ner-2015-04-20/classifiers/english.conll.4class.caseless.distsim.crf.ser.gz', '/home/anaconda/lib/python2.7/site-packages/stanford-ner-2015-04-20/stanford-ner.jar')

Update: NERTagger has been renamed to StanfordNERTagger.
