NLTK Stanford POS tagger error: Java command failed
I'm trying to use the nltk.tag.stanford module to tag a sentence (starting with the wiki's example), but I keep getting the following error:
Traceback (most recent call last):
  File "test.py", line 28, in <module>
    print st.tag(word_tokenize('What is the airspeed of an unladen swallow ?'))
  File "/usr/local/lib/python2.7/dist-packages/nltk/tag/stanford.py", line 59, in tag
    return self.tag_sents([tokens])[0]
  File "/usr/local/lib/python2.7/dist-packages/nltk/tag/stanford.py", line 81, in tag_sents
    stdout=PIPE, stderr=PIPE)
  File "/usr/local/lib/python2.7/dist-packages/nltk/internals.py", line 160, in java
    raise OSError('Java command failed!')
OSError: Java command failed!
or the following LookupError:
LookupError:
===========================================================================
NLTK was unable to find the java file!
Use software specific configuration paramaters or set the JAVAHOME environment variable.
===========================================================================
This is the example code:
>>> from nltk.tag.stanford import POSTagger
>>> st = POSTagger('/usr/share/stanford-postagger/models/english-bidirectional-distsim.tagger',
... '/usr/share/stanford-postagger/stanford-postagger.jar')
>>> st.tag('What is the airspeed of an unladen swallow ?'.split())
I also used word_tokenize instead of split, but it didn't make any difference.
I also reinstalled Java (the JDK), and none of my searches turned up a fix, including suggestions such as nltk.internals.config_java().
Note: I use Linux (Xubuntu).
If you read through the embedded documentation in nltk/internals.py (lines 58 - 175), you should find your answer easily enough. NLTK requires the full path to the Java binary.
If not specified, then nltk will search the system for a Java binary; and if one is not found, it will raise a LookupError exception.
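To make the lookup described above concrete, here is a minimal sketch of that kind of search. It assumes the JAVAHOME variable named in the error message (and the conventional JAVA_HOME) is consulted before PATH; the function name locate_java is hypothetical, and this is an approximation for debugging, not NLTK's own code.

```python
import os
import shutil


def locate_java(environ=None):
    """Return a path to a `java` executable, or None if none is found.

    Hypothetical helper: it approximates the lookup described above and
    is not NLTK's actual implementation.
    """
    environ = os.environ if environ is None else environ

    # 1. Environment variables that may point at a Java installation.
    for var in ("JAVAHOME", "JAVA_HOME"):
        home = environ.get(var)
        if not home:
            continue
        # The variable may name the binary itself or the install directory.
        for candidate in (home, os.path.join(home, "bin", "java")):
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                return candidate

    # 2. Fall back to searching the directories listed in PATH.
    return shutil.which("java", path=environ.get("PATH", ""))


if __name__ == "__main__":
    print(locate_java())
```

If this prints None, a LookupError like the one in the question is the likely outcome, and one of the options below should help.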
Based on a bit of research, I believe you have a few options:
1) Add the following code to your project (not a great solution)
import os
java_path = "path/to/java" # replace this
os.environ['JAVAHOME'] = java_path
2) Uninstall & Reinstall NLTK (preferably in a virtualenv) (better but still not great)
pip uninstall nltk
sudo -E pip install nltk
3) Set the Java environment variable (this is the most pragmatic solution, IMO)
Edit the system-wide profile file /etc/profile
sudo gedit /etc/profile
Add the following lines at the end
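JAVA_HOME=/usr/lib/jvm/jdk1.7.0   # adjust to the JDK directory on your system
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
export JAVA_HOME
export JRE_HOME   # not assigned above, so this has no effect unless JRE_HOME is set elsewhere
export PATH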
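Whichever option you pick, it is worth confirming that the java command actually runs from the same environment as your script, because the traceback in the question captures Java's output through pipes and never shows it. The following is a rough sketch of such a check; the helper name command_runs is hypothetical, and it assumes Python 3.7 or later for the text argument.

```python
import subprocess


def command_runs(cmd):
    """Return (ok, output) for `cmd`, where ok is True on exit status 0.

    Hypothetical helper for illustration only.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr so error text is not lost
            text=True,
        )
    except OSError as exc:
        # Raised when the executable is missing or cannot be started.
        return False, str(exc)
    return result.returncode == 0, result.stdout


if __name__ == "__main__":
    ok, output = command_runs(["java", "-version"])
    print("java runs" if ok else "java failed")
    print(output)
```

Note that changes to /etc/profile normally take effect only in a new login session, so run the check from a fresh shell.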