Stanford Parser and NLTK
Question
Is it possible to use the Stanford Parser in NLTK? (I am not talking about the Stanford POS tagger.)
Recommended answer
Note that this answer applies to NLTK v3.0, and not to more recent versions.
Sure, try the following in Python:
import os
from nltk.parse import stanford

# Both variables point at the folder that contains the parser and models jars
os.environ['STANFORD_PARSER'] = '/path/to/stanford/jars'
os.environ['STANFORD_MODELS'] = '/path/to/stanford/jars'

parser = stanford.StanfordParser(model_path="/location/of/the/englishPCFG.ser.gz")
sentences = parser.raw_parse_sents(("Hello, My name is Melroy.", "What is your name?"))
print(sentences)

# GUI: draw each parse tree in its own window
for line in sentences:
    for sentence in line:
        sentence.draw()
Output:
[Tree('ROOT', [Tree('S', [Tree('INTJ', [Tree('UH', ['Hello'])]), Tree(',', [',']), Tree('NP', [Tree('PRP$', ['My']), Tree('NN', ['name'])]), Tree('VP', [Tree('VBZ', ['is']), Tree('ADJP', [Tree('JJ', ['Melroy'])])]), Tree('.', ['.'])])]), Tree('ROOT', [Tree('SBARQ', [Tree('WHNP', [Tree('WP', ['What'])]), Tree('SQ', [Tree('VBZ', ['is']), Tree('NP', [Tree('PRP$', ['your']), Tree('NN', ['name'])])]), Tree('.', ['?'])])])]
Note 1: In this example, both the parser and model jars are in the same folder.
Note 2:
- The file name of the Stanford parser is: stanford-parser.jar
- The file name of the Stanford models is: stanford-parser-x.x.x-models.jar
Note 3: The englishPCFG.ser.gz file can be found inside the models.jar file (/edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz). Use an archive manager to 'unzip' the models.jar file.
Note 4: Be sure you are using Java 8 (JRE 1.8, for example the one that comes with Oracle JDK 8). Otherwise you will get: Unsupported major.minor version 52.0.
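Because the error in Note 4 only shows up once the parser is launched, it can help to check the Java version beforehand. The following is a minimal sketch and not part of NLTK: parse_java_major and installed_java_major are hypothetical helper names, and the sketch assumes that `java -version` prints a quoted version string such as "1.8.0_181".

```python
import re
import subprocess


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    # Old scheme: "1.8.0_181" means Java 8. New scheme: "11.0.2" means Java 11.
    if first == 1 and second is not None:
        return int(second)
    return first


def installed_java_major():
    """Run `java -version` and parse its output; returns None if java is missing."""
    try:
        proc = subprocess.Popen(["java", "-version"],
                                stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT)
    except OSError:
        return None
    output = proc.communicate()[0].decode("utf-8", "replace")
    return parse_java_major(output)


if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("Could not determine the Java version")
    elif major < 8:
        print("Java %d found; class file version 52.0 needs Java 8 or newer" % major)
    else:
        print("Java %d found" % major)
```

Class file version 52.0 corresponds to Java 8, so any result below 8 would explain the "Unsupported major.minor version 52.0" message.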
-
Download NLTK v3 from: https://github.com/nltk/nltk and install NLTK:
sudo python setup.py install
You can use the NLTK downloader to get the Stanford Parser, using Python:
import nltk
nltk.download()
Try my example! (Don't forget to change the jar paths and to change the model path to the ser.gz location.)
Or:
-
Download and install NLTK v3, same as above.
Download the latest version (the current version's filename is stanford-parser-full-2015-01-29.zip) from: http://nlp.stanford.edu/software/lex-parser.shtml#Download
Extract stanford-parser-full-20xx-xx-xx.zip.
Create a new folder ('jars' in my example). Place these two extracted files into the 'jars' folder: stanford-parser-3.x.x-models.jar and stanford-parser.jar.
As shown above, you can use the environment variables (STANFORD_PARSER and STANFORD_MODELS) to point to this 'jars' folder. I'm using Linux, so if you use Windows, use something like: C://folder//jars.
Open stanford-parser-3.x.x-models.jar using an archive manager (7zip).
Browse inside the jar file to edu/stanford/nlp/models/lexparser and extract the file called 'englishPCFG.ser.gz'. Remember the location where you extract this ser.gz file.
When creating a StanfordParser instance, you can provide the model path as a parameter. This is the complete path to the model, in our case /location/of/englishPCFG.ser.gz.
Try my example! (Don't forget to change the jar paths and to change the model path to the ser.gz location.)
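The manual steps above (pointing the two environment variables at the 'jars' folder and extracting englishPCFG.ser.gz from the models jar) can also be scripted. This is a minimal sketch using only the Python standard library, relying on the fact that a jar file is a zip archive. prepare_stanford_parser is a hypothetical helper, not an NLTK function, and the models jar name pattern is assumed from the file names given in Note 2.

```python
import glob
import os
import zipfile

# Location of the English PCFG model inside the models jar
MODEL_MEMBER = "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz"


def prepare_stanford_parser(jars_dir, out_dir):
    """Set the environment variables and extract the English PCFG model.

    Returns the full path of the extracted englishPCFG.ser.gz file.
    """
    parser_jar = os.path.join(jars_dir, "stanford-parser.jar")
    if not os.path.isfile(parser_jar):
        raise IOError("stanford-parser.jar not found in %s" % jars_dir)

    # Assumed naming scheme: stanford-parser-x.x.x-models.jar
    pattern = os.path.join(jars_dir, "stanford-parser-*-models.jar")
    models = sorted(glob.glob(pattern))
    if not models:
        raise IOError("no models jar found in %s" % jars_dir)

    # Both variables point at the same folder, as in the example
    os.environ["STANFORD_PARSER"] = jars_dir
    os.environ["STANFORD_MODELS"] = jars_dir

    with zipfile.ZipFile(models[-1]) as jar:
        # extract() recreates the edu/stanford/... directories under out_dir
        return jar.extract(MODEL_MEMBER, out_dir)
```

The returned path is what would be passed as model_path when creating the StanfordParser instance.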