Stanford Parser and NLTK


Question

Is it possible to use Stanford Parser in NLTK? (I am not talking about Stanford POS.)

Recommended answer

Note that this answer applies to NLTK v 3.0, and not to more recent versions.

Sure, try the following in Python:

import os
from nltk.parse import stanford

# Both variables point to the folder that holds the parser and models jars.
os.environ['STANFORD_PARSER'] = '/path/to/stanford/jars'
os.environ['STANFORD_MODELS'] = '/path/to/stanford/jars'

parser = stanford.StanfordParser(model_path="/location/of/the/englishPCFG.ser.gz")
sentences = parser.raw_parse_sents(("Hello, My name is Melroy.", "What is your name?"))
print(sentences)

# GUI: draw each parse tree in its own window
for line in sentences:
    for sentence in line:
        sentence.draw()

Output:

[Tree('ROOT', [Tree('S', [Tree('INTJ', [Tree('UH', ['Hello'])]), Tree(',', [',']), Tree('NP', [Tree('PRP$', ['My']), Tree('NN', ['name'])]), Tree('VP', [Tree('VBZ', ['is']), Tree('ADJP', [Tree('JJ', ['Melroy'])])]), Tree('.', ['.'])])]), Tree('ROOT', [Tree('SBARQ', [Tree('WHNP', [Tree('WP', ['What'])]), Tree('SQ', [Tree('VBZ', ['is']), Tree('NP', [Tree('PRP$', ['your']), Tree('NN', ['name'])])]), Tree('.', ['?'])])])]

Note 1: In this example both the parser and model jars are in the same folder.

Note 2:

  • The file name of the Stanford parser is: stanford-parser.jar
  • The file name of the Stanford models is: stanford-parser-x.x.x-models.jar

Note 3: The englishPCFG.ser.gz file can be found inside the models.jar file (/edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz). Use an archive manager to 'unzip' the models.jar file.
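Since a jar is an ordinary zip archive, the extraction described in Note 3 can also be scripted. The following is a minimal sketch using only Python's standard zipfile module; the function name extract_model and the placeholder paths are illustrative and are not part of NLTK or the Stanford distribution.

```python
import os
import zipfile

# Path of the model inside the models jar, as given in Note 3
# (zip member names carry no leading slash).
MODEL_MEMBER = "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz"


def extract_model(models_jar, dest_dir, member=MODEL_MEMBER):
    """Extract one model file from the models jar and return its path on disk.

    A jar is a zip archive, so the standard zipfile module can read it.
    """
    with zipfile.ZipFile(models_jar) as jar:
        # extract() recreates the member's directory structure under dest_dir
        # and returns the path of the file it wrote.
        return jar.extract(member, path=dest_dir)


if __name__ == "__main__":
    # Placeholder paths (assumptions); replace them with your own locations.
    jar_path = "/path/to/stanford/jars/stanford-parser-x.x.x-models.jar"
    if os.path.exists(jar_path):
        print(extract_model(jar_path, "/location/of/the"))
```

Note that the file is written below dest_dir with its full internal directory structure (edu/stanford/nlp/models/lexparser/), so the returned path is the one to pass as model_path.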

Note 4: Be sure you are using Java 8 (JRE 1.8, as shipped with Oracle JDK 8). With an older Java you will get: Unsupported major.minor version 52.0.
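The number 52.0 in that error is the class file version of Java 8, so the message indicates that an older Java runtime is being picked up. A rough way to check which Java is on your PATH from Python is sketched below; the helper names are made up for this illustration, and the parsing assumes the usual `java -version` output format.

```python
import re
import subprocess


def java_major_version(version_text):
    """Return the Java major version found in `java -version` output, or None.

    Handles the legacy scheme ("1.8.0_292" means Java 8) and the
    newer scheme ("11.0.2" means Java 11).
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_text)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    # Before Java 9 the version string started with "1.", e.g. "1.8".
    if first == 1 and second is not None:
        return int(second)
    return first


def installed_java_major():
    """Run `java -version` and parse its output; None if Java cannot be run."""
    try:
        # `java -version` writes to stderr, so merge it into the captured output.
        output = subprocess.check_output(["java", "-version"],
                                         stderr=subprocess.STDOUT)
    except (OSError, subprocess.CalledProcessError):
        return None
    return java_major_version(output.decode("utf-8", "replace"))


if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("Could not determine the Java version; is java on your PATH?")
    elif major < 8:
        print("Java %d found; the parser jars need Java 8 (class file version 52.0)." % major)
    else:
        print("Java %d found." % major)
```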

  1. Download NLTK v3 from https://github.com/nltk/nltk and install it:

sudo python setup.py install

You can use the NLTK downloader to get Stanford Parser, using Python:

import nltk
nltk.download()

  • Try my example! (Don't forget to change the jar paths, and change the model path to the ser.gz location.)

    Or:

    1. Download and install NLTK v3, same as above.

    Download the latest version from http://nlp.stanford.edu/software/lex-parser.shtml#Download (at the time of writing, the current version's filename is stanford-parser-full-2015-01-29.zip).

    Extract stanford-parser-full-20xx-xx-xx.zip.

    Create a new folder ('jars' in my example). Place these two extracted files in it: stanford-parser-3.x.x-models.jar and stanford-parser.jar.

    As shown above, you can use the environment variables (STANFORD_PARSER and STANFORD_MODELS) to point to this 'jars' folder. I'm using Linux, so if you use Windows, use something like: C://folder//jars.

    Open stanford-parser-3.x.x-models.jar using an archive manager (7zip).

    Browse inside the jar file to edu/stanford/nlp/models/lexparser and extract the file called 'englishPCFG.ser.gz'. Remember the location where you extract this ser.gz file.

    When creating a StanfordParser instance, you can provide the model path as a parameter. This is the complete path to the model, in our case /location/of/the/englishPCFG.ser.gz.

    Try my example! (Don't forget to change the jar paths, and change the model path to the ser.gz location.)
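Before running the example, it can help to confirm that the 'jars' folder from the steps above really contains both jars and only then set the two environment variables. The sketch below does that with the standard library only; find_parser_jars and configure_environment are hypothetical helper names introduced for this illustration, and the wildcard pattern assumes the models jar follows the stanford-parser-x.x.x-models.jar naming mentioned in Note 2.

```python
import glob
import os


def find_parser_jars(jars_dir):
    """Return (parser_jar, models_jar) found in jars_dir.

    Raises IOError if either jar is missing.
    """
    parser_jar = os.path.join(jars_dir, "stanford-parser.jar")
    # The models jar carries a version number, so match it with a wildcard.
    pattern = os.path.join(jars_dir, "stanford-parser-*-models.jar")
    models = sorted(glob.glob(pattern))
    if not os.path.isfile(parser_jar):
        raise IOError("stanford-parser.jar not found in %s" % jars_dir)
    if not models:
        raise IOError("no stanford-parser-*-models.jar found in %s" % jars_dir)
    # If several models jars are present, take the last one in sorted order.
    return parser_jar, models[-1]


def configure_environment(jars_dir):
    """Check the folder, then point both environment variables at it."""
    jars = find_parser_jars(jars_dir)
    os.environ["STANFORD_PARSER"] = jars_dir
    os.environ["STANFORD_MODELS"] = jars_dir
    return jars
```

Calling configure_environment('/path/to/stanford/jars') before creating the StanfordParser instance fails early with a readable message if a jar is missing, instead of failing later inside the Java call.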
