How to use Stanford Parser in NLTK using Python

Question

Is it possible to use the Stanford Parser in NLTK? (I am not talking about the Stanford POS tagger.)

Recommended answer

Note that this answer applies to NLTK v3.0, and not to more recent versions.

Sure, try the following in Python:

import os
from nltk.parse import stanford

# Both environment variables point to the folder that holds the jars.
os.environ['STANFORD_PARSER'] = '/path/to/stanford/jars'
os.environ['STANFORD_MODELS'] = '/path/to/stanford/jars'

parser = stanford.StanfordParser(model_path="/location/of/the/englishPCFG.ser.gz")
sentences = parser.raw_parse_sents(("Hello, My name is Melroy.", "What is your name?"))
print(sentences)

# GUI
for line in sentences:
    for sentence in line:
        sentence.draw()

Output:

[Tree('ROOT', [Tree('S', [Tree('INTJ', [Tree('UH', ['Hello'])]), Tree(',', [',']), Tree('NP', [Tree('PRP$', ['My']), Tree('NN', ['name'])]), Tree('VP', [Tree('VBZ', ['is']), Tree('ADJP', [Tree('JJ', ['Melroy'])])]), Tree('.', ['.'])])]), Tree('ROOT', [Tree('SBARQ', [Tree('WHNP', [Tree('WP', ['What'])]), Tree('SQ', [Tree('VBZ', ['is']), Tree('NP', [Tree('PRP$', ['your']), Tree('NN', ['name'])])]), Tree('.', ['?'])])])]

Note 1: In this example both the parser and model jars are in the same folder.

Note 2:

  • The Stanford parser file is named: stanford-parser.jar
  • The Stanford models file is named: stanford-parser-x.x.x-models.jar
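Notes 1 and 2 can be checked before the parser is created. The following is a minimal sketch, assuming Python 3 and a single folder that holds both jars; `configure_stanford_env` is a hypothetical helper name, not part of NLTK.

```python
import glob
import os


def configure_stanford_env(jars_dir):
    """Check that both jars exist in jars_dir, then point the env vars at it."""
    parser_jar = os.path.join(jars_dir, "stanford-parser.jar")
    # The models jar carries a version number, so match it with a wildcard.
    model_jars = glob.glob(os.path.join(jars_dir, "stanford-parser-*-models.jar"))
    if not os.path.isfile(parser_jar):
        raise FileNotFoundError("missing " + parser_jar)
    if not model_jars:
        raise FileNotFoundError("no stanford-parser-*-models.jar in " + jars_dir)
    os.environ["STANFORD_PARSER"] = jars_dir
    os.environ["STANFORD_MODELS"] = jars_dir
    # Return the jar paths that were found (the last models jar in sorted order).
    return parser_jar, sorted(model_jars)[-1]
```

Calling `configure_stanford_env('/path/to/stanford/jars')` before constructing the parser fails early with a readable message if a jar is missing.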

Note 3: The englishPCFG.ser.gz file can be found inside the models.jar file (/edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz). Use an archive manager to 'unzip' the models.jar file.
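If no archive manager is at hand, the extraction in Note 3 can also be done from Python, because a jar file is a zip archive. This is a sketch under that assumption; `extract_english_pcfg` and `MODEL_MEMBER` are hypothetical names chosen for illustration, and the member path is the one given in Note 3 without the leading slash.

```python
import os
import zipfile

# Path of the model inside the models jar, as given in Note 3.
MODEL_MEMBER = "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz"


def extract_english_pcfg(models_jar, dest_dir):
    """Extract englishPCFG.ser.gz from the models jar and return its full path."""
    # A .jar file is a zip archive, so the standard zipfile module can read it.
    with zipfile.ZipFile(models_jar) as jar:
        # extract() recreates the member's directory structure under dest_dir.
        return os.path.abspath(jar.extract(MODEL_MEMBER, path=dest_dir))
```

The returned path is what you would pass as `model_path` when creating the `StanfordParser` instance.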

Note 4: Make sure you are running Java 8 (JRE or JDK 1.8). With an older Java runtime you will get: Unsupported major.minor version 52.0.
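Class file version 52.0 corresponds to Java 8, so the error in Note 4 means the installed runtime is older than that. The sketch below checks the version up front; `parse_java_major` and `installed_java_major` are hypothetical helper names, and the parsing assumes the usual `version "..."` line printed by `java -version`.

```python
import re
import subprocess


def parse_java_major(version_text):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_text)
    if match is None:
        return None
    first, second = match.group(1), match.group(2)
    # Releases up to Java 8 report themselves as 1.x (for example 1.8.0_201).
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def installed_java_major():
    """Run `java -version` and parse its output, which goes to stderr."""
    result = subprocess.run(["java", "-version"], stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, universal_newlines=True)
    return parse_java_major(result.stderr + result.stdout)
```

A result below 8 from `installed_java_major()` would explain the error; the call itself raises `FileNotFoundError` if no `java` executable is on the PATH.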

  1. Download NLTK v3 from https://github.com/nltk/nltk and install it:

sudo python setup.py install

You can use the NLTK downloader to get the Stanford Parser, using Python:

import nltk
nltk.download()

  • Try my example! (Don't forget to change the jar paths and to change the model path to the ser.gz location.)

    Or:

    1. Download and install NLTK v3, same as above.

    2. Download the latest version (the current file name is stanford-parser-full-2015-01-29.zip) from: http://nlp.stanford.edu/software/lex-parser.shtml#Download

    3. Extract stanford-parser-full-20xx-xx-xx.zip.

    4. Create a new folder ('jars' in my example). Place these extracted files into the jars folder: stanford-parser-3.x.x-models.jar and stanford-parser.jar.

       As shown above, you can use the environment variables (STANFORD_PARSER & STANFORD_MODELS) to point to this 'jars' folder. I'm using Linux, so if you use Windows, use something like: C://folder//jars.

    5. Open stanford-parser-3.x.x-models.jar using an archive manager (7zip).

    6. Browse inside the jar file to edu/stanford/nlp/models/lexparser. Again, extract the file called 'englishPCFG.ser.gz'. Remember the location where you extract this ser.gz file.

    7. When creating a StanfordParser instance, you can provide the model path as a parameter. This is the complete path to the model, in our case /location/of/englishPCFG.ser.gz.

    8. Try my example! (Don't forget to change the jar paths and to change the model path to the ser.gz location.)
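The step of gathering the two jars into one folder can be scripted as well. This is a rough sketch assuming Python 3 and that both jars sit at the top level of the unzipped release folder; `collect_jars` is a hypothetical helper name, not an NLTK or Stanford function.

```python
import glob
import os
import shutil


def collect_jars(extracted_dir, jars_dir):
    """Copy the parser jar and the models jar from the unzipped release into jars_dir."""
    os.makedirs(jars_dir, exist_ok=True)
    copied = []
    # Only these two files are needed; other jars in the release are ignored.
    for pattern in ("stanford-parser.jar", "stanford-parser-*-models.jar"):
        for source in glob.glob(os.path.join(extracted_dir, pattern)):
            shutil.copy(source, jars_dir)
            copied.append(os.path.basename(source))
    return sorted(copied)
```

The returned list shows which files were copied, so an empty or one-element result signals that the release folder does not have the expected layout.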
