NLTK: How do I traverse a noun phrase to return list of strings?

Views: 201 | Published: 2020/5/18 1:14:48 | Tags: python parsing recursion nltk traversal

Question

In NLTK, how do I traverse a parsed sentence to return a list of noun phrase strings?

I have two goals:
(1) Create the list of Noun Phrases instead of printing them using the 'traverse()' method. I presently use StringIO to record the output of the existing traverse() method. That is not an acceptable solution.
(2) De-parse the Noun Phrase string so: '(NP Michael/NNP Jackson/NNP)' becomes 'Michael Jackson'. Is there a method in NLTK to de-parse?

The NLTK documentation recommends using traverse() to view the Noun Phrase, but how do I capture the 't' in this recursive method so I generate a list of string Noun Phrases?

import nltk
from nltk.tag import pos_tag

def traverse(t):
    try:
        t.label()
    except AttributeError:
        return  # t is a leaf, i.e. a (word, tag) tuple, not a subtree
    else:
        if t.label() == 'NP':
            print(t)  # or do something else
        else:
            for child in t:
                traverse(child)

def nounPhrase(tagged_sent):
    # tagged_sent is a list of (word, part-of-speech) tuples from pos_tag
    # Define several tag patterns
    grammar = r"""
      NP: {<DT|PP\$>?<JJ>*<NN>}   # chunk determiner/possessive, adjectives and noun
      {<NNP>+}                # chunk sequences of proper nouns
      {<NN>+}                 # chunk consecutive nouns
      """
    cp = nltk.RegexpParser(grammar)  # Define Parser
    SentenceTree = cp.parse(tagged_sent)
    NounPhrases = traverse(SentenceTree)   # collect Noun Phrase
    return(NounPhrases)

sentence = "Michael Jackson likes to eat at McDonalds"
tagged_sent = pos_tag(sentence.split())
NP = nounPhrase(tagged_sent)
print(NP)

This presently prints:
(NP Michael/NNP Jackson/NNP)
(NP McDonalds/NNP)
and stores None in NP, because traverse() prints each noun phrase and never returns anything.
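
One possible way to get a list back instead of printed output is to have the recursive function return its results and merge them on the way up. The sketch below is not taken from the original thread. The function name `collect_noun_phrases` is hypothetical, and the sentence is tagged by hand (an assumption, so the example runs without downloading a tagger model). It also covers goal (2): the leaves of an NP chunk are (word, tag) tuples, so joining only the words turns '(NP Michael/NNP Jackson/NNP)' into 'Michael Jackson'.

```python
import nltk


def collect_noun_phrases(t):
    """Return the NP chunks under t as plain strings (hypothetical helper)."""
    if not isinstance(t, nltk.Tree):
        return []  # a leaf such as ('likes', 'VBZ') contributes nothing
    if t.label() == 'NP':
        # leaves() yields (word, tag) tuples; keep the words only
        return [' '.join(word for word, tag in t.leaves())]
    phrases = []
    for child in t:
        phrases.extend(collect_noun_phrases(child))
    return phrases


grammar = r"""
  NP: {<DT|PP\$>?<JJ>*<NN>}   # chunk determiner/possessive, adjectives and noun
      {<NNP>+}                # chunk sequences of proper nouns
      {<NN>+}                 # chunk consecutive nouns
"""
cp = nltk.RegexpParser(grammar)

# Hand-tagged stand-in for pos_tag(sentence.split()); the tags are assumed
tagged_sent = [('Michael', 'NNP'), ('Jackson', 'NNP'), ('likes', 'VBZ'),
               ('to', 'TO'), ('eat', 'VB'), ('at', 'IN'), ('McDonalds', 'NNP')]

tree = cp.parse(tagged_sent)
print(collect_noun_phrases(tree))  # ['Michael Jackson', 'McDonalds']
```

Like the original traverse(), this stops descending once it reaches an NP, so chunks nested inside another NP would not be listed separately. If that matters, `tree.subtrees(filter=lambda s: s.label() == 'NP')` iterates over every matching subtree without hand-written recursion, and `nltk.tag.untag()` can replace the word-extracting comprehension.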
