Get corresponding verbs and nouns for adverbs and adjectives


Question

How can I get the corresponding verbs and nouns for adverbs and adjectives in Python? Simple succession and precedence may not be very accurate: there can be intervening stop words, e.g. "to" in "I am delighted to learn ...".

I couldn't find any library for this, or even the problem statement formalised as such.

My code so far is below. I want to return the corresponding verb for each adverb and the corresponding noun for each adjective in the sentence. Please help.

Code:

from nltk import pos_tag
from nltk.tokenize import TreebankWordTokenizer

def pos_func(input_text):
    # Tokenize the input, POS-tag it, then bucket the tokens by part of speech
    tokens = tokenize_words(input_text)
    tagged = pos_tag(tokens)
    return pos_store(tagged)

def pos_store(tagged):
    # Bucket (word, tag) pairs by their Penn Treebank tag prefix
    verbs = []
    adjectives = []
    adverbs = []
    nouns = []
    for word, pos in tagged:
        if pos[0] == 'V':
            verbs.append(word)
        elif pos[0] == 'N':
            nouns.append(word)
        elif pos[0] == 'J':
            adjectives.append(word)
        elif pos[0:2] == 'RB':
            adverbs.append(word)
    return verbs, nouns, adjectives, adverbs


def tokenize_words(text):
    # Tokenize, then merge the contraction suffixes that the Treebank
    # tokenizer splits off ("don't" -> "do", "n't") back onto the
    # preceding token
    tokens = TreebankWordTokenizer().tokenize(text)
    contractions = ["n't", "'ll", "'m"]
    fix = [i for i, token in enumerate(tokens) if token in contractions]
    fix_offset = 0
    for fix_id in fix:
        idx = fix_id - 1 - fix_offset
        tokens[idx] = tokens[idx] + tokens[idx + 1]
        del tokens[idx + 1]
        fix_offset += 1
    return tokens

Answer

The general problem you are trying to solve is called dependency parsing. To extract such relations between words you need more than just the linear sequence of words that a simple POS-tagging analysis offers. Consider the following sentence:

"He bought a beautiful and fast car." You would extract (beautiful, car) and (fast, car). The problem you face is greater than just filtering stop words between a noun and its modifier. A parse-tree analysis will give you a better idea of why this is not something you can solve using the word sequence alone.

This is the parse tree for our sentence:

(ROOT
  (S
    (NP (PRP He))
    (VP (VBD bought)
      (NP (DT a)
        (ADJP (JJ beautiful)
          (CC and)
          (JJ fast))
        (NN car)))
    (. .)))

As you can see, "a beautiful and fast car" is a noun phrase (NP) containing a determiner (DT), an adjectival phrase (ADJP, "beautiful and fast") and a noun (NN, "car"). One approach used for some time was to create a rule-based system that extracts such pairs from the parse tree. Fortunately, something even better has been developed that addresses your problem directly.
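As an illustration of the rule-based approach, here is a minimal sketch using NLTK's Tree class on the bracketed parse above. The adj_noun_pairs helper is hypothetical and deliberately naive: it pairs every adjective inside an NP with that NP's nouns, which happens to work for this sentence.

```python
from nltk.tree import Tree  # assumption: nltk is installed

# The bracketed parse from above, read into an NLTK Tree.
parse = Tree.fromstring(
    "(ROOT (S (NP (PRP He)) (VP (VBD bought) "
    "(NP (DT a) (ADJP (JJ beautiful) (CC and) (JJ fast)) (NN car))) (. .)))"
)

def adj_noun_pairs(tree):
    # Pair every adjective (JJ) inside an NP with that NP's nouns (NN*).
    pairs = []
    for np in tree.subtrees(lambda t: t.label() == "NP"):
        tagged = np.pos()  # (word, tag) pairs under this NP
        nouns = [w for w, t in tagged if t.startswith("NN")]
        adjs = [w for w, t in tagged if t == "JJ"]
        pairs.extend((adj, noun) for adj in adjs for noun in nouns)
    return pairs

print(adj_noun_pairs(parse))  # [('beautiful', 'car'), ('fast', 'car')]
```

Rules like this break quickly on nested NPs and coordination, which is exactly why the typed-dependency output below is preferable.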

The typed dependencies for the sentence are:

nsubj(bought-2, He-1)
root(ROOT-0, bought-2)
det(car-7, a-3)
amod(car-7, beautiful-4)
cc(beautiful-4, and-5)
conj:and(beautiful-4, fast-6)
amod(car-7, fast-6)
dobj(bought-2, car-7)

如您所见,这正是您所需要的.这些是类型化的依赖项,因此您还需要过滤感兴趣的依赖项(在您的情况下为 amod advmod )

As you can see this is exactly what you need. These are typed dependencies, so you'll also need to filter the ones you are interested in(amod, advmod in your case)
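A minimal sketch of that filtering step, using the typed dependencies above written as plain (relation, head, dependent) tuples. The tuple representation is an assumption for illustration; a real parser wrapper returns its own objects, which you would unpack the same way.

```python
# Typed dependencies from the parse above, as (relation, head, dependent).
deps = [
    ("nsubj", "bought", "He"),
    ("root", "ROOT", "bought"),
    ("det", "car", "a"),
    ("amod", "car", "beautiful"),
    ("cc", "beautiful", "and"),
    ("conj:and", "beautiful", "fast"),
    ("amod", "car", "fast"),
    ("dobj", "bought", "car"),
]

def modifier_pairs(deps, relations=("amod", "advmod")):
    # Keep only adjective->noun (amod) and adverb->head (advmod) relations,
    # returned as (modifier, head) pairs.
    return [(dep, head) for rel, head, dep in deps if rel in relations]

print(modifier_pairs(deps))  # [('beautiful', 'car'), ('fast', 'car')]
```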

You can find the full list of dependency types here: http://nlp.stanford.edu/software/dependencies_manual.pdf
The Stanford Parser demo is here: http://nlp.stanford.edu:8080/parser/
The Stanford CoreNLP demo (for the cool visualisations) is here: http://nlp.stanford.edu:8080/corenlp/

You can read a great article about creating a dependency parser in Python here (you will need training data, though): https://honnibal.wordpress.com/2013/12/18/a-simple-fast-algorithm-for-natural-language-dependency-parsing/

Python interface to CoreNLP: https://github.com/dasmith/stanford-corenlp-python

You can also try writing your own dependency grammar; NLTK offers an API for that (look for the chapter "5 Dependencies and Dependency Grammar"): http://www.nltk.org/book/ch08.html
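For the NLTK route, here is a minimal sketch of a hand-written dependency grammar covering only the example sentence. The head-dependent rules below are my own guesses for this one sentence, not taken from the book, and this approach obviously does not generalise beyond the words listed in the grammar.

```python
import nltk  # assumption: nltk is installed

# A tiny hand-written dependency grammar for the example sentence only:
# 'bought' is the root, 'car' heads its determiner and both adjectives.
grammar = nltk.DependencyGrammar.fromstring("""
    'bought' -> 'He' | 'car'
    'car' -> 'a' | 'beautiful' | 'fast'
    'beautiful' -> 'and'
""")

parser = nltk.ProjectiveDependencyParser(grammar)
for tree in parser.parse("He bought a beautiful and fast car".split()):
    print(tree)  # a projective tree rooted at 'bought'
```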
