How can I extract GPE (location) using NLTK ne_chunk?

Problem description

I am trying to write code that checks the weather conditions for a particular area, using the OpenWeatherMap API and NLTK for named entity recognition. However, I cannot find a way to pass the entity labeled GPE (which gives the location, Chicago in this case) to my API request. Please help me with the syntax. The code is given below.

Thanks for your help.

import nltk
import requests
from nltk import word_tokenize
from nltk.corpus import stopwords

sentence = "What is the weather in Chicago today?"
tokens = word_tokenize(sentence)

stop_words = set(stopwords.words('english'))

# Drop stop words before tagging
clean_tokens = [w for w in tokens if w not in stop_words]

tagged = nltk.pos_tag(clean_tokens)

print(nltk.ne_chunk(tagged))

Recommended answer

GPE is the label of a Tree object returned by the pre-trained ne_chunk model.

>>> from nltk import word_tokenize, pos_tag, ne_chunk
>>> sent = "What is the weather in Chicago today?"
>>> ne_chunk(pos_tag(word_tokenize(sent)))
Tree('S', [('What', 'WP'), ('is', 'VBZ'), ('the', 'DT'), ('weather', 'NN'), ('in', 'IN'), Tree('GPE', [('Chicago', 'NNP')]), ('today', 'NN'), ('?', '.')])

To traverse the tree, see "How to traverse an NLTK Tree object?"
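As a rough illustration of that traversal, the sketch below walks the top level of a chunk tree and collects the text of every subtree with a given label. The function name labeled_entities is hypothetical, not part of NLTK; it relies only on the label() and leaves() methods that NLTK's Tree provides, and it assumes entities sit directly under the root, as in the output above.

```python
def labeled_entities(tree, label):
    """Return the text of each direct child subtree whose label matches."""
    entities = []
    for node in tree:
        # Plain (token, pos) tuples have no label(); only subtrees do.
        if hasattr(node, "label") and node.label() == label:
            entities.append(" ".join(token for token, pos in node.leaves()))
    return entities
```

Calling labeled_entities(ne_chunk(pos_tag(word_tokenize(sent))), 'GPE') on the sentence above should pick out the Chicago subtree, provided the chunker labels it GPE as shown.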

Perhaps you are looking for a slight modification of the approach in "NLTK Named Entity recognition to a Python list":

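from nltk import word_tokenize, pos_tag, ne_chunk
from nltk import Tree

def get_continuous_chunks(text, label):
    # Tokenize, POS-tag, then run the pre-trained named-entity chunker.
    chunked = ne_chunk(pos_tag(word_tokenize(text)))
    continuous_chunk = []  # unique entities, in order of appearance
    current_chunk = []     # adjacent subtrees that carry the requested label

    for subtree in chunked:
        if isinstance(subtree, Tree) and subtree.label() == label:
            current_chunk.append(" ".join(token for token, pos in subtree.leaves()))
        elif current_chunk:
            # A non-matching node ends the current run of labeled subtrees.
            named_entity = " ".join(current_chunk)
            if named_entity not in continuous_chunk:
                continuous_chunk.append(named_entity)
            current_chunk = []  # reset even when the entity is a duplicate

    # Flush an entity that ends the text, where no later node triggers the elif.
    if current_chunk:
        named_entity = " ".join(current_chunk)
        if named_entity not in continuous_chunk:
            continuous_chunk.append(named_entity)

    return continuous_chunk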

[out]:

>>> sent = "What is the weather in New York today?"
>>> get_continuous_chunks(sent, 'GPE')
['New York']

>>> sent = "What is the weather in New York and Chicago today?"
>>> get_continuous_chunks(sent, 'GPE')
['New York', 'Chicago']
