Python: extracting a sentence with a particular word

Question

I have a JSON file containing text like:

dr. goldberg offers everything.parking is good.he's nice and easy to talk

How can I extract only the sentence that contains the keyword "parking"? I don't need the other two sentences.

I tried:

with open("test_data.json") as f:
    for line in f:
        if "parking" in line:
            print(line)

It prints the whole text rather than just that sentence.

I even tried using regex:

import re

with open("test_data.json") as f:
    for line in f:
        line = line.rstrip()
        if re.search("parking", line):
            print(line)

Even that shows the same result.

Recommended answer

You can use nltk.tokenize:

from nltk.tokenize import sent_tokenize, word_tokenize

# Note: NLTK's tokenizers rely on the Punkt data, which may need a one-time nltk.download().
with open("test_data.json") as f:
    text = f.read()
sentences = sent_tokenize(text)
# Keep every sentence whose word tokens include the keyword.
my_sentence = [sent for sent in sentences if 'parking' in word_tokenize(sent)]
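One caveat: reading the file as raw text also feeds the JSON syntax (braces, quotes, keys) into the tokenizer. The question does not show how the file is laid out, so the following is only a sketch under an assumed layout: one JSON object per line, with the review stored under a hypothetical "text" key. The helper name iter_texts is also made up for illustration.

```python
import json

def iter_texts(lines, key="text"):
    """Yield the string stored under `key` in each JSON line (assumed layout)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Only dict records with a string under `key` are of interest here.
        if isinstance(record, dict) and isinstance(record.get(key), str):
            yield record[key]

# In-memory stand-in for the lines of test_data.json (hypothetical content).
sample_lines = ['{"text": "dr. goldberg offers everything. parking is good."}', ""]
texts = list(iter_texts(sample_lines))
print(texts)
```

With a real file, the same function could be fed the open file object instead of sample_lines, and each yielded string passed to the sentence tokenizer.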

For a more complete approach, you can wrap this in a function:

>>> def sentence_finder(text,word):
...    sentences=sent_tokenize(text)
...    return [sent for sent in sentences if word in word_tokenize(sent)]

>>> s="dr. goldberg offers everything. parking is good. he's nice and easy to talk"
>>> sentence_finder(s,'parking')
['parking is good.']
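Note that the sample in the question has no space after the periods ("everything.parking"), while the string used above does. If the real data lacks those spaces, a sentence tokenizer may not see the boundaries. As a rough alternative that needs only the standard library, here is a sketch that treats every period, exclamation mark, or question mark as a sentence end. This is a swapped-in, naive technique, not what nltk does: it will also cut at abbreviations such as "dr.", and the name find_sentences is made up for illustration.

```python
import re

def find_sentences(text, word):
    """Return the fragments of `text` that contain `word` as a whole word."""
    # Naive split: any run of '.', '!' or '?' ends a fragment,
    # whether or not whitespace follows it.
    fragments = [part.strip() for part in re.split(r"[.!?]+", text)]
    # \b anchors make "parking" match as a word, not inside e.g. "parkings".
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [frag for frag in fragments if frag and pattern.search(frag)]

sample = "dr. goldberg offers everything.parking is good.he's nice and easy to talk"
result = find_sentences(sample, "parking")
print(result)  # ['parking is good']
```

The terminating punctuation is dropped by the split, so the result is the bare fragment rather than the full sentence with its period.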
