Terminate multiple threads when any thread completes a task


Question

I am new to both Python and threads. I have written Python code that acts as a web crawler and searches sites for a specific keyword. My question is: how can I use threads to run three different instances of my class at the same time? When one of the instances finds the keyword, all three must close and stop crawling the web. Here is some code.

class Crawler:
    def __init__(self):
        # the actual code for finding the keyword
        pass

def main():
    crawler = Crawler()

if __name__ == "__main__":
    main()

How can I use threads to have Crawler do three different crawls at the same time?

Recommended answer

There doesn't seem to be a (simple) way to terminate a thread in Python: the threading module has no method for killing a thread from the outside.

Here is a simple example of running multiple HTTP requests in parallel:

import threading
import urllib.request

def crawl():
    # fetch one page; every thread runs this function independently
    data = urllib.request.urlopen("http://www.google.com/").read()

    print("Read google.com")

threads = []

for n in range(10):  # start 10 threads
    thread = threading.Thread(target=crawl)
    thread.start()

    threads.append(thread)

# wait until all threads are finished

print("Waiting...")

for thread in threads:
    thread.join()

print("Complete.")
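Although a thread cannot be killed from outside, threads can be asked to stop cooperatively. The following is a minimal Python 3 sketch of that idea using `threading.Event`, not part of the original answer. The names `crawl`, `find_first`, and `page_sets` are hypothetical, and in-memory `(url, text)` pairs stand in for real HTTP requests so the example runs without network access.

```python
import queue
import threading

def crawl(pages, keyword, stop_event, results):
    """Scan pages in order; give up early once any worker has found the keyword."""
    for url, text in pages:
        if stop_event.is_set():   # another worker already succeeded
            return
        if keyword in text:
            results.put(url)      # report where the keyword was found
            stop_event.set()      # ask every other worker to stop
            return

def find_first(page_sets, keyword):
    """Run one thread per page set; return a URL containing the keyword, or None."""
    stop_event = threading.Event()
    results = queue.Queue()
    threads = [
        threading.Thread(target=crawl, args=(pages, keyword, stop_event, results))
        for pages in page_sets
    ]
    for thread in threads:
        thread.start()
    for thread in threads:        # returns quickly once stop_event is set
        thread.join()
    return None if results.empty() else results.get()

if __name__ == "__main__":
    # hypothetical data: three crawls, one of which contains the keyword
    page_sets = [
        [("a1", "nothing here"), ("a2", "still nothing")],
        [("b1", "no match"), ("b2", "the keyword is here")],
        [("c1", "nope")],
    ]
    print(find_first(page_sets, "keyword"))
```

Each worker only notices the stop request between pages, so a request that is already in flight still runs to completion. That is the trade-off compared with forcibly terminating a process.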

With additional overhead, you can use a multi-process approach that's more powerful and allows you to terminate thread-like processes.

I've extended the example to use that. I hope this will be helpful to you:

import multiprocessing
import urllib.request

def crawl(result_queue):
    data = urllib.request.urlopen("http://news.ycombinator.com/").read()

    print("Requested...")

    # placeholder condition: a non-empty string is always true, so every
    # process reports a result; replace it with a real check on `data`
    if "result found (for example)":
        result_queue.put("result!")

    print("Read site.")

if __name__ == "__main__":  # needed where child processes are spawned (e.g. Windows)
    processes = []
    result_queue = multiprocessing.Queue()

    for n in range(4):  # start 4 processes crawling for the result
        process = multiprocessing.Process(target=crawl, args=[result_queue])
        process.start()
        processes.append(process)

    print("Waiting for result...")

    result = result_queue.get()  # blocks until any of the processes has `.put()` a result

    for process in processes:  # then kill them all off
        process.terminate()

    print("Got result:", result)
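One caveat with the example above: `result_queue.get()` blocks indefinitely if no crawler ever finds a result. A possible way around this is to wait with a timeout and shut the workers down either way. The sketch below is an assumption-laden Python 3 illustration, not the answerer's code: it uses threads and `queue.Queue` to stay short, and `crawl`, `first_result`, and `plans` are hypothetical names. `multiprocessing.Queue.get` accepts a `timeout` argument as well and also raises `queue.Empty`.

```python
import queue
import threading

def crawl(worker_id, find_after, stop_event, results):
    """Hypothetical crawler: 'finds' the keyword after find_after steps (None = never)."""
    step = 0
    while not stop_event.is_set():
        if find_after is not None and step >= find_after:
            results.put(f"worker {worker_id} found it")
            return
        step += 1
        stop_event.wait(0.01)  # interruptible pause standing in for one page fetch

def first_result(plans, timeout):
    """Start one worker per plan; return the first result, or None on timeout."""
    stop_event = threading.Event()
    results = queue.Queue()
    threads = [
        threading.Thread(target=crawl, args=(i, plan, stop_event, results))
        for i, plan in enumerate(plans)
    ]
    for thread in threads:
        thread.start()
    try:
        result = results.get(timeout=timeout)  # raises queue.Empty on timeout
    except queue.Empty:
        result = None
    finally:
        stop_event.set()                       # stop everyone, found or not
        for thread in threads:
            thread.join()
    return result

if __name__ == "__main__":
    print(first_result([None, 3, None], timeout=2.0))  # only worker 1 can succeed
    print(first_result([None, None], timeout=0.1))     # nobody finds anything
```

The `finally` clause ensures the workers are told to stop and are joined whether a result arrived or the wait timed out.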
