Synchronise multiple threads in Python


Question

The class BrokenLinkTest in the code below does the following:

  1. takes a web page URL
  2. finds all the links in the web page
  3. gets the headers of the links concurrently (this is done to check whether each link is broken)
  4. prints 'completed' when all the headers are received


import threading

from bs4 import BeautifulSoup
import requests

class BrokenLinkTest(object):

    def __init__(self, url):
        self.url = url
        self.thread_count = 0
        self.lock = threading.Lock()

    def execute(self):
        soup = BeautifulSoup(requests.get(self.url).text)
        self.lock.acquire()
        for link in soup.find_all('a'):
            url = link.get('href')
            threading.Thread(target=self._check_url(url))
        self.lock.acquire()

    def _on_complete(self):
        self.thread_count -= 1
        if self.thread_count == 0: #check if all the threads are completed
            self.lock.release()
            print("completed")

    def _check_url(self, url):
        self.thread_count += 1
        print(url)
        result = requests.head(url)
        print(result)
        self._on_complete()


BrokenLinkTest("http://www.example.com").execute()

Can the concurrency/synchronization part be done in a better way? I did it using threading.Lock. This is my first experiment with Python threading.

Recommended answer

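def execute(self):
    # Name the parser explicitly; bs4 warns when it has to guess.
    soup = BeautifulSoup(requests.get(self.url).text, "html.parser")
    threads = []
    for link in soup.find_all('a'):
        url = link.get('href')
        # Pass the bound method itself; the new thread calls it with args.
        t = threading.Thread(target=self._check_url, args=(url,))
        t.start()
        threads.append(t)
    # Wait for every worker to finish before returning.
    for thread in threads:
        thread.join()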

You could use the join method to wait for all the threads to finish.
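The start/join pattern can be tried without network access. The following is only a minimal sketch: check is a hypothetical stand-in for requests.head, the URLs are made up, and the results dict and its lock are additions for illustration, not part of the original class.

```python
import threading

def check(url, results, lock):
    # Hypothetical stand-in for requests.head(url); no network access here.
    status = "ok" if url.startswith("http") else "broken"
    with lock:  # guard the shared dict while writing to it
        results[url] = status

def check_all(urls):
    results = {}
    lock = threading.Lock()
    threads = []
    for url in urls:
        t = threading.Thread(target=check, args=(url, results, lock))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()  # blocks until this worker has finished
    return results

results = check_all(["http://a.example", "ftp-broken", "https://b.example"])
print("completed", len(results))
```

Because every thread is joined before check_all returns, the caller needs no counter or manually released lock to know that the work is done.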

Note that I also added a start call and passed the bound method object as the target parameter. In your original example, you were calling _check_url in the main thread and passing its return value as the target.
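The difference between calling a function and passing it as the target can be seen in a small experiment. This is an illustrative sketch only: record, ran_in, and the thread names are hypothetical and do not appear in the original code.

```python
import threading

ran_in = []

def record(label):
    # Remember which thread actually executed this call.
    ran_in.append((label, threading.current_thread().name))

# Mistake: record("eager") runs right here in the calling thread, and its
# return value (None) becomes the target, so the new thread has nothing to do.
t1 = threading.Thread(target=record("eager"), name="worker-1")
t1.start()
t1.join()

# Fix: pass the callable itself and supply its arguments separately.
t2 = threading.Thread(target=record, args=("deferred",), name="worker-2")
t2.start()
t2.join()

print(ran_in)
```

The first entry is recorded under the name of the calling thread, while the second is recorded under worker-2, which is the behaviour the answer relies on.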
