What are the advantages of multithreaded programming in Python?


Question

When I hear about multithreaded programming, I think of an opportunity to speed up my program. Is that not the case?

import eventlet
from eventlet.green import socket
from iptools import IpRangeList


class Scanner(object):
    def __init__(self, ip_range, port_range, workers_num):
        self.workers_num = workers_num or 1000
        self.ip_range = self._get_ip_range(ip_range)
        self.port_range = self._get_port_range(port_range)
        self.scaned_range = self._get_scaned_range()

    def _get_ip_range(self, ip_range):
        return [ip for ip in IpRangeList(ip_range)]

    def _get_port_range(self, port_range):
        return [r for r in range(*port_range)]

    def _get_scaned_range(self):
        for ip in self.ip_range:
            for port in self.port_range:
                yield (ip, port)

    def scan(self, address):
        try:
            # Close the socket as soon as the connection succeeds.
            socket.create_connection(address).close()
            return True
        except Exception:
            return False

    def run(self):
        pool = eventlet.GreenPool(self.workers_num)
        for status in pool.imap(self.scan, self.scaned_range):
            if status:
                yield True

    def run_std(self):
        for status in map(self.scan, self.scaned_range):
            if status:
                yield True


if __name__ == '__main__':
    s = Scanner(('127.0.0.1'), (1, 65000), 100000)
    import time
    now = time.time()
    open_ports = [i for i in s.run()]
    print 'Eventlet time: %s (sec) open: %s' % (now - time.time(),
                                                len(open_ports))
    del s
    s = Scanner(('127.0.0.1'), (1, 65000), 100000)
    now = time.time()
    open_ports = [i for i in s.run()]
    print 'CPython time: %s (sec) open: %s' % (now - time.time(),
                                                len(open_ports))

Result:

Eventlet time: -4.40343403816 (sec) open: 2
CPython time: -4.48356699944 (sec) open: 2

My question is: if I run this code on a server instead of my laptop and set a larger number of workers, will it run faster than the CPython version? What are the advantages of threads?

ADD: I then rewrote the app using plain CPython threads:

import socket
from threading import Thread
from Queue import Queue

from iptools import IpRangeList

class Scanner(object):
    def __init__(self, ip_range, port_range, workers_num):
        self.workers_num = workers_num or 1000
        self.ip_range = self._get_ip_range(ip_range)
        self.port_range = self._get_port_range(port_range)
        self.scaned_range = [i for i in self._get_scaned_range()]

    def _get_ip_range(self, ip_range):
        return [ip for ip in IpRangeList(ip_range)]

    def _get_port_range(self, port_range):
        return [r for r in range(*port_range)]

    def _get_scaned_range(self):
        for ip in self.ip_range:
            for port in self.port_range:
                yield (ip, port)

    def scan(self, q):
        while True:
            try:
                r = bool(socket.create_connection(q.get()))
            except Exception:
                r = False
            q.task_done()

    def run(self):
        queue = Queue()
        for address in self.scaned_range:
            queue.put(address)
        for i in range(self.workers_num):
            worker = Thread(target=self.scan, args=(queue,))
            worker.setDaemon(True)
            worker.start()
        queue.join()


if __name__ == '__main__':
    s = Scanner(('127.0.0.1'), (1, 65000), 5)
    import time
    now = time.time()
    s.run()
    print time.time() - now

The result is:

 Cpython's thread: 1.4 sec

I think this is a very good result. As a baseline, I use the nmap scan time:

$ nmap 127.0.0.1 -p1-65000

Starting Nmap 5.21 ( http://nmap.org ) at 2012-10-22 18:43 MSK
Nmap scan report for localhost (127.0.0.1)
Host is up (0.00021s latency).
Not shown: 64986 closed ports
PORT      STATE SERVICE
53/tcp    open  domain
80/tcp    open  http
443/tcp   open  https
631/tcp   open  ipp
3306/tcp  open  mysql
6379/tcp  open  unknown
8000/tcp  open  http-alt
8020/tcp  open  unknown
8888/tcp  open  sun-answerbook
9980/tcp  open  unknown
27017/tcp open  unknown
27634/tcp open  unknown
28017/tcp open  unknown
39900/tcp open  unknown

Nmap done: 1 IP address (1 host up) scanned in 0.85 seconds

My question now is: how are threads implemented in Eventlet? As far as I understand, they are not real threads but something specific to Eventlet, so why don't they speed up the task?

Eventlet is used by many major projects, such as OpenStack. But why? Just to run heavy DB queries asynchronously, or for something else?

Recommended answer

CPython threads:

  • Each CPython thread maps to an OS-level thread (a pthread, i.e. a lightweight process, on Linux).

If many CPython threads execute Python code concurrently, the global interpreter lock (GIL) allows only one of them to interpret Python bytecode at a time. The remaining threads block on the GIL whenever they need to interpret Python instructions. With many Python threads, this contention slows things down a lot.
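The effect described above can be observed with a small experiment. This is a minimal sketch, assuming a standard CPython build with the GIL enabled; the function names `count_down`, `run_sequential`, and `run_threaded` are made up for illustration. It runs the same pure-Python loop several times, first one after another and then in separate threads. On such a build the threaded run is usually no faster, though exact timings depend on the machine and interpreter version.

```python
import threading
import time


def count_down(n):
    """Pure-Python CPU-bound loop; the running thread needs the GIL throughout."""
    while n > 0:
        n -= 1
    return n


def run_sequential(n, repeats):
    """Run the loop `repeats` times in the calling thread and return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def run_threaded(n, repeats):
    """Run the loop once in each of `repeats` threads and return elapsed seconds."""
    threads = [
        threading.Thread(target=count_down, args=(n,)) for _ in range(repeats)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n, repeats = 2_000_000, 4
    print("sequential: %.2f s" % run_sequential(n, repeats))
    print("threaded:   %.2f s" % run_threaded(n, repeats))
```

For CPU-bound work like this, `multiprocessing` is the usual way to use several cores, since each process has its own interpreter and its own GIL.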

However, if your Python code spends most of its time inside networking operations (send, connect, etc.), the GIL is released while a thread waits on the network, so fewer threads compete for it to interpret code. In that case the effect of the GIL is not so bad.
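The I/O-bound case can be sketched in the same way. In the example below, `time.sleep` stands in for a blocking network call (an assumption made so the example runs without a network); like a blocking socket call, it releases the GIL while waiting. The names `fake_network_call` and `run_in_threads` are hypothetical.

```python
import threading
import time


def fake_network_call(delay, results, index):
    """Stand-in for a blocking socket call; sleeping releases the GIL."""
    time.sleep(delay)
    results[index] = True


def run_in_threads(task_count, delay):
    """Run `task_count` simulated calls in parallel threads.

    Returns the list of results and the elapsed wall-clock time in seconds.
    """
    results = [False] * task_count
    threads = [
        threading.Thread(target=fake_network_call, args=(delay, results, i))
        for i in range(task_count)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_in_threads(task_count=5, delay=0.2)
    print("all done: %s, elapsed: %.2f s" % (all(results), elapsed))
```

Because the waits overlap, the total time should be close to one `delay` rather than `task_count` times `delay`, which is why threads help a port scanner even with the GIL.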

Eventlet / green threads:

  • From the above we know that CPython has a performance limitation with threads. Eventlet tries to work around it by using a single thread running on a single core and non-blocking I/O for everything.

Green threads are not real OS-level threads. They are a user-space abstraction for concurrency. Most importantly, N green threads map to 1 OS thread, so they never contend with each other for the GIL.

Green threads cooperatively yield to each other instead of being scheduled preemptively. For networking operations, the socket libraries are patched at run time (monkey patching) so that all calls are non-blocking.
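Cooperative yielding can be illustrated with plain generators. This is only a toy model of the idea, not how Eventlet is implemented (Eventlet builds on greenlets and an event loop); the names `worker` and `run_round_robin` are made up for this sketch. Each task runs until it reaches `yield`, at which point the scheduler switches to the next task, all inside one OS thread.

```python
from collections import deque


def worker(name, steps, log):
    """A toy task that records its progress and yields after every step."""
    for step in range(steps):
        log.append((name, step))
        yield  # hand control back to the scheduler


def run_round_robin(tasks):
    """Run generator-based tasks in one thread, switching at each yield."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # the task has finished; do not reschedule it
        ready.append(task)


log = []
run_round_robin([worker("a", 2, log), worker("b", 2, log)])
print(log)  # [('a', 0), ('b', 0), ('a', 1), ('b', 1)]
```

A task that never yields would starve all the others, which is why green-thread libraries patch blocking calls so that they yield while waiting.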

So even when you create a pool of Eventlet green threads, you are actually creating only one OS-level thread, and this single thread executes all the green threads. The idea is that if all the networking calls are non-blocking, this should be faster than Python threads in some cases.
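The same single-thread, non-blocking model is available in the standard library. The sketch below uses `asyncio` rather than Eventlet, so it shows the general technique and not Eventlet's API; the helper names `is_open` and `scan` and the `limit` parameter are assumptions made for illustration. All connection attempts are multiplexed on one OS thread, and a semaphore caps how many are in flight at once, much as a green pool size would.

```python
import asyncio


async def is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        _reader, writer = await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout
        )
    except (OSError, asyncio.TimeoutError):
        return False
    writer.close()
    return True


async def scan(host, ports, limit=100):
    """Check the given ports concurrently; return the list of open ones."""
    semaphore = asyncio.Semaphore(limit)  # cap on simultaneous attempts

    async def check(port):
        async with semaphore:
            return port, await is_open(host, port)

    results = await asyncio.gather(*(check(port) for port in ports))
    return [port for port, ok in results if ok]
```

A call such as `asyncio.run(scan("127.0.0.1", range(1, 1025)))` would return the open ports in that range. Whether this beats a thread pool depends on the workload; for a fast local scan the difference may be small.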

Summary

For your program above, "true" concurrency (the CPython version, with 5 threads running on multiple processors) happens to be faster than the Eventlet model (a single thread running on 1 processor).

Some CPython workloads perform badly with many threads/cores (e.g. 100 clients connecting to a server, with one thread per client). Eventlet is an elegant programming model for such workloads, so it is used in several places.
