When to use threading and how many threads to use


Question

I have a project for work. We had written a module, and there was a #TODO to implement threading to improve it. I'm a fairly new Python programmer and decided to take a whack at it. While learning and implementing the threading, I had a question similar to "How many threads is too many?": we have a queue of maybe 6 objects that need to be processed, so why make 6 threads (or any threads at all) to process objects in a list or queue when the processing time is negligible anyway? (Each object takes at most about 2 seconds to process.)

So I ran a little experiment. I wanted to know whether there were performance gains from using threading. See my Python code below:

import threading
import queue
import math
import time

results_total = []
results_calculation = []
results_threads = []

class MyThread(threading.Thread):
    def __init__(self, thread_id, q):
        threading.Thread.__init__(self)
        self.threadID = thread_id
        self.q = q

    def run(self):
        # print("Starting " + self.name)
        process_data(self.q)
        # print("Exiting " + self.name)


def process_data(q):
    while not exitFlag:
        queueLock.acquire()
        if not workQueue.empty():
            data = q.get()
            # 1 is not prime, so only values above 1 start out as candidates
            potentially_prime = data > 1
            queueLock.release()
            # check if the data is a prime number
            # print("Testing {0} for primality.".format(data))
            for i in range(2, int(math.sqrt(data)+1)):
                if data % i == 0:
                    potentially_prime = False
                    break
            if potentially_prime is True:
                prime_numbers.append(data)
        else:
            queueLock.release()

for j in [1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50, 75, 100, 150, 250, 500,
          750, 1000, 2500, 5000, 10000]:
    threads = []
    numberList = list(range(1, 10001))
    queueLock = threading.Lock()
    workQueue = queue.Queue()
    numberThreads = j
    prime_numbers = list()
    exitFlag = 0

    start_time_total = time.time()
    # Create new threads
    for threadID in range(0, numberThreads):
        thread = MyThread(threadID, workQueue)
        thread.start()
        threads.append(thread)

    # Fill the queue
    queueLock.acquire()
    # print("Filling the queue...")
    for number in numberList:
        workQueue.put(number)
    queueLock.release()
    # print("Queue filled...")
    start_time_calculation = time.time()
    # Wait for queue to empty
    while not workQueue.empty():
        pass

    # Notify threads it's time to exit
    exitFlag = 1

    # Wait for all threads to complete
    for t in threads:
        t.join()
    # print("Exiting Main Thread")
    # print(prime_numbers)
    end_time = time.time()
    results_total.append(
            "The test took {0} seconds for {1} threads.".format(
                end_time - start_time_total, j)
            )
    results_calculation.append(
            "The calculation took {0} seconds for {1} threads.".format(
                    end_time - start_time_calculation, j)
            )
    results_threads.append(
            "The thread setup time took {0} seconds for {1} threads.".format(
                    start_time_calculation - start_time_total, j)
            )
for result in results_total:
    print(result)
for result in results_calculation:
    print(result)
for result in results_threads:
    print(result)

This test finds the prime numbers between 1 and 10000. The setup is taken almost directly from https://www.tutorialspoint.com/python3/python_multithreading.htm, but instead of printing a simple string I ask the threads to find prime numbers. This is not my real-world application, but I can't currently test the code I've written for the module, and I thought this was a good test to measure the effect of additional threads. My real-world application deals with talking to multiple serial devices. I ran the test 5 times and averaged the times. Here are the results in a graph:

My questions regarding threading and this test are as follows:

  1. Is this test even a good representation of how threads should be used? This is not a server/client situation. In terms of efficiency, is it better to avoid parallelism when you aren't serving clients or dealing with work being added to a queue?

If the answer to 1 is "No, this test isn't a place where one should use threads", then, generally speaking, when is?

If the answer to 1 is "Yes, it is OK to use threads in that case", why does adding threads end up taking longer, with the time quickly reaching a plateau? Or rather, why would one want to use threads at all when they take many times longer than calculating the result in a loop?

I notice that as the work-to-threads ratio gets closer to 1:1, the time taken to set up the threads becomes longer. So is threading only useful where you create threads once and keep them alive as long as possible, to handle requests that might be enqueued faster than they can be processed?

Recommended answer

No, this is not a good place to use threads.

Generally, you want to use threads where your code is IO-bound; that is, where it spends a significant amount of time waiting on input or output. An example might be downloading data from a list of URLs in parallel: the code can start requesting the data from the next URL while still waiting for the previous one to return.
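To make the IO-bound case concrete, here is a minimal sketch, not a definitive implementation. It assumes a hypothetical `fake_download` function that sleeps instead of performing a real network request, and the URLs are placeholders used only as labels. Because a sleeping (or waiting) thread lets other threads run, the waits overlap when a thread pool is used.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical URLs, used only as labels; nothing is actually downloaded.
URLS = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
    "https://example.com/d",
]


def fake_download(url, delay=0.2):
    """Stand-in for a blocking request: sleeps instead of doing real I/O."""
    time.sleep(delay)  # the thread waits here, as it would on a socket
    return "data from " + url


def download_sequentially(urls):
    # Each wait has to finish before the next one starts.
    return [fake_download(url) for url in urls]


def download_with_threads(urls, max_workers=4):
    # executor.map returns results in the same order as the input URLs.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(fake_download, urls))


def timed(func, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    _, sequential_time = timed(download_sequentially, URLS)
    _, threaded_time = timed(download_with_threads, URLS)
    print("sequential: {0:.2f} s".format(sequential_time))
    print("threaded:   {0:.2f} s".format(threaded_time))
```

With four simulated requests, the sequential version waits for each delay in turn, while the threaded version waits for them at roughly the same time. Talking to several serial devices would likely behave similarly, as long as most of the time is spent waiting on the devices.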

That's not the case here; calculating primes is CPU-bound.
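As a rough illustration of the CPU-bound case, the sketch below counts primes up to 10000 both in a plain loop and with a thread pool. The function names (`is_prime`, `count_primes`, `count_primes_threaded`) and the way the work is split are assumptions made for this example, not the asker's code. In standard CPython builds, the global interpreter lock allows only one thread to execute Python bytecode at a time, so the threaded version should not be expected to run faster; it mostly adds scheduling overhead.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def is_prime(n):
    """Trial division up to the integer square root of n."""
    if n < 2:
        return False
    for i in range(2, math.isqrt(n) + 1):
        if n % i == 0:
            return False
    return True


def count_primes(numbers):
    # Pure computation: no waiting on input or output anywhere.
    return sum(1 for n in numbers if is_prime(n))


def count_primes_threaded(limit, workers=4):
    # Split 1..limit into interleaved slices, one per worker thread.
    chunks = [range(1 + k, limit + 1, workers) for k in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return sum(executor.map(count_primes, chunks))


if __name__ == "__main__":
    start = time.perf_counter()
    in_loop = count_primes(range(1, 10001))
    loop_time = time.perf_counter() - start

    start = time.perf_counter()
    in_threads = count_primes_threaded(10000)
    thread_time = time.perf_counter() - start

    print("loop:    {0} primes in {1:.4f} s".format(in_loop, loop_time))
    print("threads: {0} primes in {1:.4f} s".format(in_threads, thread_time))
```

Both versions return the same count; only the timing differs, and the exact figures depend on the machine. For CPU-bound work like this, process-based parallelism (for example the standard `multiprocessing` module) is the usual alternative to threads.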
