How to return values from Process or Thread instances?


Question

I want to run a function that searches for information either on the web or directly in my own MySQL database. The first process is time-consuming, the second relatively fast.

With this in mind, I create a process that starts this compound search (find_compound_view). If the process finishes relatively fast, the compound is already in the database, so I can render the results immediately. Otherwise, I render "drax_retrieving_data.html".

The stupid solution I came up with was to run the function twice: once to check whether the process takes a long time, and once more to actually get the function's return value. This is pretty much because I don't know how to return the values of my find_compound_view function. I've tried googling, but I can't seem to find how to return values from the Process class specifically.

p = Process(target=find_compound_view, args=(form,))
p.start()
is_running = p.is_alive()
start_time = time.time()
while is_running:
    time.sleep(0.05)
    is_running = p.is_alive()
    if time.time() - start_time > 10:
        print('Timer exceeded, DRAX is retrieving info!', time.time() - start_time)
        return render(request, 'drax_internal_dbs/drax_retrieving_data.html')
compound = find_compound_view(form, use_email=False)

if compound:
    data=*****
    return render(request, 'drax_internal_dbs/result.html', data)

Recommended answer

You will need a multiprocessing.Pipe or a multiprocessing.Queue to send the results back to your parent process. If you only do I/O, you should use a Thread instead of a Process, since it's more lightweight and most of the time will be spent waiting. I'm showing you how it's done for processes and threads in general.

Process & Queue

The multiprocessing queue is built on top of a pipe, and access is synchronized with locks/semaphores. Queues are thread- and process-safe, meaning you can use one queue for multiple producer/consumer processes and even multiple threads in these processes. Adding the first item to the queue will also start a feeder thread in the calling process. The additional overhead of a multiprocessing.Queue makes using a pipe preferable and more performant for single-producer/single-consumer scenarios.

Here's how to send and retrieve a result with a multiprocessing.Queue:

from multiprocessing import Process, Queue

SENTINEL = 'SENTINEL'

def sim_busy(out_queue, x):
    for _ in range(int(x)):
        assert 1 == 1
    result = x
    out_queue.put(result)
    # If all results are enqueued, send a sentinel-value to let the parent know
    # no more results will come.
    out_queue.put(SENTINEL)


if __name__ == '__main__':

    out_queue = Queue()

    p = Process(target=sim_busy, args=(out_queue, 150e6))  # 150e6 == 150000000.0
    p.start()

    for result in iter(out_queue.get, SENTINEL):  # sentinel breaks the loop
        print(result)

The queue is passed as an argument into the function, results are .put() on the queue, and the parent .get()s them from the queue. .get() is a blocking call: execution does not resume until there is something to get (specifying a timeout parameter is possible). Note that the work sim_busy does here is CPU-intensive; that's when you would choose processes over threads.
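
The timeout parameter mentioned above is what lets the parent stop waiting after a fixed interval, which is the situation described in the question. The following is a minimal sketch, not the asker's actual code: slow_lookup is a hypothetical stand-in for find_compound_view, get_or_none is an illustrative helper name, and the 2-second limit is arbitrary (the question used 10 seconds). When the timeout expires, multiprocessing.Queue.get raises queue.Empty.

```python
import queue  # multiprocessing.Queue.get raises queue.Empty on timeout
import time
from multiprocessing import Process, Queue

TIMEOUT = 2.0  # seconds; arbitrary value chosen for this sketch


def slow_lookup(out_queue, delay):
    # Hypothetical stand-in for the asker's find_compound_view.
    time.sleep(delay)
    out_queue.put({'compound': 'example'})


def get_or_none(out_queue, timeout):
    """Return the next item, or None if nothing arrives within `timeout` seconds."""
    try:
        return out_queue.get(timeout=timeout)
    except queue.Empty:
        return None


if __name__ == '__main__':
    out_queue = Queue()
    p = Process(target=slow_lookup, args=(out_queue, 0.2))
    p.start()

    result = get_or_none(out_queue, TIMEOUT)
    if result is None:
        print('Still working, show the "retrieving data" page.')
    else:
        print('Got result:', result)
    p.join()
```

With this shape the function only runs once: either the result arrives in time and can be rendered, or the parent moves on while the child keeps working.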

Process & Pipe

For one-to-one connections a pipe is enough. The setup is nearly identical; only the methods are named differently, and a call to Pipe() returns two connection objects. In duplex mode, both objects are read-write ends; with duplex=False (simplex), the first connection object is the read end of the pipe and the second is the write end. In this basic scenario we just need a simplex pipe:

from multiprocessing import Process, Pipe

SENTINEL = 'SENTINEL'


def sim_busy(write_conn, x):
    for _ in range(int(x)):
        assert 1 == 1
    result = x
    write_conn.send(result)
    # If all results are sent, send a sentinel-value to let the parent know
    # no more results will come.
    write_conn.send(SENTINEL)


if __name__ == '__main__':

    # duplex=False because we just need one-way communication in this case.
    read_conn, write_conn = Pipe(duplex=False)

    p = Process(target=sim_busy, args=(write_conn, 150e6))  # 150e6 == 150000000.0
    p.start()

    for result in iter(read_conn.recv, SENTINEL):  # sentinel breaks the loop
        print(result)
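
recv() on a connection blocks just like Queue.get(). If the parent should give up after a while, the read end's poll(timeout) can be checked first. The sketch below is illustrative only: recv_or_none is a hypothetical helper name, and for brevity both ends of the pipe are used inside one process. With a child process, write_conn would be passed to it as in the example above.

```python
from multiprocessing import Pipe


def recv_or_none(read_conn, timeout):
    """Return the next object from the pipe, or None if none arrives in time."""
    # poll() blocks for at most `timeout` seconds and reports whether
    # data is ready to be read.
    if read_conn.poll(timeout):
        return read_conn.recv()
    return None


if __name__ == '__main__':
    # duplex=False: first object is the read end, second is the write end.
    read_conn, write_conn = Pipe(duplex=False)

    print(recv_or_none(read_conn, 0.1))  # None, nothing has been sent yet
    write_conn.send('result')
    print(recv_or_none(read_conn, 0.1))  # result
```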


Thread & Queue

For use with threading, you want to switch to queue.Queue. queue.Queue is built on top of a collections.deque, adding some locks to make it thread-safe. Unlike with multiprocessing's queue and pipe, objects put on a queue.Queue won't get pickled. Since threads share the same memory address space, serialization for memory copying is unnecessary; only pointers are transmitted.

from threading import Thread
from queue import Queue
import time

SENTINEL = 'SENTINEL'


def sim_io(out_queue, query):
    time.sleep(1)
    result = query + '_result'
    out_queue.put(result)
    # If all results are enqueued, send a sentinel-value to let the parent know
    # no more results will come.
    out_queue.put(SENTINEL)


if __name__ == '__main__':

    out_queue = Queue()

    p = Thread(target=sim_io, args=(out_queue, 'my_query'))
    p.start()

    for result in iter(out_queue.get, SENTINEL):  # sentinel-value breaks the loop
        print(result)
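
The point that queue.Queue hands over references rather than pickled copies can be checked directly. In this small sketch (worker and roundtrip are illustrative names, not part of the original answer), the object the parent gets back from the queue is the very same object it passed in, including the change made by the thread.

```python
from queue import Queue
from threading import Thread


def worker(out_queue, payload):
    # The thread mutates and enqueues the very same object; no copy is made.
    payload.append('seen by worker')
    out_queue.put(payload)


def roundtrip(payload):
    """Send `payload` through a thread and a queue.Queue, return what arrives."""
    out_queue = Queue()
    t = Thread(target=worker, args=(out_queue, payload))
    t.start()
    received = out_queue.get()
    t.join()
    return received


if __name__ == '__main__':
    original = ['my_query']
    received = roundtrip(original)
    print(received is original)  # True: same object, not a pickled copy
    print(original)              # ['my_query', 'seen by worker']
```

With a multiprocessing.Queue the same round trip would deliver an equal but distinct object, because the data is pickled in one process and unpickled in the other.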


  • Read here why for result in iter(out_queue.get, SENTINEL): should be preferred over a while True...break setup, where possible.
  • Read here why you should use if __name__ == '__main__': in all your scripts, and especially in multiprocessing.
  • More about get() usage here.
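
To make the first note concrete, here is a short side-by-side sketch of the two loop styles (the function names are illustrative). The two-argument form iter(callable, sentinel) keeps calling the callable and stops as soon as a returned value compares equal to the sentinel, so both functions collect the same items.

```python
from queue import Queue

SENTINEL = 'SENTINEL'


def drain_with_iter(q):
    # iter(q.get, SENTINEL) calls q.get() repeatedly until it returns SENTINEL.
    return [item for item in iter(q.get, SENTINEL)]


def drain_with_while(q):
    # Equivalent explicit loop with a manual sentinel check.
    items = []
    while True:
        item = q.get()
        if item == SENTINEL:
            break
        items.append(item)
    return items


if __name__ == '__main__':
    for drain in (drain_with_iter, drain_with_while):
        q = Queue()
        for value in ('a', 'b', SENTINEL):
            q.put(value)
        print(drain.__name__, drain(q))
```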
