Passing variables between two python processes


Problem description

I intend to build a program structure like the one described below.

PS1 is a Python program that runs persistently. PC1, PC2, and PC3 are client Python programs. PS1 holds a hashtable variable; whenever PC1, PC2, ... asks for the hashtable, PS1 passes it to them.

The intention is to keep the table in memory, since it is a huge variable (it takes 10G of memory) and it is expensive to calculate every time. Storing it on the hard disk (using pickle or json) and reading it every time it is needed is not feasible; the read just takes too long.

So I was wondering if there is a way to keep a Python variable persistently in memory, so it can be accessed very quickly whenever it is needed.
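
For reference, here is a minimal sketch of the structure described above, using the standard library's multiprocessing.managers.BaseManager so that PS1 keeps the table in memory and the clients query it over a connection. The address, authkey, and build_table() helper are illustrative assumptions, not details from the question:

    # ps1.py - the persistently running server (a sketch, not the asker's code)
    from multiprocessing.managers import BaseManager

    def build_table():
        # Stand-in for the expensive computation that produces the hashtable.
        return {"key": "value"}

    table = build_table()  # computed once, then kept in PS1's memory

    class TableManager(BaseManager):
        pass

    # Register a callable whose result is served to clients through a proxy.
    TableManager.register("get_table", callable=lambda: table)

    if __name__ == "__main__":
        manager = TableManager(address=("127.0.0.1", 50000), authkey=b"secret")
        server = manager.get_server()
        server.serve_forever()  # PS1 runs until killed

A client such as PC1 connects and receives a proxy, so individual lookups travel over the connection instead of copying the whole 10G object:

    # pc1.py - a client (same illustrative address and authkey)
    from multiprocessing.managers import BaseManager

    class TableManager(BaseManager):
        pass

    TableManager.register("get_table")

    if __name__ == "__main__":
        manager = TableManager(address=("127.0.0.1", 50000), authkey=b"secret")
        manager.connect()
        table = manager.get_table()   # proxy to the dict living in PS1
        print(table.get("key"))       # each lookup is a small request to PS1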

Recommended answer

You are trying to reinvent a square wheel when nice round wheels already exist!

Let's go one level up to how you have described your needs:

  • a large dataset that is expensive to build
  • different processes that need to use the dataset
  • performance concerns that forbid simply reading the full set from permanent storage

IMHO, this is exactly what databases were created for. For common use cases, having many processes each hold their own copy of a 10G object is a waste of memory; the common approach is for one single process to own the data and for the others to send requests for it. You did not describe your problem in enough detail, so I cannot say whether the best solution will be:

  • a SQL database like PostgreSQL or MariaDB - since they can cache, and if you have enough memory, everything will automatically be held in memory
  • a NoSQL database (MongoDB, etc.) if your only (or main) need is single-key access - very nice when dealing with lots of data requiring fast but simple access
  • a dedicated server using a dedicated query language, if your needs are very specific and none of the above solutions meets them
  • a process setting up a huge piece of shared memory to be used by the client processes (see the sketch after this list) - that last solution will certainly be the fastest, provided:
    • all clients make read-only accesses - it can be extended to r/w access, but that could lead to a synchronization nightmare
    • you are sure to have enough memory on your system never to use swap - if you do, you will lose all the cache optimizations that real databases implement
    • the size of the database, the number of client processes, and the external load of the whole system never grow to a level where you fall into the swapping problem above
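
As an illustration of that last option, here is a minimal sketch using the standard library's multiprocessing.shared_memory (Python 3.8+). A Python dict cannot live directly in shared memory, so this assumes the data can be laid out flat (a NumPy array stands in for the real table); the block name "big_table" is illustrative:

    # A sketch only: the owner and the client would normally be separate processes.
    import numpy as np
    from multiprocessing import shared_memory

    # --- owner process: compute once, publish in shared memory ---
    data = np.arange(10, dtype=np.int64)  # stand-in for the expensive result
    shm = shared_memory.SharedMemory(name="big_table", create=True,
                                     size=data.nbytes)
    shared = np.ndarray(data.shape, dtype=data.dtype, buffer=shm.buf)
    shared[:] = data  # one-time copy into the shared block

    # --- client process: attach by name, read without a second copy ---
    client = shared_memory.SharedMemory(name="big_table")
    view = np.ndarray((10,), dtype=np.int64, buffer=client.buf)
    print(view[3])  # read-only access, as recommended above

    # Release the NumPy views before closing, then let the owner unlink.
    del view
    client.close()
    del shared
    shm.close()
    shm.unlink()

Note that the clients only ever read the block; any write access would need the synchronization discussed above.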

TL/DR: My advice is to experiment with the performance of a good-quality database, optionally with a dedicated cache. Those solutions allow almost out-of-the-box load balancing across different machines. Only if that does not work should you carefully analyze the memory requirements, use shared memory, and be sure to document the limits on the number of client processes and on the database size for future maintenance - read-only data being a hint that shared memory can be a nice solution.

