To fork or not to fork?
Question
I am redeveloping a system that sends messages via HTTP to one of a number of suppliers. The original consists of Perl scripts, and the redevelopment will probably also use Perl.
In the old system, a number of Perl scripts ran at the same time, five for each supplier. When a message was put into the database, it was assigned a supplier and a random thread number (1-5), which ensured that no message was processed twice without having to lock the table or row. There was also a "Fair Queue Position" field in the database to ensure that a large message send did not delay small sends that arrived while the large one was being sent.
At some times there would be just a couple of messages per minute, but at other times there would be a dump of potentially hundreds of thousands of messages. Having all the scripts running and checking for messages all of the time seems like a waste of resources, so I am trying to work out whether there is a better way to do it or whether the old way is acceptable.
My current thinking is to have one script that runs and forks as many child processes as are needed (up to a limit), depending on how much traffic there is. However, I am not sure how best to implement it so that each message is processed only once while fair queuing is maintained.
My best guess right now is that the parent script updates the DB to indicate which child process should handle each message, but I am concerned that this will end up being less efficient than the original method. I have little experience of writing forking code (the last time I did it was about 15 years ago).
Any thoughts, or links to guides on how best to process message queues, would be appreciated!
Recommended answer
You could use Thread::Queue, or any of the alternatives listed in the question "Is there a multiprocessing module for Perl?"
If the old system was written in Perl, this approach lets you reuse most of it.
Untested sketch (the database connection and get_items_from_db are placeholders):
use strict;
use warnings;
use threads;
use Thread::Queue;

my $q = Thread::Queue->new();    # A new empty queue

# Worker threads: each one blocks in dequeue() until an item is available
for (1 .. 10) {
    threads->create(sub {
        while (defined(my $item = $q->dequeue())) {
            # Do work on $item
        }
    })->detach();
}

my $dbh = ...;    # database connection, left elided as in the original

while (1) {
    # Get items from the DB (get_items_from_db is a placeholder to be supplied)
    my @items = get_items_from_db($dbh);
    # Send work to the threads
    $q->enqueue(@items) if @items;
    print "Pending items: " . $q->pending() . "\n";
    sleep 15;    # Check the DB every 15 seconds
}
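If you would rather fork than use threads, the same idea (one parent reads the queue and hands out work) can be sketched with core Perl only. This is a minimal illustration, not a drop-in implementation: `send_message` and `process_batch` are hypothetical names, and the items are assumed to have been fetched from the database by the parent already. Because only the parent assigns work, no message goes to two children.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real HTTP send to a supplier.
sub send_message {
    my ($msg) = @_;
    return "sent:$msg";
}

# Fork at most $max_children workers for one batch of items.
# Returns the number of children that exited cleanly.
sub process_batch {
    my ($items, $max_children) = @_;
    return 0 unless @$items;

    # Never fork more children than there are items.
    my $children = @$items < $max_children ? scalar @$items : $max_children;

    my @pids;
    for my $n (0 .. $children - 1) {
        # Child $n takes items n, n+children, n+2*children, ...
        my @mine = @{$items}[ grep { $_ % $children == $n } 0 .. $#$items ];

        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # In the child: process only the assigned slice, then exit.
            send_message($_) for @mine;
            exit 0;
        }
        push @pids, $pid;
    }

    # In the parent: reap every child and count clean exits.
    my $ok = 0;
    for my $pid (@pids) {
        waitpid($pid, 0);
        $ok++ if $? == 0;
    }
    return $ok;
}
```

Calling `process_batch(\@items, 5)` inside the polling loop forks children only while there is work, so nothing sits idle during quiet periods. If the children need database access, it is generally safer for each child to open its own connection than to share the parent's handle across the fork.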