How to pass a variable from a child process (forked by Parallel::ForkManager)?


Problem description


My query:

In the following code I tried to pass $commandoutput[0] into the subroutine shown further down. I tried using shift to pass it, but that failed. Can you please show me the right way to do this?

Code:

my $max_forks = 4;

#createThreads();
my %commandData;
my @arr = (
   'bhappy',  'bload -m all -l -res CPUSTEAL',
   'bqueues', 'bjobs -u all -l -hfreq 101'
);

#print @arr;
my $fork = new Parallel::ForkManager($max_forks);
$fork->run_on_start(
   sub {
      my $pid = shift;
   }
);
$fork->run_on_finish(
   sub {
      my ( $pid, $exit, $ident, $signal, $core ) = @_;
      if ($core) {
         print "PID $pid core dumped.\n";
      }
      else { }
   }
);
my @commandoutput;
my $commandposition = 0;
for my $command (@arr) {
   $fork->start and next;
   my @var = split( " ", $command );
   $commandoutput[$commandposition] = `$command`;
   $commandposition++;
   $line = $commandoutput[0];

# print $line;
   $fork->finish;
}
$fork->wait_all_children;

#print Dumper(\%commandData);
print $commandoutput[0];

Here I tried to store $commandoutput[0] in a variable inside the subroutine. I got stuck on how to pass variables from outside the subroutine to inside it.

sub gen_help_data
{
  my $lines=shift;
  print $lines;
}

Solution

The code between start and finish runs in a separate process, and the child and parent cannot write to each other's variables (even ones with the same name). Forking creates an independent process with its own memory and data. To pass data between these processes we need to use an "Inter-Process Communication" (IPC) mechanism.
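To make the separation concrete, here is a minimal sketch that uses only core Perl (plain fork and pipe, not Parallel::ForkManager). The variable names are made up for illustration. The child assigns to a variable it shares by name with the parent, and the parent only learns the new value because the child also writes it to a pipe.

```perl
use strict;
use warnings;

my $value = 'set by parent';

# A pipe gives the child a channel back to the parent.
pipe(my $reader, my $writer) or die "pipe failed: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: this assignment changes only the child's own copy.
    close $reader;
    $value = 'set by child';
    print {$writer} "$value\n";
    close $writer;
    exit 0;
}

# Parent: read what the child sent, then reap the child.
close $writer;
my $from_child = <$reader>;
close $reader;
waitpid($pid, 0);
chomp $from_child;

print "parent's own variable: $value\n";        # still 'set by parent'
print "received over the pipe: $from_child\n";  # 'set by child'
```

A pipe is only one of several IPC options; it is used here because it needs no extra modules.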

This module does provide a ready and simple way to pass data back from a child to the parent. See Retrieving data structures from child processes in docs.

You first need to supply to finish a reference to the data structure that the child wants to return. In your case, you want to return the scalar $commandoutput[0], so pass a reference to it:

$fork->finish(0, \$commandoutput[0]);

This reference is then found in the callback as the last, sixth, parameter. The one your code left out. So in the callback you need

my %ret_data;  # to store data from different child processes

$fork->run_on_finish( 
    sub { 
        my ($pid, $exit, $ident, $signal, $core, $dataref) = @_; 
        $ret_data{$pid} = $dataref;
    }
);

Here $dataref is a reference to the child's $commandoutput[0], and it is stored in %ret_data as the value for the key that is the process ID. So once the loop has completed and wait_all_children has returned, you can find all the data in %ret_data

foreach my $pid (keys %ret_data) {
    say "Data from $pid => ${$ret_data{$pid}}";
}

Here we dereference $ret_data{$pid} as a scalar reference, since your code returns that.
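Putting these pieces together for the question's loop, a sketch might look like the following. The echo commands are stand-ins (an assumption) for the LSF commands in the question, and %output_for is a hypothetical name. Each command string is passed to start as the process identifier, so the callback can key the result by command rather than by PID, and the parent then hands each result to gen_help_data.

```perl
use strict;
use warnings;
use Parallel::ForkManager;

# Stand-in commands (assumption); the question runs LSF tools such as bqueues.
my @commands = ('echo first', 'echo second', 'echo third');

my $pm = Parallel::ForkManager->new(4);

my %output_for;    # command => its captured output, filled in by the parent

$pm->run_on_finish(sub {
    my ($pid, $exit, $ident, $signal, $core, $dataref) = @_;
    # $dataref is undef if the child sent nothing back
    $output_for{$ident} = $$dataref if defined $dataref;
});

for my $command (@commands) {
    $pm->start($command) and next;   # parent: move on to the next command
    my $out = `$command`;            # child: run the command, capture output
    $pm->finish(0, \$out);           # child: send a scalar reference back
}
$pm->wait_all_children;

# Back in the parent, pass each result into the subroutine.
gen_help_data($output_for{$_}) for @commands;

sub gen_help_data {
    my $lines = shift;
    print $lines;
}
```

Keying by the identifier instead of the PID keeps each output associated with the command that produced it, which is usually what the parent needs afterwards.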

Note that the module passes the data by serializing it to temporary files (with Storable by default), and that can be slow if there is a lot of data or many children.


Here is a full example, where each child returns an array reference by passing it to finish, which is then retrieved in the callback. For a different example see this post.

use warnings;
use strict;
use feature 'say';

use Parallel::ForkManager;    
my $pm = Parallel::ForkManager->new(4); 

my %ret_data;

$pm->run_on_finish( sub { 
    my ($pid, $exit, $ident, $signal, $core, $dataref) = @_; 
    $ret_data{$pid} = $dataref;
});

foreach my $i (1..8)
{
    $pm->start and next;
    my $ref = run_job($i);
    $pm->finish(0, $ref);
}
$pm->wait_all_children;

foreach my $pid (keys %ret_data) {
    say "$pid returned: @{$ret_data{$pid}}";
}

sub run_job { 
    my ($i) = @_;
    return [ 1..$i ];  # make up return data: arrayref with list 1..$i
}

This prints (the PIDs and the order of lines will vary between runs)

15037 returned: 1 2 3 4 5 6 7
15031 returned: 1 2
15033 returned: 1 2 3 4
15036 returned: 1 2 3 4 5 6
15035 returned: 1 2 3 4 5
15038 returned: 1 2 3 4 5 6 7 8
15032 returned: 1 2 3
15030 returned: 1


On modern systems as little data as possible is copied when a new process is forked, for performance reasons (copy-on-write). So the variables that a child "inherits" by forking aren't copied up front, and the child does in fact read the parent's data as it existed at the time of the fork.

However, any data that a child writes in memory is inaccessible to the parent (and what the parent writes after forking is unknown to the child). If the child writes to a variable "inherited" from the parent, a copy is made at that point so that the child's new data is independent.

There are certainly subtleties and complexities in how data is managed, with apparently a number of pointers maintained even as data changes in the child. I'd guess that this is mostly to simplify data management, and to reduce copying; there appears to be far finer granularity in data management than at a "variable" level.

But these are implementation details and in general child and parent can't poke at each other's data.
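The observable behaviour can be checked with a small core-Perl sketch (the names here are illustrative only). The child reads an array that existed before the fork, then appends to it. The child reports what it saw through a pipe, and the parent's array is unchanged afterwards.

```perl
use strict;
use warnings;

my @data = (1, 2, 3);   # exists before the fork, so the child inherits it

pipe(my $reader, my $writer) or die "pipe failed: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    close $reader;
    my $seen = "@data";           # reading inherited data works
    push @data, 'child-only';     # writing affects only the child's memory
    print {$writer} "$seen\n", scalar(@data), "\n";
    close $writer;
    exit 0;
}

# Parent: collect the child's report and reap the child.
close $writer;
my @lines = <$reader>;
close $reader;
waitpid($pid, 0);
chomp @lines;
my ($seen_by_child, $child_count) = @lines;

print "child saw:    $seen_by_child\n";         # 1 2 3
print "child count:  $child_count\n";           # 4
print "parent count: ", scalar(@data), "\n";    # 3
```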
