Parallel MATLAB and logging

Question

I am running an experiment distributed over several computers using the Parallel Computing Toolbox. I want to produce a log of the experiment's progress (and of any errors that occur) and save it to a file while the processes are running. What is the standard way to do this?

  1. My problem is embarrassingly parallel.
  2. I need only one log file for all workers (I have a network drive accessible from all the computers).

My main concern is having a file opened for append by several workers. Do I risk losing messages, or having an error opening the file?

Recommended answer

When multiple processes output to a single file, you could run into some potential problems, like messages being overwritten or intermingled. I've had this happen with programs in other languages (like C), and I assume the same problem could arise in MATLAB, but I freely admit I could be wrong about this. Assuming I'm not wrong...

If you want to reliably output data from multiple worker processes to a single log file while the processes are running, one way to do this is to make one process be responsible for all the file operations (i.e. a "master" process). The "master" process would collect messages from the other workers (i.e. "slaves") and output this data to the log file.

Since I don't know what specifically you are having each process do, it's hard to suggest specific code changes to make. Here are some steps and sample code for how you might do this in MATLAB. These code samples assume you are running the same function (process_fcn) on each process:

  • The "master" process first has to open the file. This code (using the labindex function) should be run at the beginning of process_fcn:

if (labindex == 1),
  fid = fopen('log.txt','at');  %# Open text file for appending
end

  • While each process is running, you can collect any data that needs to be output to the log file in a variable called data, which stores a string or character array. This data could be error messages captured within a try-catch block or any other data that you would want to be in the log file.

    At periodic points in process_fcn (either when major tasks are completed or within a loop of computation), you would have to have each process check for data that needs to be output (i.e. data is not empty) and have that data sent to the "master" process. The "master" process would then collect and print these messages from other processes, along with any of its own. Here's a sample of how this might be done (using the functions labBarrier, labProbe, labSend, and labReceive):

    labBarrier;  %# All processes are synchronized here
    if (labindex == 1),  %# This is done by the "master"
      if ~isempty(data),
        fprintf(fid,'%s\n',data);  %# Print "master" data
      end
      pause(1);  %# Wait a moment for "slaves" to send messages
      while labProbe,  %# Loop while messages are available
        data = labReceive;  %# Get data from "slaves"
        fprintf(fid,'%s\n',data);
      end
    else  %# This is done by the "slaves"
      if ~isempty(data),
        labSend(data,1);  %# Send data to the "master"
      end
    end
    data = '';  %# Clear data
    

    The call to pause is there to give each "slave" process time to call labSend before the "master" starts looking for sent messages.
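The fixed one-second wait is a heuristic, so a slow worker may still send its message after the "master" has stopped probing. One possible alternative, which is not part of the original answer, is to have every worker send exactly one message per checkpoint (an empty one if it has nothing to report), so the "master" can do a blocking receive from each worker in turn and needs neither the pause nor the barrier. The function name flush_log and its arguments are hypothetical; only worker 1 uses fid, so the other workers can pass [].

```matlab
function data = flush_log(fid, data)
%FLUSH_LOG Hypothetical checkpoint routine; call it on every worker.
%   fid  - file identifier from fopen (only used when labindex == 1)
%   data - char array of pending log text for this worker
if labindex == 1
    if ~isempty(data)
        fprintf(fid, '%s\n', data);   % the "master" writes its own messages
    end
    for src = 2:numlabs
        msg = labReceive(src);        % blocks until worker src has sent
        if ~isempty(msg)
            fprintf(fid, '%s\n', msg);
        end
    end
else
    labSend(data, 1);                 % always send, even when data is empty
end
data = '';                            % cleared on every worker
end
```

Each worker would call data = flush_log(fid, data) at the same points where the probe-based block is used. In recent MATLAB releases the lab* functions are documented as superseded by spmdIndex, spmdSize, spmdSend and spmdReceive, so the names may need adjusting for your version.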

    Finally, the "master" process has to close the file. This code should be run at the end of process_fcn:

    if (labindex == 1),
      fclose(fid);
    end
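As a minimal sketch of the collection step described above (filling data from inside a try-catch block), something like the following could be used. The helper name capture_step, its arguments, and the message format are assumptions made for illustration and do not come from the original answer.

```matlab
function data = capture_step(stepFcn, data)
%CAPTURE_STEP Hypothetical helper: run one step, append any error to data.
%   stepFcn - function handle taking no arguments
%   data    - char array of pending log text for this worker
try
    stepFcn();
catch err
    % Tag the message with the worker index so the merged log stays readable
    msg = sprintf('[worker %d] %s', labindex, err.message);
    if isempty(data)
        data = msg;
    else
        data = sprintf('%s\n%s', data, msg);  % keep earlier messages
    end
end
end
```

Inside process_fcn this might be called as data = capture_step(@() some_task(x), data), where some_task stands in for whatever work each process does. The error is recorded rather than rethrown, so the process keeps running and the message reaches the log at the next checkpoint.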
    
