mclapply encounters errors depending on core id?
Problem description
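I have a set of genes for which I need to calculate some coefficients in parallel.
The coefficients are calculated inside GeneTo_GeneCoeffs_filtered, which takes a gene name as input and returns a list of 2 data frames.
With a gene_array of length 100, I ran this command with different numbers of cores (5, 6 and 7):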
Coeffslist=mclapply(gene_array,GeneTo_GeneCoeffs_filtered,mc.cores = no_cores)
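I get errors on different gene names depending on the number of cores assigned to mclapply.
The indices of the genes for which GeneTo_GeneCoeffs_filtered fails to return its list of data frames follow a pattern.
With 7 cores assigned to mclapply, they are elements 4, 11, 18, 25, ..., 95 of gene_array (every 7th). When R works with 6 cores, the indices are 2, 8, 14, ..., 98 (every 6th), and in the same way with 5 cores, every 5th.
Most importantly, the failing indices differ between these runs, which means the problem is not in a particular gene.
I suspect there might be a "broken" core that cannot run my function properly, and that only this core produces the error. Is there a way to trace back its ID and exclude it from the list of cores that R can use?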
Answer
A close reading of mclapply's manpage reveals that this behavior is by design. It arises as a result of the interaction between:
(a) "the input X is split into as many parts as there are cores (currently the values are spread across the cores sequentially, i.e. first value to core 1, second to core 2, ... (core + 1)-th value to core 1 etc.) and then one process is forked to each core and the results are collected."
(b) a "try-error" object will be returned for all the values involved in the failure, even if not all of them failed.
In your case, by virtue of (a), your gene_array is spread "round-robin" style across the cores (with a gap of mc.cores between the indices of successive elements), and by virtue of (b), if any gene_array element raises an error, you get back an error for every gene_array element sent to that core (and the indices of those elements are mc.cores apart). So a single failing gene is enough to produce the pattern you see, and no core is broken.
I refreshed my understanding of this in an exchange yesterday with Simon Urbanek: https://stat.ethz.ch/pipermail/r-sig-hpc/2019-September/002098.html in which I also provide an error-handling approach yielding errors only for the indices that generate an error.
You can also get errors only for the indices that generate an error by passing mc.preschedule=FALSE.
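To make the interaction between (a) and (b) concrete, here is a minimal sketch in R. It is not the code from the mailing-list post. toy_fun is a hypothetical stand-in for GeneTo_GeneCoeffs_filtered that fails for a single input, and the input size, the failing index and safe_fun are assumptions chosen only for illustration. It assumes a Unix-like system, since mclapply does not support mc.cores > 1 on Windows.

```r
library(parallel)

# Hypothetical stand-in for GeneTo_GeneCoeffs_filtered: fails for one input only.
toy_fun <- function(i) {
  if (i == 4) stop("bad input: ", i)
  i^2
}

x <- 1:21      # illustrative input, not the original gene_array
n_cores <- 7   # number of worker processes, independent of physical cores

# Helper: indices of results that inherit from a given class.
which_class <- function(res, cls) {
  which(vapply(res, inherits, logical(1), what = cls))
}

# Default prescheduling: every element that shared a worker with the
# failing element comes back as a "try-error".
res_pre <- suppressWarnings(mclapply(x, toy_fun, mc.cores = n_cores))
failed_pre <- which_class(res_pre, "try-error")

# mc.preschedule = FALSE: one fork per element, so only the failing
# element comes back as a "try-error".
res_nopre <- suppressWarnings(
  mclapply(x, toy_fun, mc.cores = n_cores, mc.preschedule = FALSE)
)
failed_nopre <- which_class(res_nopre, "try-error")

# Alternative (an assumption, not necessarily the approach in the linked
# post): catch the error inside the worker and return the condition object.
safe_fun <- function(i) tryCatch(toy_fun(i), error = function(e) e)
res_safe <- mclapply(x, safe_fun, mc.cores = n_cores)
failed_safe <- which_class(res_safe, "error")

print(failed_pre)
print(failed_nopre)
print(failed_safe)
```

With these inputs, failed_pre should contain 4, 11 and 18 (every 7th index starting at the failing one), while failed_nopre and failed_safe should contain only 4. Catching the error inside the worker keeps prescheduling, which is usually cheaper than forking once per element when there are many short tasks.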