mclapply: all scheduled cores encountered errors in user code
Problem description
The following is my code. I am trying to get the list of all the files (~20000) that end with .idat and read each file using the function illuminaio::readIDAT.
library(illuminaio)
library(parallel)
library(data.table)
# number of cores to use
ncores <- 8
# list all files with the .idat extension (~20000 files);
# pattern is a regular expression, so the dot is escaped and the end anchored
files <- list.files(path = './',
                    pattern = "\\.idat$",
                    full.names = TRUE)
# read one idat file and build a data.table with the filename and two more columns,
# then append it to a csv using fwrite
get.chiptype <- function(x)
{
  idat <- readIDAT(x)
  res <- data.table(filename = x, nSNPs = nrow(idat$Quants), Chip = idat$ChipType)
  # current data.table releases name this argument `file`
  fwrite(res, file = 'output.csv', append = TRUE)
}
# call get.chiptype on all 20000 files with mclapply,
# using 8 cores at a time
mclapply(files, FUN = function(x) get.chiptype(x), mc.cores = ncores)
After reading and writing info about 1200 files, I get the following message:
Warning message:
In mclapply(files, FUN = function(x) get.chiptype(x), mc.cores = ncores) :
all scheduled cores encountered errors in user code
How can I resolve this?
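The warning does not say which call failed or why. One way to surface the underlying error is to wrap the worker function in tryCatch() so that each failure comes back as a value that can be inspected afterwards. The following is a minimal sketch under stated assumptions: `risky()` is a hypothetical stand-in for get.chiptype() that fails on one input, and `safe()` and `n_workers` are illustrative names, not part of the original code.

```r
library(parallel)

# Hypothetical stand-in for get.chiptype(): fails on one particular input
risky <- function(x) {
  if (x == 3) stop("cannot read input ", x)
  x * 2
}

# Wrap each call so an error is returned as a condition object
# instead of being reported only through the mclapply() warning
safe <- function(x) tryCatch(risky(x), error = function(e) e)

# mclapply() relies on forking, which is unavailable on Windows
n_workers <- if (.Platform$OS.type == "windows") 1L else 2L

res <- mclapply(1:5, safe, mc.cores = n_workers)

# Flag the elements that came back as errors and read their messages
failed <- vapply(res, inherits, logical(1), what = "error")
which(failed)
conditionMessage(res[[which(failed)[1]]])
```

Applied to the real file list, the same pattern would show which .idat files could not be processed and the message each one produced.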
Recommended answer
Calling mclapply() in some instances requires you to specify a random number generator that allows for multiple streams of random numbers. R version 2.14.0 has an implementation of Pierre L'Ecuyer's multiple pseudo-random number generator.
Try adding the following before the mclapply() call, with a pre-specified value for 'my.seed':
set.seed( my.seed, kind = "L'Ecuyer-CMRG" );
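To show where that line sits relative to the parallel call, here is a minimal sketch. The seed value 123 and the toy task (drawing one random number per element) are assumptions for illustration and are not part of the original code. The worker count falls back to 1 on Windows, where mclapply() cannot fork.

```r
library(parallel)

# Hypothetical seed value; any fixed integer would do
my.seed <- 123

# Select the L'Ecuyer-CMRG generator and seed it before forking
set.seed(my.seed, kind = "L'Ecuyer-CMRG")

# mclapply() relies on forking, which is unavailable on Windows
n_workers <- if (.Platform$OS.type == "windows") 1L else 2L

# Toy task: each element draws one uniform random number
draws <- mclapply(1:4, function(i) runif(1), mc.cores = n_workers)

RNGkind()[1]   # name of the generator now in use
length(draws)  # one result per input element
```

In the code from the question, the set.seed() line would go just before the mclapply(files, ...) call.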