R neuralnet package too slow for millions of records


Question


I am trying to train a neural network for churn prediction with R package neuralnet. Here is the code:

data <- read.csv('C:/PredictChurn.csv')

# Min-max scale every column except the first
maxs <- apply(data, 2, max)
mins <- apply(data, 2, min)
scaled_temp <- as.data.frame(scale(data, center = mins, scale = maxs - mins))
scaled <- data
scaled[, -c(1)] <- scaled_temp[, -c(1)]

# 75/25 train/test split
index <- sample(1:nrow(data), round(0.75 * nrow(data)))
train_ <- scaled[index, ]
test_ <- scaled[-index, ]

library(neuralnet)

# Build the formula CHURNED_F ~ <all other predictors>
n <- names(train_[, -c(1)])
f <- as.formula(paste("CHURNED_F ~", paste(n[!n %in% "CHURNED_F"], collapse = " + ")))
nn <- neuralnet(f, data = train_, hidden = c(5), linear.output = FALSE)


It works as it should; however, when training with the full data set (in the range of millions of rows) it just takes too long. I know R is single-threaded by default, so I have tried researching how to parallelize the work across all the cores. Is it even possible to run this function in parallel? I have tried various packages with no success.


Has anyone been able to do this? It doesn't have to be the neuralnet package, any solution that lets me train a neural network would work.

Thanks

Answer


I have had good experiences with the package Rmpi, and it may be applicable in your case too.

library(Rmpi)


Briefly, its usage is as follows:

nproc <- 4  # could be determined automatically, e.g. with parallel::detectCores()
# Spawn one master and nproc - 1 slaves
Rmpi::mpi.spawn.Rslaves(nslaves = nproc - 1)
# Execute "func_to_be_parallelized" on multiple CPUs; the second
# variable is passed through to the function as an extra argument
my_fast_results <- Rmpi::mpi.parLapply(var1_passed_to_func,
                                       func_to_be_parallelized,
                                       var2_passed_to_func)
# Close the slaves
Rmpi::mpi.close.Rslaves(dellog = TRUE)
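As one possible way to apply this pattern to the neuralnet code in the question, the training set could be split into chunks, with one network trained per chunk on the slaves. This is a sketch, not part of the original answer: the chunk-and-ensemble approach and the `train_chunk` helper are illustrative assumptions, and it produces several smaller models rather than one network trained on all rows.

```r
library(Rmpi)

# Hypothetical helper: train one neuralnet model on a chunk of rows.
# The formula `f` and the data frame `train_` come from the question's code.
train_chunk <- function(chunk) {
  library(neuralnet)  # each slave must load the package itself
  neuralnet(f, data = chunk, hidden = c(5), linear.output = FALSE)
}

nproc <- 4
Rmpi::mpi.spawn.Rslaves(nslaves = nproc - 1)
Rmpi::mpi.bcast.Robj2slave(f)  # make the formula visible on every slave

# Split the training frame into nproc - 1 roughly equal chunks
chunks <- split(train_, rep(1:(nproc - 1), length.out = nrow(train_)))
models <- Rmpi::mpi.parLapply(chunks, train_chunk)

Rmpi::mpi.close.Rslaves(dellog = TRUE)
```

The resulting list of models could then be combined, for example by averaging their predictions on `test_`, which is a different trade-off from (and an approximation of) training a single network on the full data set.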

