How can I make R use more CPU and memory?


Problem description

No matter how intensive the R computation is, it never uses more than 25% of the CPU. I have tried setting the priority of rsession.exe to High and even Realtime, but the usage stays the same. Is there any way to increase R's CPU usage so it uses the full potential of my system, or am I misunderstanding the problem? Thanks in advance for the help.

P.S.: Below is a screenshot of the CPU usage.

Solution

Base R is single-threaded, so 25% usage is exactly what you would expect on a 4-core CPU. On a single Windows machine, it is possible to spread processing across a cluster of workers (one per core, if you like) using either the parallel package or the foreach package.
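As a quick sanity check (a sketch, not part of the original answer, assuming the machine above has 4 cores), you can derive that 25% ceiling directly from the core count that parallel detects:

```r
library(parallel)

# One R session runs on one core, so on an N-core machine a fully busy
# single-threaded session tops out at roughly 100/N percent of total
# CPU -- 25% on the 4-core machine described in the question.
n_cores <- detectCores()
ceiling_pct <- 100 / n_cores
sprintf("single-threaded ceiling: ~%.0f%% of total CPU", ceiling_pct)
```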

First of all, the parallel package (shipped with base R since 2.14.0, so there is nothing to install) provides functions based on the snow package; these functions are extensions of lapply(). The foreach package provides an extension of the for-loop construct; note that it has to be used together with a backend package such as doParallel.
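To make the lapply() analogy concrete, here is a minimal sketch (the toy slow_square function is my own illustration, not from the original answer): parLapply() takes a cluster as its first argument but otherwise mirrors lapply() and returns the same results.

```r
library(parallel)

# A deliberately slow toy function so the work is worth distributing.
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}

cl <- makeCluster(2)  # two local worker processes

res_serial   <- lapply(1:4, slow_square)         # runs on one core
res_parallel <- parLapply(cl, 1:4, slow_square)  # spread over the workers

stopCluster(cl)

# Same results either way; only the CPU usage pattern differs.
identical(res_serial, res_parallel)
```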

Below is a quick example of k-means clustering using both packages. The idea is simple: (1) fit kmeans() on each worker, (2) combine the outcomes, and (3) select the fit with the minimum tot.withinss.

library(parallel)
library(iterators)
library(foreach)
library(doParallel)

# parallel
split = detectCores()
eachStart = 25

cl = makeCluster(split)
# load MASS (for the Boston data) on each worker
init = clusterEvalQ(cl, { library(MASS); NULL })
results = parLapplyLB(cl,
                      rep(eachStart, split),
                      function(nstart) kmeans(Boston, 4, nstart = nstart))
# keep the fit with the lowest total within-cluster sum of squares
withinss = sapply(results, function(result) result$tot.withinss)
result = results[[which.min(withinss)]]
stopCluster(cl)

result$tot.withinss
#[1] 1814438

# foreach
split = detectCores()
eachStart = 25
# set up an iterator over the per-worker nstart values
iters = iter(rep(eachStart, split))
# set up a combine function: keep the result with the lower tot.withinss
comb = function(res1, res2) {
  if (res1$tot.withinss < res2$tot.withinss) res1 else res2
}

cl = makeCluster(split)
registerDoParallel(cl)
result = foreach(nstart = iters, .combine = comb, .packages = "MASS") %dopar%
  kmeans(Boston, 4, nstart = nstart)
stopCluster(cl)

result$tot.withinss
#[1] 1814438

Further details of those packages and more examples can be found in the following posts.
