Run a function inside data.table in R


Question

Hi, I have some data in data.table format in R, and I need to run a function on it.

Let's say I have a data.table called A with the columns "name", "height", and "weight".

I want to run a function, e.g. a linear regression, within the data.table, grouped by "name", and store the slope coefficient and the RMSE in the resulting table.

A[, .(beta = lm(height ~ weight)$coefficients[2],
      RMSE = as.numeric(sqrt(crossprod(lm(height ~ weight)$residuals) /
                 (length(lm(height ~ weight)$residuals) -
                    (length(coef(lm(height ~ weight))) - 1))) * 100)),
  by = .(name)]

My question: is there a way to save the lm(height ~ weight) result as an object and then access that object's data, so that data.table doesn't need to run lm() four times per group here?

This runs, but it is a bit too slow compared to using foreach and looping over "name", as I have millions of rows of data.

Thanks.

Recommended answer

By using an anonymous body, as suggested by Henrik, I was able to speed up the process!

A[, {
  model <- lm(height ~ weight)       # fit once per group and reuse the result
  BETA  <- model$coefficients[2]     # slope for weight
  RMSE  <- as.numeric(sqrt(crossprod(model$residuals) /
             (length(model$residuals) - (length(coef(model)) - 1))) * 100)
  list(BETA = BETA, RMSE = RMSE)     # each list element becomes a column
},
by = .(name)]

Apparently, the braces in j form an anonymous body (loosely speaking, a lambda): it does not need a name and works like "run once and forget". Inside this body, the lm() function is run once per group, and the result is stored in an object.
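To try this out on a small scale, here is a minimal sketch with made-up data. The table contents, the seed, and the names fits and ref are assumptions for illustration only and are not from the original question. It fits the model once per group inside the braces and cross-checks one group against a direct lm() call.

```r
library(data.table)

# Hypothetical toy data: 3 groups of 30 rows each
set.seed(42)
A <- data.table(
  name   = rep(c("a", "b", "c"), each = 30),
  weight = runif(90, min = 50, max = 100)
)
A[, height := 100 + 0.8 * weight + rnorm(.N, sd = 5)]

# The braced block is evaluated once per group; its last value is returned
fits <- A[, {
  model <- lm(height ~ weight)   # single fit, reused below
  list(BETA = coef(model)[[2]], N = length(residuals(model)))
}, by = .(name)]

# Cross-check group "a" against a fit on the subset
ref <- coef(lm(height ~ weight, data = A[name == "a"]))[[2]]
all.equal(fits[name == "a", BETA], ref)
```

If the two slopes agree, the grouped call is doing the same regression as a per-group loop would, just with one lm() call per group.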

We can then extract the required data from the model object. Finally, list() is returned as the last expression so that j turns the extracted values into columns.
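The same idea could also be written with a named helper that returns a list. This is only a sketch: fit_stats and the toy data are hypothetical, and the RMSE denominator simply copies the one used in the code above.

```r
library(data.table)

# Hypothetical helper: fits the model once and returns a named list
fit_stats <- function(y, x) {
  model <- lm(y ~ x)
  res   <- residuals(model)
  denom <- length(res) - (length(coef(model)) - 1)  # same denominator as above
  list(BETA = coef(model)[[2]],
       RMSE = sqrt(sum(res^2) / denom) * 100)
}

# Hypothetical toy data: 2 groups of 5 rows each
A <- data.table(
  name   = rep(c("a", "b"), each = 5),
  weight = c(60, 65, 70, 75, 80, 55, 62, 70, 78, 90),
  height = c(160, 166, 169, 176, 181, 150, 158, 171, 175, 190)
)

# The names of the returned list become the column names
out <- A[, fit_stats(height, weight), by = .(name)]
names(out)  # "name" "BETA" "RMSE"
```

Whether the braces or a helper reads better is a matter of taste; in both cases lm() runs once per group.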

Thanks a lot!
