How do I pass column-specific arguments to lapply in data.table .SD?

Views: 87 | Posted: 2020/4/27 5:12:29 | Tags: r, data.table, parameter-passing, lapply, mapply


Question

I have seen examples of using .SD with lapply in data.table with a simple function, as below:

DT[ , c('b', 'd', 'e') := lapply(.SD, tan), .SDcols = c('b', 'd', 'e')]

But I'm unsure how to use column-specific arguments in a multiple-argument function. For instance, I have a winsorize function that I want to apply to a subset of columns in a data.table, using column-specific percentiles, e.g.

library(DescTools)
wlevel <- list(b = list(lower = 0.01, upper = 0.99), c = list(lower = 0.02, upper = 0.95))
DT[ , c('b', 'c') := lapply(.SD, function(x)
  {winsorize(x, wlevel$zzz$lower, wlevel$zzz$upper)}), .SDcols = c('b', 'c')]

Here zzz stands for the column currently being iterated over. I have also seen threads on passing varying arguments to lapply, but not in the context of data.table with .SDcols.

Is this possible?

This is a toy example; I'm looking to generalize to an arbitrarily large number of columns. Looping is always an option, but I'm trying to see if there's a more elegant/efficient solution...

Recommended answer

How to use column-specific arguments in a multiple-argument function?

Use mapply(FUN, dat, params1, params2, ...), where each of params1, params2, ... can be a list or vector; mapply iterates over dat, params1, params2, ... in parallel, passing the i-th element of each to the i-th call of FUN.

Note that, unlike the rest of the apply/lapply/sapply family, with mapply the function argument comes first, followed by the data and parameter(s).
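To make the parallel iteration and the argument order concrete, here is a minimal base-R sketch. The clamp helper and the toy data are assumptions made up for illustration; they are not part of the original question.

```r
# Three "columns" and one pair of bounds per column, matched by position.
dat   <- list(a = c(1, 5, 10), b = c(2, 4, 6), c = c(-3, 0, 3))
lower <- c(2, 3, -1)
upper <- c(8, 5, 1)

# Hypothetical helper: clamp x to the interval [lo, hi].
clamp <- function(x, lo, hi) pmin(pmax(x, lo), hi)

# mapply: function first, then the lists/vectors to walk in parallel.
# SIMPLIFY = FALSE keeps the result as a list instead of a matrix.
res <- mapply(clamp, dat, lower, upper, SIMPLIFY = FALSE)

# lapply, by contrast, takes the data first and the function second,
# and every element receives the same extra arguments.
res_same_bounds <- lapply(dat, clamp, lo = 0, hi = 5)
```

Here res$a is clamped to [2, 8], res$b to [3, 5] and res$c to [-1, 1], whereas every element of res_same_bounds is clamped to the same [0, 5].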

In your case, something like the following (pseudo-code; you'll need to tweak it to get it to run):

Instead of your nested list wlevel <- list(b=list(lower=0.01,upper=0.99), c=list(lower=0.02,upper=0.95)), it's probably easier to unpack it into one flat list per parameter:

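w_lower <- list(b = 0.01, c = 0.02)
w_upper <- list(b = 0.99, c = 0.95)

# winsorize() stands for the asker's own function taking (x, lower, upper).
# SIMPLIFY = FALSE makes mapply return a list with one element per column,
# which is what := expects; the default would collapse it into a matrix.
DT[ , c('b', 'c') := mapply(function(x, lower, upper) winsorize(x, lower, upper),
  .SD, w_lower, w_upper, SIMPLIFY = FALSE), .SDcols = c('b', 'c')]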
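If the nested wlevel list already exists, the two flat lists could also be derived from it rather than typed out by hand. The sketch below assumes the first upper in the c entry of the question's list was meant to be lower.

```r
# Nested per-column settings, as in the question
# (assuming the c entry was meant to have a `lower` element).
wlevel <- list(b = list(lower = 0.01, upper = 0.99),
               c = list(lower = 0.02, upper = 0.95))

# Pull the same-named element out of every sub-list; names are preserved.
w_lower <- lapply(wlevel, `[[`, "lower")
w_upper <- lapply(wlevel, `[[`, "upper")
```

Both results are lists named b and c, in the same order as wlevel, so they can be handed to mapply as they are.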

We shouldn't need to use column names (your zzz) to index into the lists: mapply() simply iterates over each list as-is. The matching is by position, so the parameter lists must be in the same order as .SDcols.
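For reference, a self-contained version that runs end to end might look like the sketch below. winsorize_q is a hypothetical quantile-clamping helper written for this example (it is not the DescTools function), and the toy table DT is likewise an assumption.

```r
library(data.table)

# Hypothetical helper (an assumption, not the DescTools function):
# clamp x to its own lower/upper quantiles.
winsorize_q <- function(x, lower, upper) {
  q <- quantile(x, probs = c(lower, upper), na.rm = TRUE, names = FALSE)
  pmin(pmax(x, q[1]), q[2])
}

set.seed(1)
DT <- data.table(a = 1:100, b = rnorm(100), c = rexp(100))

cols    <- c("b", "c")
w_lower <- list(b = 0.01, c = 0.02)
w_upper <- list(b = 0.99, c = 0.95)

# Index the parameter lists by `cols` so their order matches .SDcols,
# and keep the result as a list (one element per column) for :=.
DT[, (cols) := mapply(winsorize_q, .SD, w_lower[cols], w_upper[cols],
                      SIMPLIFY = FALSE),
   .SDcols = cols]
```

Subsetting the parameter lists with cols guards against them being declared in a different order than .SDcols, since mapply pairs elements up by position rather than by name.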
