purrr map a t.test onto a split df

Views: 47 Published: 2021/6/23 19:07:30 Tags: r purrr


Question

I'm new to purrr, Hadley's promising functional programming library for R. I'm trying to take a grouped and split data frame and run a t-test on a variable. An example using a sample dataset might look like this.

mtcars %>% 
  dplyr::select(cyl, mpg) %>% 
  group_by(as.character(cyl)) %>% 
  split(.$cyl) %>% 
  map(~ t.test(.$`4`$mpg, .$`6`$mpg))

This results in the following error:

Error in var(x) : 'x' is NULL
In addition: Warning messages:
1: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
2: In mean.default(x) : argument is not numeric or logical: returning NA

Am I just misunderstanding how map works? Or is there a better way to think about this?

Recommended answer

Especially when dealing with pipes that require multiple inputs (we don't have Haskell's Arrows here), I find it easier to reason about types/signatures first, then encapsulate the logic in functions (which you can unit test), and only then write a concise chain.
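Applied to the attempt in the question, reasoning about types already points at the problem. The snippet below is a small sketch that is not part of the original answer, and it uses plain `mtcars` rather than the grouped data frame from the question: `split()` returns a named list of data frames, and `map()` calls its function once per element, so inside the lambda `.` is a single data frame with no column named `4` or `6`.

```r
library(purrr)

# split() returns a named list with one data frame per cyl value
groups <- split(mtcars, mtcars$cyl)
names(groups)

# map() hands each element to the function in turn, so `.x` (or `.`)
# is one data frame, not the whole list
element_classes <- map_chr(groups, ~ class(.x)[1])

# A data frame has no column called `4`, so `$` returns NULL;
# that NULL is what t.test() ended up receiving in the question
lookup_is_null <- map_lgl(groups, ~ is.null(.x$`4`))
```

This is consistent with the `'x' is NULL` error: the pairwise comparison has to happen on the list as a whole, not inside a per-element `map()`.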

In this case you want to compare all possible pairs of vectors, so I would set the goal of writing a function that takes a pair (i.e. a list of 2) of vectors and returns the two-sample t.test of them.

Once you've done this, you just need some glue. So the plan is:

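1. Write a function that takes a list of two vectors and performs a two-sample t-test on them.
2. Write a function/pipeline that fetches the vectors from mtcars (easy).
3. Map the above over a list of pairs.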

It's important to have this plan before writing any code. Things are somewhat obscured by the fact that R is not strongly typed, but this way you reason about "types" first and implementation second.

Step 1

t.test takes dots, so we use purrr::lift to have it take a list. Since we don't want to match on the names of the elements of the list, we use .unnamed = TRUE. We also make it extra clear that we're using t.test with an arity of 2 (though this extra step is not needed for the code to work).

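library(purrr)  # provides lift(), map() and the %>% pipe
# Fix the arity at 2, then lift so the function accepts a list of two vectors
t.test2 <- function(x, y) t.test(x, y)
liftedTT <- lift(t.test2, .unnamed = TRUE)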
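As a quick check of this step, the list-taking function can be called on a list of two numeric vectors. As far as I know, `lift()` is deprecated in recent purrr releases (since 1.0.0), so this sketch builds an equivalent wrapper with base R's `do.call()` instead; `tt_from_pair` is a hypothetical name, not something from the answer.

```r
# Two-argument t-test, as in the answer
t.test2 <- function(x, y) t.test(x, y)

# Hypothetical stand-in for lift(t.test2, .unnamed = TRUE):
# drop the names so elements are matched by position, then splat the list
tt_from_pair <- function(pair) do.call(t.test2, unname(pair))

# Named list of numeric mpg vectors, one per cyl value
mpg_by_cyl <- split(mtcars$mpg, mtcars$cyl)

res <- tt_from_pair(mpg_by_cyl[c("4", "6")])
res$estimate  # the two group means
res$p.value
```

Dropping the names matters here: without `unname()`, `do.call()` would try to match the list names `4` and `6` against the arguments `x` and `y` and fail.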

Step 2

Wrap the function we got in step 1 into a functional chain that takes a simple pair (here I use indexes; it should be easy to use the cyl factor levels instead, but I didn't have time to figure it out).

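# Note: reads the global mtcars; `pair` is a length-2 vector of group indexes
doTT <- function(pair) {
  mtcars %>%
    split(as.character(.$cyl)) %>%  # named list of data frames, one per cyl value
    map(~ .x$mpg) %>%               # keep only the mpg vector of each group
    magrittr::extract(pair) %>%     # pick the two groups (same as `[`)
    liftedTT() %>%                  # two-sample t-test on the pair
    broom::tidy()                   # one-row data frame of the results
}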

Step 3

Now that we have all our lego pieces ready, composition is trivial.

# Enumerate every pair of group indexes (one column per pair), then test each
seq_along(unique(mtcars$cyl)) %>%
  combn(2) %>%
  as.data.frame() %>%
  as.list() %>%
  map(doTT)

$V1
  estimate estimate1 estimate2 statistic      p.value parameter conf.low conf.high
1 6.920779  26.66364  19.74286  4.719059 0.0004048495  12.95598 3.751376  10.09018

$V2
  estimate estimate1 estimate2 statistic      p.value parameter conf.low conf.high
1 11.56364  26.66364      15.1  7.596664 1.641348e-06  14.96675 8.318518  14.80876

$V3
  estimate estimate1 estimate2 statistic      p.value parameter conf.low conf.high
1 4.642857  19.74286      15.1  5.291135 4.540355e-05  18.50248 2.802925  6.482789


There's quite a bit here to clean up, mainly using factor levels and preserving them in the output (and not using globals in the second function), but I think the core of what you wanted is here. The trick to not getting lost, in my experience, is to work from the inside out.
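One possible way to do that cleanup is sketched below. It is a variant built on my own assumptions rather than the answer's code: `pairwise_tt` and its arguments are hypothetical names, the data frame is passed in instead of being read from the global `mtcars`, pairs are built from group names rather than indexes, and each result is labelled with the pair it compares.

```r
library(purrr)

# Hypothetical helper: two-sample t-tests for every pair of groups in `data`
pairwise_tt <- function(data, group, value) {
  # Named list of numeric vectors, one per group level
  vectors <- split(data[[value]], as.character(data[[group]]))

  # All pairs of level names, e.g. c("4", "6"), kept as a list
  pairs <- combn(names(vectors), 2, simplify = FALSE)
  pairs <- set_names(pairs, map_chr(pairs, ~ paste(.x, collapse = " vs ")))

  # One htest object per pair, named after the levels being compared
  map(pairs, ~ t.test(vectors[[.x[1]]], vectors[[.x[2]]]))
}

results <- pairwise_tt(mtcars, "cyl", "mpg")
names(results)
map_dbl(results, "p.value")
```

If a tidy data frame per comparison is preferred, `map(results, broom::tidy)` should give one row per pair, much like the output shown earlier.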
