Perform multiple paired t-tests based on groups/categories


Question

I am stuck trying to run t.test for multiple categories in RStudio. I want the t.test result for each product type, comparing online and offline prices. I have over 800 product types, so I don't want to do this manually for each product group.

I have a data frame (more than 2 million rows) named data that looks like this:

> Product_type   Price_Online   Price_Offline   
1   A            48             37
2   B            29             22
3   B            32             40
4   A            38             36
5   C            32             27
6   C            31             35
7   C            28             24
8   A            47             42
9   C            40             36

Ideally, I want R to write the t.test results into another data frame called product_types, turning this:

    > Product_type   
    1   A           
    2   B            
    3   C          
    4   D          
    5   E         
    6   F            
    7   G            
    8   H            
    9   I            
   800 ...

into this:

> Product_type   t         df       p-value   interval    mean of difference            
    1   A           
    2   B            
    3   C          
    4   D          
    5   E         
    6   F            
    7   G            
    8   H            
    9   I            
   800 ...

This is the call I would use if each product type were in its own data frame:

t.test(Product_A$Price_Online, Product_A$Price_Offline, mu=0, alt="two.sided", paired = TRUE, conf.level = 0.99)

There must be an easier way to do this. Otherwise I would need to create 800+ data frames and then run the t-test 800 times.

I tried a few approaches with lists and lapply, but so far none of them works. I also tried the approach from this post on t-tests across multiple columns: https://sebastiansauer.github.io/multiple-t-tests-with-dplyr/

However, in the end the author still enters the groups (male and female) manually, which is not practical for my 800+ categories.

Recommended answer

One way is to use by:

# Run one paired t-test per product type; the column is Price_Offline (capital O)
result <- by(data, data$Product_type, 
    function(x) t.test(x$Price_Online, x$Price_Offline, mu=0, alt="two.sided", paired = TRUE, conf.level = 0.99))

The only drawback is that by returns a list, so if you want the results in a data frame, you have to convert it:

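# nrow = number of fields in one flattened t.test result (varies by R version)
df <- data.frame(t(matrix(unlist(result), nrow = length(unlist(result[[1]])))))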

You'll then have to add the column names and the product type manually:

names(df) <- names(unlist(result[[1]]))  # name the test fields first, one name per flattened value
df$Product_type <- names(result)
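
Because unlist mixes numbers and strings, every column produced by the conversion above ends up as character. As an alternative, here is a minimal sketch that builds the data frame directly with split and lapply (a different technique from by, swapped in here for illustration). The helper name paired_summary and the output column names are assumptions, and the sample data is just the nine rows shown in the question. Note that t.test stops with an error for a group with fewer than two pairs or with constant differences, so real data may need such groups filtered out first.

```r
# Sample rows copied from the question
data <- data.frame(
  Product_type  = c("A", "B", "B", "A", "C", "C", "C", "A", "C"),
  Price_Online  = c(48, 29, 32, 38, 32, 31, 28, 47, 40),
  Price_Offline = c(37, 22, 40, 36, 27, 35, 24, 42, 36)
)

# Hypothetical helper: run one paired t-test and keep only the numeric fields
paired_summary <- function(x) {
  tt <- t.test(x$Price_Online, x$Price_Offline,
               mu = 0, alternative = "two.sided",
               paired = TRUE, conf.level = 0.99)
  data.frame(
    t         = unname(tt$statistic),   # t statistic
    df        = unname(tt$parameter),   # degrees of freedom
    p_value   = tt$p.value,
    conf_low  = tt$conf.int[1],         # lower bound of the 99% interval
    conf_high = tt$conf.int[2],         # upper bound of the 99% interval
    mean_diff = unname(tt$estimate)     # mean of Online - Offline
  )
}

# One data frame per product type, one summary row per data frame
groups <- split(data, data$Product_type)
product_types <- do.call(rbind, lapply(groups, paired_summary))

# Put the group label in front and drop the row names
product_types <- cbind(Product_type = names(groups), product_types)
rownames(product_types) <- NULL
print(product_types)
```

The result has one row per product type and keeps t, df, the p-value, and the interval bounds as numeric columns, so they can be sorted or filtered without further conversion.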
