paired t-test crashes apply-loop (edited)

Views: 95 Posted: 2020/6/18 19:35:05 Tags: r hypothesis-test

Problem description

In response to the helpful comments, I have edited the original question (where I had assumed that a for-loop and an apply-loop give different results).

I am using R to run a large number of 2-group t-tests, using input from a delimited table. Following recommendations from here and elsewhere, I tried both 'for-loops' and 'apply' to accomplish that. For a 'normal' t.test, both work nicely and give the same results. However, for a paired t-test, the for-loop appears to work while the apply-loop does not. Later, I found out that both loops suffer from the same problem (see below), but the for-loop deals more gracefully with the situation (only one cycle of the loop returns an invalid result) while the apply-loop fails altogether.

My input file looks like this: (the first line is a header line, the data lines have a name, 4 datapoints for group 1 and 4 datapoints for group 2):

header g1.1 g1.2 g1.3 g1.4 g2.1 g2.2 g2.3 g2.4
name1  0    0.5  -0.2 -0.2 -0.1 0.4 -0.3 -0.3
name2  23.2 24.4 24.5 27.2 15.5 16.5 17.7 20.0
name3  .....

and so on (overall ~50000 lines). The first data line (starting with name1) turned out to be the culprit.

This is the for-loop version that works better (it fails on the problematic line but correctly deals with all other lines):

table <- read.table('ttest_in.txt', header=TRUE, sep='\t')
for (i in 1:nrow(table)) {
   # columns 2-5 hold group 1, columns 6-9 hold group 2
   g1 <- as.numeric(table[i, 2:5])
   g2 <- as.numeric(table[i, 6:9])
   pv <- t.test(g1, g2, paired=TRUE)$p.value
}

This is the 'apply' version that causes problems:

table <- read.table('ttest_in.txt', header=TRUE, sep='\t')
pv.list <- apply(table[, 2:9], 1, function(x) {
   t.test(x[1:4], x[5:8], paired=TRUE)$p.value
})

One of the ~50000 data lines is problematic in that the differences of all pairwise comparisons are identical, which in a paired t-test results in an undefined p-value (essentially zero). The apply-loop crashes with the error 'data are essentially constant'. To me (as an R newbie) it does not seem to be a good idea to crash the entire script just because t.test doesn't like one piece of data. In the for-loop, this data line also results in an error message, but the loop continues and all the other t-tests give correct results.

Did I do something fundamentally wrong? This behaviour essentially prohibits the use of apply-loops for this kind of batch analysis. Or is there a standard way to circumvent this problem? Why doesn't the t-test just return something invalid for that particular p-value instead of bailing out?

Solution

In situations like this, I catch all the warnings and errors and investigate them afterwards, as shown here: How do I save warnings and errors as output from a function?
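
As a rough illustration of that idea, here is a minimal sketch of a wrapper that returns the value together with any warning or error messages. The helper name catch_conditions is hypothetical (it is not from the linked answer), and the sample row is the name1 line from the question.

```r
# Hypothetical helper: evaluate an expression and return its value
# together with any warning and error messages it produced.
catch_conditions <- function(expr) {
  warn <- NULL
  err <- NULL
  value <- withCallingHandlers(
    tryCatch(expr, error = function(e) {
      err <<- conditionMessage(e)   # remember the error message
      NULL                          # value is NULL when an error occurred
    }),
    warning = function(w) {
      warn <<- c(warn, conditionMessage(w))  # collect every warning
      invokeRestart("muffleWarning")         # and keep evaluating
    }
  )
  list(value = value, warning = warn, error = err)
}

# The name1 row from the question: all paired differences are (nearly) 0.1
x <- c(0, 0.5, -0.2, -0.2, -0.1, 0.4, -0.3, -0.3)
res <- catch_conditions(t.test(x[1:4], x[5:8], paired = TRUE)$p.value)
res$error   # message of the caught error, or NULL if the test succeeded
```

Applied row by row, this gives one list per row, so the failing rows can be inspected after the whole batch has run.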

You may also find some good ideas here: How to tell lapply to ignore an error and process the next thing in the list?
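
One common way to do this, sketched here under assumptions, is to wrap the t.test call in tryCatch so that a failing row yields NA instead of stopping apply. The function name safe_paired_p and the small data frame dat (standing in for ttest_in.txt, using the two rows shown in the question) are made up for illustration.

```r
# Stand-in for the input file: the two data rows shown in the question
dat <- data.frame(
  header = c("name1", "name2"),
  g1.1 = c(0, 23.2),    g1.2 = c(0.5, 24.4),
  g1.3 = c(-0.2, 24.5), g1.4 = c(-0.2, 27.2),
  g2.1 = c(-0.1, 15.5), g2.2 = c(0.4, 16.5),
  g2.3 = c(-0.3, 17.7), g2.4 = c(-0.3, 20.0)
)

# Hypothetical helper: paired t-test p-value, or NA if t.test stops with an error
safe_paired_p <- function(x) {
  tryCatch(
    t.test(x[1:4], x[5:8], paired = TRUE)$p.value,
    error = function(e) NA_real_
  )
}

# Same apply call as in the question, but one bad row no longer aborts the batch
pv.list <- apply(dat[, 2:9], 1, safe_paired_p)
pv.list
```

Rows where t.test raises an error (such as 'data are essentially constant') end up as NA in pv.list and can be located afterwards with which(is.na(pv.list)).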
