chisq.test Error Message

Problem description

Here's a problem I'm encountering:

Sample data

df <- data.frame(1,2,3,4,5,6,7,8)
df <- rbind(df,df,df,df)

What I would like to do is find the p-value from chisq.test for 1,2,3 vs. 4,5,6, that is, columns 1 to 3 against columns 4 to 6 in the first row of the data.frame defined above.

Let's try it:

chisq.test(c(1,2,3),c(4,5,6))$p.value ## this works.

But when I try to do it by calling the columns/rows...

chisq.test(df[1,1:3],df[1,4:6])$p.value

Gives: Error in complete.cases(x, y) : not all arguments have the same length

Interesting, because that doesn't seem to be true:

length(df[1,1:3])
length(df[1,4:6])

Any thoughts on how to change the notation to get the desired result?

Recommended answer

?chisq.test tells us:

Arguments:

       x: a numeric vector or matrix. ‘x’ and ‘y’ can also both be
          factors.

       y: a numeric vector; ignored if ‘x’ is a matrix.  If ‘x’ is a
          factor, ‘y’ should be a factor of the same length.

If we look at df as per your question, the subsets you define are:

> is.numeric(df[1,1:3])
[1] FALSE
> is.vector(df[1,1:3])
[1] FALSE
> is.matrix(df[1,1:3])
[1] FALSE

The same holds for your other subset. What happens next is in the lap of the gods. Internally, because df[1,1:3] is a data frame, it is first converted to a one-row matrix and thence to a vector:

Browse[2]> x ## here x is df[1,1:3]
[1] 1 2 3

whilst df[1,4:6] (y in the chisq.test function) is left untouched:

Browse[2]> y
  X4 X5 X6
1  4  5  6

and when the code calls complete.cases(x,y), we get the error you report:

Browse[2]> complete.cases(x, y)
Error in complete.cases(x, y) : not all arguments have the same length

complete.cases calls internal code, so we can't step through what is going on, but essentially R thinks x and y are not of the same length because they are of different types: x is now a plain vector of three values, whereas y is still a data frame with a single row, which complete.cases counts as one case.
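The mismatch can be reproduced outside chisq.test. This is a minimal sketch, assuming that chisq.test flattens x roughly as as.matrix followed by as.vector (inferred from the debugger output above, not quoted from its source); the names x, y and msg are only for illustration.

```r
# Rebuild the first row of the example data
df <- data.frame(1, 2, 3, 4, 5, 6, 7, 8)

# Assumption: roughly what happens to x inside chisq.test
x <- as.vector(as.matrix(df[1, 1:3]))

# y is passed through unchanged, so it is still a one-row data frame
y <- df[1, 4:6]

length(x)  # number of elements in the vector
length(y)  # number of columns in the data frame
nrow(y)    # number of rows, i.e. cases

# Capture the error message instead of stopping the script
msg <- tryCatch(complete.cases(x, y), error = function(e) conditionMessage(e))
print(msg)
```

length() on a data frame counts columns, which is why the two length() calls in the question look equal even though the two objects disagree on the number of cases.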

@Prasad provides a workaround, namely unlisting the two data frames you supply to chisq.test into vectors.
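@Prasad's exact code is not reproduced here, so the following is only a sketch of what the unlisting approach might look like, using base R's unlist(); the variable names are illustrative.

```r
df <- data.frame(1, 2, 3, 4, 5, 6, 7, 8)

# unlist() turns each one-row data frame into a (named) numeric vector
x <- unlist(df[1, 1:3])
y <- unlist(df[1, 4:6])

# With counts this small, chisq.test warns that the chi-squared
# approximation may be incorrect; the warning is suppressed here
p <- suppressWarnings(chisq.test(x, y)$p.value)
print(p)
```

Both arguments are now vectors of length three, so complete.cases sees the same number of cases for each.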

However, the way you are using the function doesn't make much sense, to me at least. One would normally store the data in the columns of a data frame rather than in its rows. It might not look like there is a difference, but the columns of a data frame are its components, like the components of a list. Each individual component (column) is a discrete entity, a vector of data on the n observations in the data frame. If we transpose your df (and cast back to a data frame) to reflect a more natural data set-up:

> df2 <- data.frame(t(df))

then we can use the approach you did, but index the separate rows of the first column of df2 (rather than the separate columns of the first row of df) in the chisq.test call:

> chisq.test(df2[1:3,1], df2[4:6,1])

    Pearson's Chi-squared test

data:  df2[1:3, 1] and df2[4:6, 1] 
X-squared = 6, df = 4, p-value = 0.1991

Warning message:
In chisq.test(df2[1:3, 1], df2[4:6, 1]) :
  Chi-squared approximation may be incorrect

This works, because R is able to drop the empty dimension in both subsets, so both inputs are vectors of the appropriate length:

> df2[1:3,1] ## drops the empty dimension!
[1] 1 2 3
> is.vector(df2[1:3,1])
[1] TRUE
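For contrast, a small sketch of how the drop argument of data frame indexing changes what comes back; the names v, d and r are only for illustration.

```r
df <- data.frame(1, 2, 3, 4, 5, 6, 7, 8)
df <- rbind(df, df, df, df)
df2 <- data.frame(t(df))

v <- df2[1:3, 1]                # single column: dimension dropped, plain vector
d <- df2[1:3, 1, drop = FALSE]  # same cells, kept as a one-column data frame
r <- df[1, 1:3]                 # several columns of one row: still a data frame

is.vector(v)
is.data.frame(d)
is.data.frame(r)
```

Selecting a single column returns the underlying vector by default, whereas selecting several columns of one row cannot be simplified that way, because each column is a separate component.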
