Test whether a dataframe is a sorted version of another dataframe


Problem description


Is it feasible to test whether some dataframe is simply a sorted version of another dataframe? For example, if I have two dataframes a and b, is there some way to easily determine whether a is simply a reordered version of b (or vice versa)?

Here's a trivial example:

a <- data.frame(x1=1:10, x2=11:20, x3=1:2)
b <- a[order(a$x3, a$x1, decreasing=TRUE),]

The closest thing I can think of is all.equal, but its output is not helpful (to me, at least):

> all.equal(a,b)
[1] "Attributes: < Component 2: Mean relative difference: 0.9545455 >"
[2] "Component 1: Mean relative difference: 0.9545455"                
[3] "Component 2: Mean relative difference: 0.3387097"                
[4] "Component 3: Mean relative difference: 0.6666667"

I imagine there is some obvious way to do this that is eluding me. I'm looking for a general solution that would scale well to many variables and many observations (the example above is only for demonstration).

Also: Ideally, such a function would also identify whether a is a subset of b (or vice versa).

Solution

I would explore the "compare" package:

library(compare)
compare(a, b, allowAll=TRUE)
# TRUE
#   sorted

Here, it shows that it had to sort the data before it found the data to be the same.
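
For comparison, the same "sort first, then compare" idea can be sketched in base R. This is only a rough illustration, not what compare does internally: the helper name `is_reordered` is hypothetical, it assumes both data frames have the same columns in the same order, and it relies on `all.equal`, so numeric columns are compared with a small tolerance.

```r
# Hypothetical helper (not part of the compare package): TRUE when `a` and `b`
# hold the same rows, possibly in a different order.
is_reordered <- function(a, b) {
  # Different shapes or column names can never be a pure row reordering.
  if (!identical(dim(a), dim(b)) || !identical(names(a), names(b))) {
    return(FALSE)
  }
  # Sort a data frame by all of its columns, left to right.
  sort_rows <- function(d) {
    d <- d[do.call(order, unname(as.list(d))), , drop = FALSE]
    rownames(d) <- NULL  # shuffled row names would otherwise count as a difference
    d
  }
  isTRUE(all.equal(sort_rows(a), sort_rows(b)))
}

a <- data.frame(x1 = 1:10, x2 = 11:20, x3 = 1:2)
b <- a[order(a$x3, a$x1, decreasing = TRUE), ]
is_reordered(a, b)
# TRUE
```

Resetting the row names matters: the first line of the `all.equal(a, b)` output in the question ("Attributes: ...") comes from the row names, which travel with the rows when they are reordered.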

Here's a slightly more complicated example, with factors coerced to character, rows reordered, and columns reordered:

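# stringsAsFactors = TRUE makes x4 a factor; since R 4.0.0 the default is FALSE
a <- data.frame(x1=1:10, x2=11:20, x3=1:2, x4 = letters[1:10], stringsAsFactors = TRUE)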
b <- with(a, a[order(x3, x1, decreasing=TRUE), ])
b$x4 <- as.character(b$x4)
b <- b[c(4, 1, 3, 2)]

Here's the result of compare:

compare(a, b, allowAll=TRUE)
# TRUE
#   reordered columns
#   [x4] coerced from <character> to <factor>
#   sorted
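
The question also asks about subsets, which the examples above do not cover. One possible base R sketch is below. The helper name `rows_subset` is hypothetical, and the approach is an assumption rather than anything compare provides: each row is pasted into a single string key, so values are compared by their character representation and duplicate rows are not counted.

```r
# Hypothetical helper: TRUE when every row of `a` also appears somewhere in `b`.
rows_subset <- function(a, b) {
  # The columns must match for a row-wise comparison to make sense.
  if (!identical(names(a), names(b))) {
    return(FALSE)
  }
  # Build one string key per row; "\r" is assumed not to occur in the data.
  key <- function(d) do.call(paste, c(unname(as.list(d)), sep = "\r"))
  all(key(a) %in% key(b))
}

a <- data.frame(x1 = 1:10, x2 = 11:20, x3 = 1:2)
rows_subset(a[c(3, 1), ], a)  # two rows of a, in a different order
rows_subset(a, a[1:5, ])      # a has rows that the smaller frame lacks
```

For large data, or when duplicate rows must be matched one for one, a join-based check (for example with `merge`) may be a better fit than string keys.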
