How to filter data in R or Excel?


Question


I have data in Excel 2007 that looks like this:

Class   Var1    Var2
1       20      27
2               13
2        
2       11   
2        
1       11      27
2               61
2        
2        
2        
1       20      27
2       30      71
2       
2       

All Class=2 rows are individuals who are members of the household (Class=1) directly above them. In my example, these rows belong to one household:

Class   Var1    Var2
1       20      27
2               13
2        
2       11   
2        

What I would like to do is remove the individual-level data and replace it with the household-level values. So for my example, the output should be:

Class   Var1    Var2
1       20      27
2       20      27
2       20      27
2       20      27   
2       20      27  

Is there an easy way to do this in Excel or in R? I have 200k+ rows of data, so doing it manually would take forever.

I am not sure how to remove the individual-level values. Once that is done, though, I can fill the gaps using Excel's Go to > Special > Blanks function.

Thanks, Dixi

Answer

I'm reading in your data as follows:

df <- read.table(textConnection(
"ClassObj   Var1    Var2
1       20      27
2       NA       13
2       NA  NA
2       11   NA
2        NA NA
1       11      27
2       NA      71
2       NA NA
2       NA NA
"), header = TRUE)

Then I process it as follows:

library(data.table)
dt <- data.table(df)

# Give each household row (ClassObj == 1) and the ClassObj == 2 rows below it
# a shared group number: mark the household rows with 1, then take the running sum
dt[, flag := 0]
dt[ClassObj == 1, flag := 1]
dt[, flag := cumsum(flag)]

# Within each group, copy the household row's values to every row
dt[, Var1 := .SD[ClassObj == 1, Var1], by = "flag"]
dt[, Var2 := .SD[ClassObj == 1, Var2], by = "flag"]

This gives the output:

> dt
   ClassObj Var1 Var2 flag
1:        1   20   27    1
2:        2   20   27    1
3:        2   20   27    1
4:        2   20   27    1
5:        2   20   27    1
6:        1   11   27    2
7:        2   11   27    2
8:        2   11   27    2
9:        2   11   27    2
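
For readers without data.table, the same idea can be sketched in base R. This is an illustrative alternative, not part of the original answer. It assumes the first row of the data is a household row (ClassObj == 1) and that every household row is immediately followed by its members. The names `household`, `head_row`, and `filled` are made up for this sketch.

```r
# Same example data as in the answer above.
df <- read.table(text = "
ClassObj Var1 Var2
1 20 27
2 NA 13
2 NA NA
2 11 NA
2 NA NA
1 11 27
2 NA 71
2 NA NA
2 NA NA
", header = TRUE)

# Running count of household rows: each ClassObj == 1 row starts a new group.
# Assumption: the first row is a household row, so no row gets group 0.
household <- cumsum(df$ClassObj == 1)

# Row index of the household row that heads each group.
head_row <- which(df$ClassObj == 1)

# Overwrite every row's Var1/Var2 with the values from its household row.
filled <- df
filled[, c("Var1", "Var2")] <- df[head_row[household], c("Var1", "Var2")]

print(filled)
```

Because `head_row[household]` is a plain integer index, this avoids any per-group loop, which should help with 200k+ rows. Any individual-level values that were present (such as the 13 or 71 in Var2) are overwritten, which is what the question asks for.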
