Changing value when multiple rows/columns combined do not meet a requirement

Question

I'm relatively new to R and working on a project with millions of rows, so I made this example:
I've got a matrix with three columns of data.
If the combination of [,1][,2][Farm] has fewer than two observations in total, the [Farm] value of that row gets changed to q99999. This way they fall in the same group for later analysis.

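    library(plyr)  # provides rbind.fill.matrix()

    A <- matrix(c(1,1,2,3,4,5,5), ncol = 7)
    B <- matrix(c(TRUE,TRUE,FALSE,TRUE,FALSE,TRUE,TRUE), ncol = 7)
    C <- matrix(c("Req","Req","Req","fd","as","f","bla"), ncol = 7)
    AB <- rbind.fill.matrix(A, B, C)  # stack the three 1 x 7 matrices
    AB <- t(AB)                       # transpose to 7 rows x 3 columns
    colnames(AB) <- c("Col1", "Col2", "Farm")
    format(AB)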

     Col1  Col2  Farm
    1 "1  " "1  " "Req"
    2 "1  " "1  " "Req"
    3 "2  " "0  " "Req"
    4 "3  " "1  " "fd "
    5 "4  " "0  " "as "
    6 "5  " "1  " "f  "
    7 "5  " "1  " "bla"

So the expected result is:

     Col1  Col2  Farm
    1 "1  " "1  " "Req"
    2 "1  " "1  " "Req"
    3 "2  " "0  " "q99999"
    4 "3  " "1  " "q99999"
    5 "4  " "0  " "q99999"
    6 "5  " "1  " "q99999"
    7 "5  " "1  " "q99999"

Now there are two groups in the "Farm" column: "Req" and "q99999".

What would be the best way in R to get this done while keeping performance as quick as possible?

Recommended answer

A possible solution using the data.table package:

    library(data.table)

    as.data.table(AB)[, Farm := ifelse(.N > 1, Farm, "q99999"), by = .(Col1, Col2, Farm)][]

    #   Col1 Col2   Farm
    #1:    1    1    Req
    #2:    1    1    Req
    #3:    2    0 q99999
    #4:    3    1 q99999
    #5:    4    0 q99999
    #6:    5    1 q99999
    #7:    5    1 q99999
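Since the question asks about speed on millions of rows, a two-step variant may be worth timing against the grouped `ifelse` call: count the group sizes once, then overwrite `Farm` only in the rows that need it. This is a minimal sketch rather than a benchmarked result. The helper column name `n` is made up for illustration, and the data is rebuilt directly as a data.table so the block runs without plyr.

```r
library(data.table)

# Example data copied from the question, built directly as a data.table.
DT <- data.table(
  Col1 = c("1", "1", "2", "3", "4", "5", "5"),
  Col2 = c("1", "1", "0", "1", "0", "1", "1"),
  Farm = c("Req", "Req", "Req", "fd", "as", "f", "bla")
)

# Step 1: store the size of each (Col1, Col2, Farm) group in a helper column.
# `n` is a hypothetical column name chosen for this sketch.
DT[, n := .N, by = .(Col1, Col2, Farm)]

# Step 2: overwrite Farm only where the group has fewer than two observations.
DT[n < 2L, Farm := "q99999"]

# Step 3: drop the helper column again.
DT[, n := NULL]

print(DT)
```

Both `:=` steps modify `DT` by reference, so no copy of the table is made; whether this is actually faster than the one-liner on real data would need to be measured.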

Or in base R with ave:

    AB[,'Farm'] = ave(AB[,'Farm'], do.call(c,apply(AB,2,list)), FUN=function(x) ifelse(length(x)==1, 'q99999',x))

    #  Col1 Col2 Farm    
    #1 "1"  "1"  "Req"   
    #2 "1"  "1"  "Req"   
    #3 "2"  "0"  "q99999"
    #4 "3"  "1"  "q99999"
    #5 "4"  "0"  "q99999"
    #6 "5"  "1"  "q99999"
    #7 "5"  "1"  "q99999"
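The same base R idea can be spelled out in separate steps, which may be easier to read and adapt: build one key per row, count how often each key occurs, and relabel the rows whose count is below two. This is a sketch under assumptions: the names `key` and `n` are invented for illustration, the `"|"` separator is assumed not to occur in the data, and the matrix is rebuilt with `cbind()` so the block runs on its own.

```r
# Example matrix from the question, built with cbind() so no extra package is needed.
AB <- cbind(
  Col1 = c("1", "1", "2", "3", "4", "5", "5"),
  Col2 = c("1", "1", "0", "1", "0", "1", "1"),
  Farm = c("Req", "Req", "Req", "fd", "as", "f", "bla")
)

# One key string per row; "|" is assumed not to appear in any of the values.
key <- paste(AB[, "Col1"], AB[, "Col2"], AB[, "Farm"], sep = "|")

# Number of rows sharing each key, aligned with the rows of AB.
n <- ave(seq_along(key), key, FUN = length)

# Relabel rows that belong to groups with fewer than two observations.
AB[n < 2, "Farm"] <- "q99999"

AB
```

Changing the threshold in `n < 2` is enough to apply a different minimum group size.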
