Replace rows in a data frame based on a condition

Question

This is a follow-up to this question: "Dynamically replace rows in a data frame with a vector".

I have a data.frame, for example:

d <- read.table(text='   V1 V2  V3  V4  V5  V6  V7
1 1 a 2 3 4 9 6
2 1 b 2 2 4 5 NA
3 1 c 1 3 4 5 8
4 1 d 1 2 3 6 9
5 2 a 1 2 3 4 5
6 2 b 1 4 5 6 7
7 2 c 1 2 3 5 8
8 2 d 2 3 6 7 9', header=TRUE)

Now I want to take one row, for example the first one (1a), and:

Get the min and max values from that row. In this case min = 2 and max = 9 (note that some values in between are missing; for example, there is no 5, 7, or 8 in that row).

Then I want to replace that row with the full sequence, including the missing values, and extend it (the row will be longer than all the others, as it will run from 2 to 9: 2, 3, 4, 5, 6, 7, 8, 9). The whole data.frame should then be extended automatically with NA columns for the other rows, which are not as long as the one I replaced.

The following code achieves this:

row.to.change <- 1
(new.row <- seq(min(d[row.to.change,c(-1, -2)], na.rm=TRUE), max(d[row.to.change,c(-1,-2)], na.rm=TRUE)))
(num.add <- length(new.row) - ncol(d) + 2)
# [1] 3
if (num.add > 0) {
  d <- cbind(d, replicate(num.add, rep(NA, nrow(d))))
} else if (num.add <= 0) {
  new.row <- c(new.row, rep(NA, -num.add))
}

It then writes the new row into the data.frame and finally renames the headers of the extended data.frame to the default names:

d[row.to.change,c(-1, -2)] <- new.row
colnames(d) <- paste0("V", seq_len(ncol(d)))

This works for the single row that I specify in row.to.change, but how can I make it work for, say, all rows that have a 'b' in the second column? Something like "do this where d$V2 == 'b'", keeping in mind that the data.frame may be 5000 rows long.

Recommended answer

You have already solved it. Just wrap your code in a function and then apply it to each row of your data.

rtc <- function(row.to.change) {
  # Values of this row, without the two key columns V1 and V2
  vals <- d[row.to.change, c(-1, -2)]
  new.row <- seq(min(vals, na.rm = TRUE), max(vals, na.rm = TRUE))
  # Pad with NA when the sequence is shorter than the existing value columns
  num.add <- length(new.row) - ncol(d) + 2
  if (num.add <= 0) {
    new.row <- c(new.row, rep(NA, -num.add))
  }
  new.row
}

# Apply to every row of the data
newr <- lapply(seq_len(nrow(d)), rtc)
# To change only specific rows, e.g. those with "b" in V2, use instead:
# newr <- lapply(seq_len(nrow(d)), function(z) if (d$V2[z] == "b") rtc(z) else as.numeric(d[z, c(-1, -2)]))

# Pad every row with NA to the length of the longest one
mxl <- max(sapply(newr, length))
newr <- lapply(newr, function(z) if (length(z) < mxl) c(z, rep(NA, mxl - length(z))) else z)

# Add NA columns if the longest row no longer fits
if (ncol(d) - 2 < mxl) {
  d <- cbind(d, replicate(mxl - ncol(d) + 2, rep(NA, nrow(d))))
}

# Write the rows back and restore the default column names
d[, c(-1, -2)] <- do.call(rbind, newr)
colnames(d) <- paste0("V", seq_len(ncol(d)))

d

  V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11
1  1  a  2  3  4  5  6  7  8   9  NA
2  1  b  2  3  4  5 NA NA NA  NA  NA
3  1  c  1  2  3  4  5  6  7   8  NA
4  1  d  1  2  3  4  5  6  7   8   9
5  2  a  1  2  3  4  5 NA NA  NA  NA
6  2  b  1  2  3  4  5  6  7  NA  NA
7  2  c  1  2  3  4  5  6  7   8  NA
8  2  d  2  3  4  5  6  7  8   9  NA
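
The output above is for the whole-data case. For the conditional case asked about in the question (only rows with 'b' in V2), the same idea can be packed into a self-contained helper that takes the data as an argument instead of reading the global d. This is only a minimal sketch, not the answerer's code: the name fill_rows and its arguments df, rows and key_cols are hypothetical, and it assumes that every non-key column is numeric and that each selected row has at least one non-NA value.

```r
# Hypothetical helper (not from the original answer): replace the value
# columns of the selected rows with the full sequence min:max and pad all
# rows to a common width with NA.
fill_rows <- function(df, rows = rep(TRUE, nrow(df)), key_cols = 1:2) {
  vals <- as.matrix(df[, -key_cols, drop = FALSE])  # value columns only
  new_rows <- lapply(seq_len(nrow(df)), function(i) {
    if (rows[i]) {
      seq(min(vals[i, ], na.rm = TRUE), max(vals[i, ], na.rm = TRUE))
    } else {
      unname(vals[i, ])                             # unselected rows stay as they are
    }
  })
  # Never shrink below the existing number of value columns
  width <- max(lengths(new_rows), ncol(vals))
  padded <- do.call(rbind, lapply(new_rows, function(z) {
    c(z, rep(NA, width - length(z)))
  }))
  out <- cbind(df[, key_cols, drop = FALSE], as.data.frame(padded))
  colnames(out) <- paste0("V", seq_len(ncol(out)))
  out
}

# Example data from the question
d <- read.table(text = "V1 V2 V3 V4 V5 V6 V7
1 1 a 2 3 4 9 6
2 1 b 2 2 4 5 NA
3 1 c 1 3 4 5 8
4 1 d 1 2 3 6 9
5 2 a 1 2 3 4 5
6 2 b 1 4 5 6 7
7 2 c 1 2 3 5 8
8 2 d 2 3 6 7 9", header = TRUE)

# Only rows where V2 is "b" are expanded
res <- fill_rows(d, rows = d$V2 == "b")
res
```

With the example data only rows 2 and 6 change: row 2 becomes 2, 3, 4, 5 and row 6 becomes 1 to 7, so the result has seven value columns (V3 to V9) and all other rows are padded with NA. Calling fill_rows(d) without a rows argument expands every row, as in the answer above.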
