Take difference between first and last observations in a row, where each row is different
Problem description
I have data that looks like the following:
Region X2012 X2013 X2014 X2015 X2016 X2017
1 1 10 11 12 13 14 15
2 2 NA 17 14 NA 23 NA
3 3 12 18 18 NA 23 NA
4 4 NA NA 15 28 NA 38
5 5 14 18.5 16 27 25 39
6 6 15 NA 17 27.5 NA 39
The numbers are irrelevant here but what I am trying to do is take the difference between the earliest and latest observed points in each row to make a new column for the difference where:
Region Diff
1 (15 - 10) = 5
2 (23 - 17) = 6
and so on, showing only the final result rather than the subtraction itself. Ideally I would just subtract the 2012 column from the 2017 column, but since any row's first observation could start at any column and its last observation could end at any column, I am unsure how to take the difference.
A dplyr solution would be ideal but any solution at all is appreciated.
Define a function that drops the NAs from its vector argument and returns the last remaining element minus the first, then apply it to each row.
lastMinusFirst <- function(x, y = na.omit(x)) tail(y, 1) - y[1]
transform(DF, diff = apply(DF[-1], 1, lastMinusFirst))
giving:
Region X2012 X2013 X2014 X2015 X2016 X2017 diff
1 1 10 11.0 12 13.0 14 15 5
2 2 NA 17.0 14 NA 23 NA 6
3 3 12 18.0 18 NA 23 NA 11
4 4 NA NA 15 28.0 NA 38 23
5 5 14 18.5 16 27.0 25 39 25
6 6 15 NA 17 27.5 NA 39 24
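Since the question asks for dplyr, the same helper can also be called row by row inside a pipeline. This is only a sketch, assuming dplyr 1.0.0 or later (for rowwise() together with c_across()). The data frame is rebuilt from the question's table so that the block runs on its own, and the name result is just illustrative.

```r
library(dplyr)

# Data from the question, typed in column by column
DF <- data.frame(
  Region = 1:6,
  X2012 = c(10, NA, 12, NA, 14, 15),
  X2013 = c(11, 17, 18, NA, 18.5, NA),
  X2014 = c(12, 14, 18, 15, 16, 17),
  X2015 = c(13, NA, NA, 28, 27, 27.5),
  X2016 = c(14, 23, 23, NA, 25, NA),
  X2017 = c(15, NA, NA, 38, 39, 39)
)

# Same helper as in the answer: last non-NA value minus first non-NA value
lastMinusFirst <- function(x, y = na.omit(x)) tail(y, 1) - y[1]

result <- DF %>%
  rowwise() %>%                                   # treat each row as its own group
  mutate(diff = lastMinusFirst(c_across(starts_with("X")))) %>%
  ungroup()                                       # drop the rowwise grouping again

result$diff
```

Selecting the year columns with starts_with("X") keeps Region out of the calculation, which plays the same role as DF[-1] in the base R version.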
Note
The input in reproducible form:
Lines <- "Region X2012 X2013 X2014 X2015 X2016 X2017
1 1 10 11 12 13 14 15
2 2 NA 17 14 NA 23 NA
3 3 12 18 18 NA 23 NA
4 4 NA NA 15 28 NA 38
5 5 14 18.5 16 27 25 39
6 6 15 NA 17 27.5 NA 39"
DF <- read.table(text = Lines)
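One edge case worth keeping in mind is a row with no observed values at all. After the NAs are dropped nothing is left, so lastMinusFirst returns a zero-length result for that row, and that does not fit into a single column. A possible guard, sketched here under the assumption that NA is the desired answer for such rows, is shown below. The name lastMinusFirstSafe and the small DF2 are hypothetical and only serve the illustration; the first two rows are taken from the question, and the third is entirely missing.

```r
# Hypothetical NA-safe variant: returns NA when a row has no observed values
lastMinusFirstSafe <- function(x) {
  y <- x[!is.na(x)]                      # keep only the observed values
  if (length(y) == 0) return(NA_real_)   # nothing observed in this row
  y[length(y)] - y[1]                    # last observed minus first observed
}

# Rows 1 and 2 from the question plus one row with no observations
DF2 <- data.frame(
  Region = c(1, 2, 6),
  X2012 = c(10, NA, NA),
  X2013 = c(11, 17, NA),
  X2014 = c(12, 14, NA),
  X2015 = c(13, NA, NA),
  X2016 = c(14, 23, NA),
  X2017 = c(15, NA, NA)
)

DF2$diff <- apply(DF2[-1], 1, lastMinusFirstSafe)
DF2
```

For rows that do have at least one observation, this variant gives the same result as lastMinusFirst.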
Update
Fixed.