Take difference between first and last observations in a row, where each row is different


Problem description

I have data that looks like the following:

  Region X2012 X2013 X2014 X2015 X2016 X2017
1      1    10    11    12    13    14    15
2      2    NA    17    14    NA    23    NA
3      3    12    18    18    NA    23    NA
4      4    NA    NA    15    28    NA    38
5      5    14  18.5    16    27    25    39
6      6    15    NA    17  27.5    NA    39

The numbers are irrelevant here but what I am trying to do is take the difference between the earliest and latest observed points in each row to make a new column for the difference where:

Region              Diff
     1     (15 - 10) = 5
     2     (23 - 17) = 6

and so on, not actually showing the subtraction but only the final result. Ideally I would just subtract the 2012 column from the 2017 column, but since any row's first observation could start at any column and its last could end at any column, I am unsure how to take the difference.

A dplyr solution would be ideal but any solution at all is appreciated.

Solution

Define a function that returns the last element minus the first element of its vector argument, ignoring NAs, and apply it to each row.

lastMinusFirst <- function(x, y = na.omit(x)) tail(y, 1) - y[1]
transform(DF, diff = apply(DF[-1], 1, lastMinusFirst))
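One caveat worth noting: for a row with no observations at all, `tail(y, 1)` is empty, so the function above returns a zero-length result and `apply` can no longer simplify to a plain vector. Below is a minimal sketch of a guarded variant, assuming such rows should simply get `NA`. The name `lastMinusFirstSafe` and the small data frame are made up for this illustration and are not part of the original answer.

```r
# Hypothetical guarded variant: returns NA when a row has no observations.
lastMinusFirstSafe <- function(x) {
  y <- x[!is.na(x)]                      # keep observed values, in column order
  if (length(y) == 0) return(NA_real_)   # nothing observed in this row
  y[length(y)] - y[1]                    # latest observation minus earliest
}

# Made-up data: row 3 has no observations at all.
DF <- data.frame(
  Region = 1:3,
  X2012  = c(10, NA, NA),
  X2013  = c(11, 17, NA),
  X2014  = c(15, 23, NA)
)

# Same pattern as the answer: apply the function to each row of the year columns.
DF$diff <- apply(DF[-1], 1, lastMinusFirstSafe)
DF
```

Rows 1 and 2 are handled exactly as before; only the empty row takes the `NA` branch.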

giving:

  Region X2012 X2013 X2014 X2015 X2016 X2017 diff
1      1    10  11.0    12  13.0    14    15    5
2      2    NA  17.0    14    NA    23    NA    6
3      3    12  18.0    18    NA    23    NA   11
4      4    NA    NA    15  28.0    NA    38   23
5      5    14  18.5    16  27.0    25    39   25
6      6    15    NA    17  27.5    NA    39   24
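For larger data, a vectorised alternative to the row-wise `apply` may be worth considering. The sketch below is not from the original answer: it uses base R's `max.col` to locate the first and last observed column in each row and then picks those cells by matrix indexing. The variable names (`m`, `observed`, `firstCol`, `lastCol`) are illustrative, and the data frame is rebuilt from the question's table so the block runs on its own.

```r
# Example data from the question.
DF <- data.frame(
  Region = 1:6,
  X2012  = c(10, NA, 12, NA, 14, 15),
  X2013  = c(11, 17, 18, NA, 18.5, NA),
  X2014  = c(12, 14, 18, 15, 16, 17),
  X2015  = c(13, NA, NA, 28, 27, 27.5),
  X2016  = c(14, 23, 23, NA, 25, NA),
  X2017  = c(15, NA, NA, 38, 39, 39)
)

m <- as.matrix(DF[-1])        # year columns only
observed <- !is.na(m)         # TRUE where a value is present

# Column index of the first and last observed value in each row:
# TRUE is the row maximum, and ties are broken towards the first or last column.
firstCol <- max.col(observed, ties.method = "first")
lastCol  <- max.col(observed, ties.method = "last")

rows <- seq_len(nrow(m))
# Indexing with a two-column (row, column) matrix picks one cell per row.
DF$diff <- m[cbind(rows, lastCol)] - m[cbind(rows, firstCol)]
DF
```

A row with no observations should come out as `NA` here, since both selected cells are `NA`, but that case is worth checking against your own data.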

Note

The input in reproducible form:

Lines <- "Region X2012 X2013 X2014 X2015 X2016 X2017
1      1    10    11    12    13    14    15
2      2    NA    17    14    NA    23    NA
3      3    12    18    18    NA    23    NA
4      4    NA    NA    15    28    NA    38
5      5    14  18.5    16    27    25    39
6      6    15    NA    17  27.5    NA    39"
DF <- read.table(text = Lines)

Update

Fixed.
