How to extract certain rows


Problem description

As you can see, I have Price and Day columns below:

 Price  Day
    2   1
    5   2
    8   3
    11  4
    14  5
    17  6
    20  7
    23  8
    26  9
    29  10
    32  11
    35  12
    38  13
    41  14
    44  15
    47  16
    50  17
    53  18
    56  19
    59  20

Then I want the output below:

  Difference    Day
    12  5
    15  10
    15  15
    15  20

So now I have the difference in prices every 5 days. It basically subtracts the 1st day's price from the 5th day's price, then the 5th day's price from the 10th day's, and so on. I already wrote code that separates my data into 5-day intervals, but I want code that computes the 5th day minus the 1st day, the 10th day minus the 5th day, etc. So the code should look something like this:

difference<-tapply(Price[,1],Day, ____________)

So basically Price[,1] is my Price data, while "Day" is the variable I created to separate my Day data into 5-day intervals. I'm thinking that in the blank I could put a function or another variable that would subtract the 1st day's price from the 5th day's, then the 5th day's from the 10th day's, etc. You don't have to help me separate my days into intervals, just with the "difference" part. Thanks, guys.

Recommended answer

Here's one option, assuming your data.frame is called "SODF":

within(SODF[c(1, seq(5, nrow(SODF), 5)), ], { 
  Price <- diff(c(0, Price)) 
})[-1, ]
#    Price Day
# 5     12   5
# 10    15  10
# 15    15  15
# 20    15  20


The first step is basic subsetting. According to your description and expected answer, you want the first row, and then every fifth row starting from row 5:

> SODF[c(1, seq(5, nrow(SODF), 5)), ]
   Price Day
1      2   1
5     14   5
10    29  10
15    44  15
20    59  20

From there, you can use diff on the "Price" column. Since diff returns a vector one element shorter than its input, you need to "pad" the input vector so that the result still matches the 5 rows of the subset, which I did with diff(c(0, Price)).

# Correct values, but only 4 of them; the subset has 5 rows
> diff(SODF[c(1, seq(5, nrow(SODF), 5)), "Price"])
[1] 12 15 15 15

Then, the [-1, ] at the end just deletes the extraneous first row, whose "difference" is only the first price minus the padding value 0.
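For anyone who wants to run the steps above one at a time, here is a minimal sketch. It assumes SODF can be rebuilt from the question's table as an arithmetic sequence of prices; the names idx, sub, and result are only illustrative and are not part of the original answer.

```r
# Assumed reconstruction of the question's data: Price 2, 5, ..., 59 and Day 1..20
SODF <- data.frame(Price = seq(2, 59, by = 3), Day = 1:20)

# Step 1: keep the first row, then every fifth row starting from row 5
idx <- c(1, seq(5, nrow(SODF), 5))
sub <- SODF[idx, ]

# Step 2: pad with 0 so diff() returns one value per row of `sub`
sub$Price <- diff(c(0, sub$Price))

# Step 3: drop the first row, which only holds 2 - 0
result <- sub[-1, ]
result
```

This should give the same four rows as the within() call, just with the intermediate objects kept around for inspection.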

As @geektrader points out in the comments below (thanks!), as an alternative to using:

SODF[c(1, seq(5, nrow(SODF), 5)), ]

as your input data.frame, you may consider using the following instead:

rbind(SODF[1, ], SODF[SODF$Day %% 5 == 0, ])

The difference between the two approaches is that the first simply subsets by row number, while the second subsets according to the value in the "Day" column, extracting the rows where "Day" is a multiple of 5. The second approach might be useful, for instance, when there are missing rows in the dataset.
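To make that difference concrete, here is a small sketch. It assumes the same reconstructed SODF as in the question's table and then drops days 3 and 7 to simulate missing rows; gappy, by_position, and by_value are hypothetical names used only for this illustration.

```r
# Assumed reconstruction of the question's data
SODF <- data.frame(Price = seq(2, 59, by = 3), Day = 1:20)

# Simulate missing rows by removing days 3 and 7
gappy <- SODF[!(SODF$Day %in% c(3, 7)), ]

# Approach 1: subset by row position (no longer lines up with the days)
by_position <- gappy[c(1, seq(5, nrow(gappy), 5)), ]

# Approach 2: subset by the value of Day
by_value <- rbind(gappy[1, ], gappy[gappy$Day %% 5 == 0, ])

by_position$Day       # 1 6 12 17
by_value$Day          # 1 5 10 15 20
diff(by_value$Price)  # 12 15 15 15
```

With rows missing, positional subsetting picks up days 6, 12, and 17, whereas the value-based subset still lands on days 5, 10, 15, and 20. Note that the value-based version would still miss a day that is itself absent from the data.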
