How to add a new column and aggregate values in R

Question

I am completely new to gnuplot and am only trying this because I need to learn it. I have values in three columns, where the first column is the file name (date and time, at one-hour intervals) and the remaining two columns represent two different entities, Prop1 and Prop2.

Datetime             Prop1        Prop2

20110101_0000.txt     2            5
20110101_0100.txt     2            5
20110101_0200.txt     2            5
...
20110101_2300.txt     2            5
20110201_0000.txt     2            5
20110201_0100.txt     2            5
...
20110201_2300.txt     2            5
...

I need to aggregate the data by the hour of the day (the **_0100 part), which is given by the last four digits of the file name. So I want to create another column called hour that tells me the hour of the day: 0000 = 0h, 0100 = 1h, ..., 2200 = 22h, etc.

I then want to get the sum of Prop1 and Prop2 for each hour, so that in the end I get something like:

Hour  Prop1   Prop2
0h     120     104
1h     230     160
...
10h    90      110
...
23h    100    200 

and then get a line plot of Prop1 and Prop2.

Recommended answer

A general solution with gsub:

Data$Hour <- gsub(".+_(\\d+)\\.txt", "\\1", Data$Datetime)

Edit:

You can use Data$Hour <- substr(Data$Hour,1,2) to get just the hour. As noted in the comments, if Datetime always has exactly the same structure, you can use substr() directly:

Data$Hour <- substr(Data$Datetime,10,11)
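
As a minimal sketch of the two extraction approaches side by side: the file names below are hypothetical samples in the format shown in the question, and the variable names (datetime, hhmm, hour_regex, hour_fixed, hour_num, hour_label) are illustrative only. Converting the two-character string to an integer is an extra step not in the original answer; it is assumed here because it makes the hours sort numerically and matches the "0h, 1h, ..." labels asked for.

```r
# Hypothetical sample of file names in the format shown in the question
datetime <- c("20110101_0000.txt", "20110101_0100.txt",
              "20110101_2300.txt", "20110201_1000.txt")

# Approach 1: capture the four digits after the underscore,
# then keep the first two (the hour)
hhmm <- sub("^.+_(\\d{4})\\.txt$", "\\1", datetime)
hour_regex <- substr(hhmm, 1, 2)

# Approach 2: fixed positions (characters 10 and 11 of the file name)
hour_fixed <- substr(datetime, 10, 11)

# Assumed extra step: convert to integer and build "0h"-style labels
hour_num <- as.integer(hour_fixed)
hour_label <- paste0(hour_num, "h")
```

The fixed-position version is shorter, but it only works while every file name has the same length and layout; the regular expression tolerates a date prefix of a different length.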

Then you can use aggregate, tapply, by, or similar to do what you want. To sum both Prop1 and Prop2, you can use aggregate, e.g.:

aggregate(Data[2:3],list(Data$Hour),sum)
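
For comparison, here is a hedged sketch of two of the alternatives mentioned above, the formula interface of aggregate and tapply. The small data frame and its values are made up for illustration, and the result names by_hour and prop1_by_hour are not from the original answer.

```r
# Hypothetical data frame with the same columns as in the question
Data <- data.frame(
  Datetime = c("20110101_0000.txt", "20110101_0100.txt",
               "20110201_0000.txt", "20110201_0100.txt"),
  Prop1 = c(2, 3, 4, 5),
  Prop2 = c(5, 6, 7, 8),
  stringsAsFactors = FALSE
)
Data$Hour <- as.integer(substr(Data$Datetime, 10, 11))

# Formula interface: sum Prop1 and Prop2 within each Hour;
# the result keeps readable column names (Hour, Prop1, Prop2)
by_hour <- aggregate(cbind(Prop1, Prop2) ~ Hour, data = Data, FUN = sum)

# tapply handles one column at a time and returns an array
# whose names are the hours
prop1_by_hour <- tapply(Data$Prop1, Data$Hour, sum)
```

The formula interface drops rows with missing values by default, so the two forms of aggregate can give different sums when the data contain NA.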

The code above assumes the following dataset:

zz<-textConnection("Datetime             Prop1        Prop2
20110101_0000.txt     2            5
20110101_0100.txt     2            5
20110101_0200.txt     2            5
20110101_2300.txt     2            5
20110201_0000.txt     2            5
20110201_0100.txt     2            5
20110201_0200.txt     2            5
20110201_2300.txt     2            5")
Data <- read.table(zz, header = TRUE, as.is = TRUE)
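
The question also asks for a line plot, which the answer does not cover. One possible end-to-end sketch, using the sample dataset above, is shown here. Reading the data through the text argument of read.table, the result name hourly, and the choice of matplot with these colours and labels are all assumptions made for illustration, not part of the original answer.

```r
# Sample data from the answer, read via the text argument of read.table
Data <- read.table(text = "Datetime Prop1 Prop2
20110101_0000.txt 2 5
20110101_0100.txt 2 5
20110101_0200.txt 2 5
20110101_2300.txt 2 5
20110201_0000.txt 2 5
20110201_0100.txt 2 5
20110201_0200.txt 2 5
20110201_2300.txt 2 5", header = TRUE, as.is = TRUE)

# New column: hour of day as an integer (characters 10-11 of the file name)
Data$Hour <- as.integer(substr(Data$Datetime, 10, 11))

# Sum Prop1 and Prop2 within each hour
hourly <- aggregate(Data[c("Prop1", "Prop2")], list(Hour = Data$Hour), sum)

# Line plot: one line per property, hour of day on the x axis
matplot(hourly$Hour, as.matrix(hourly[c("Prop1", "Prop2")]),
        type = "l", lty = 1, col = c("black", "red"),
        xlab = "Hour of day", ylab = "Sum")
legend("topright", legend = c("Prop1", "Prop2"),
       lty = 1, col = c("black", "red"))
```

Keeping Hour as an integer places the hours at their numeric positions on the x axis, so gaps in the data (here between 2h and 23h) show up as long line segments rather than being hidden.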
