R根据事件更新值 [英] R update values based on event

查看:39
本文介绍了R根据事件更新值的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我最近发布了这个问题,该问题已经与我在笔记本电脑上本地使用的Mysql数据库有关。由于我没有在Mysql中找到解决该问题的方法,而其他人似乎也没有找到解决方法,因此我想再次发布它,但现在与R有关。我将数据库与 RMysql一起使用。包。这是问题所在:

I recently posted this question already related to my Mysql database which I am using locally on my laptop. Since I did not find a solution to the problem in Mysql and others did not seem to find one either I would like to post it again but now related to R. I use the database with the "RMysql" package. Here is the problem:

我有一个数据库表,其中包含几年来在多个字段上不同农业实践的数据。农民根据完成某项活动的日期在此输入信息。一个简化的示例如下所示,其中ID是一个自动递增的值。

I have a database table that contain data of different farming pratices during several years on several fields. Farmers entered there information based on the dates when a certain activity was done. A simplified example would look like this, where ID is an auto-increment value.

ID|Field|Date      |Activity  |
 1|A    |2012/08/01|Tillage   |
 2|A    |2012/08/24|Seeding   |
 3|A    |2013/03/05|Spraying  |
 4|A    |2013/03/05|Fertilizer|
 5|A    |2013/07/25|Harvest   |
 6|B    |2012/09/01|Tillage   |
 7|B    |2012/09/05|Seeding   |
 8|B    |2013/04/05|Spraying  |
 9|B    |2013/07/28|Harvest   |
10|B    |2010/08/24|Tillage   |
11|B    |2010/09/29|Seeding   |
12|B    |2011/05/01|Fertilizer|
13|B    |2011/07/12|Harvest   |
14|A    |2011/09/01|Seeding   |
15|A    |2011/10/10|Spraying  |
16|A    |2012/04/10|Fertilizer|
17|A    |2012/08/02|Harvest   |

现在,我想添加一个名为 Season的列,该字段仅包含收获田地的年份,但会针对上一个收获期至此收获的各个田地自动更新所有其他活动。请注意,不同田间的收获日期不同。另一点是,播种可能是在2011年完成的,而田间是在2012年收获的,因此,所有这些都应称为2012赛季。在此示例中,输出看起来像: p>

Now I would like to add a column called "Season" that contains only the year when a field was harvested but is automatically updated for all other activites between the last harvest and this harvest for individual fields. Note that harvest dates are different for different fields.Another point is that the seeding was maybe done in 2011 but the field was harvested in 2012 and all this should then be called season 2012. In this example the output would ideally look like:

ID|Field|Date      |Activity  |Season
 1|A    |2012/08/01|Tillage   |2013
 2|A    |2012/08/24|Seeding   |2013
 3|A    |2013/03/05|Spraying  |2013
 4|A    |2013/03/05|Fertilizer|2013
 5|A    |2013/07/25|Harvest   |2013
 6|B    |2012/09/01|Tillage   |2013
 7|B    |2012/09/05|Seeding   |2013
 8|B    |2013/04/05|Spraying  |2013
 9|B    |2013/07/28|Harvest   |2013
10|B    |2010/08/24|Tillage   |2011
11|B    |2010/09/29|Seeding   |2011
12|B    |2011/05/01|Fertilizer|2011
13|B    |2011/07/12|Harvest   |2011
14|A    |2011/09/01|Seeding   |2012
15|A    |2011/10/10|Spraying  |2012
16|A    |2012/04/10|Fertilizer|2012
17|A    |2012/08/02|Harvest   |2012

有人可以帮忙吗?

编辑:

如果我开始记录更多信息并提供所有活动,我需要更改什么?一个单独的专栏?
我尝试过:

What would I need to change if I start recording more information and give all "Activities" a separate column? I tried:

DF <- read.table(text="ID|Field|Date      |Tillage|Seeding|Fertilizer|Spraying|Harvest
 1|A    |2012/08/01|Plough   |NA|NA|NA|NA
 2|A    |2012/08/24|NA   |Wheat|NA|NA|NA
 3|A    |2013/03/05|NA  |NA|NA|ProduktA|NA
 4|A    |2013/03/05|NA|NA|TypeB|NA|NA
 5|A    |2013/07/25|NA   |NA|NA|NA|9t
 6|B    |2012/09/01|Plough   |NA|NA|NA|NA
 7|B    |2012/09/05|NA   |Barley|NA|NA|NA
 8|B    |2013/04/05|NA  |NA|NA|ProductB|NA
 9|B    |2013/07/28|NA   |NA|NA|NA|10t
10|B    |2010/08/24|Cultivator   |NA|NA|NA|NA
11|B    |2010/09/29|NA   |NA|NA|NA|NA
12|B    |2011/05/01|NA|NA|TypeB|NA|NA
13|B    |2011/07/12|NA   |NA|NA|NA|6t
14|A    |2011/09/01|NA   |Barley|NA|NA|NA
15|A    |2011/10/10|NA  |NA|NA|ProductC|NA
16|A    |2012/04/10|NA|NA|TypeA|NA|NA
17|A    |2012/08/02|NA   |NA|NA|NA|7t|", 
                 sep="|", header=TRUE, stringsAsFactors=FALSE)

DT <- data.table(DF)
DT[, Harvest:=gsub(" ", "", Harvest, fixed=TRUE)]
DT[, Date:=as.POSIXct(Date)]
setkeyv(DT, c("Field", "Date"))
DT[, Season:=cumsum(c("", !is.na(head(Harvest, -1)))), by=Field]
DT[, Season:=max(year(Date)), by=list(Field, Season)]

不幸的是,这没有用。有人能解决这个问题吗?

This does unfortunately not work. Can someone solve this,too?

推荐答案

假设每个田地每个季节都有收获(危险的假设,但是否则,我看不到这样做的方法):

Assuming there was a harvest recorded for each season on each field (a dangerous assumption, but otherwise I see no way to do this):

DF <- read.table(text="ID|Field|Date      |Activity  |
 1|A    |2012/08/01|Tillage   |
 2|A    |2012/08/24|Seeding   |
 3|A    |2013/03/05|Spraying  |
 4|A    |2013/03/05|Fertilizer|
 5|A    |2013/07/25|Harvest   |
 6|B    |2012/09/01|Tillage   |
 7|B    |2012/09/05|Seeding   |
 8|B    |2013/04/05|Spraying  |
 9|B    |2013/07/28|Harvest   |
10|B    |2010/08/24|Tillage   |
11|B    |2010/09/29|Seeding   |
12|B    |2011/05/01|Fertilizer|
13|B    |2011/07/12|Harvest   |
14|A    |2011/09/01|Seeding   |
15|A    |2011/10/10|Spraying  |
16|A    |2012/04/10|Fertilizer|
17|A    |2012/08/02|Harvest   |", 
                 sep="|", header=TRUE, stringsAsFactors=FALSE)

library(data.table)
DT <- data.table(DF[, 1:4])
DT[, Activity:=gsub(" ", "", Activity, fixed=TRUE)]
DT[, Date:=as.POSIXct(Date)]
setkeyv(DT, c("Field", "Date"))
DT[, Season:=cumsum(c("", head(Activity, -1)) == "Harvest"), by=Field]
DT[, Season:=max(year(Date)), by=list(Field, Season)]
#     ID Field       Date   Activity Season
#  1: 14 A     2011-09-01    Seeding   2012
#  2: 15 A     2011-10-10   Spraying   2012
#  3: 16 A     2012-04-10 Fertilizer   2012
#  4:  1 A     2012-08-01    Tillage   2012
#  5: 17 A     2012-08-02    Harvest   2012
#  6:  2 A     2012-08-24    Seeding   2013
#  7:  3 A     2013-03-05   Spraying   2013
#  8:  4 A     2013-03-05 Fertilizer   2013
#  9:  5 A     2013-07-25    Harvest   2013
# 10: 10 B     2010-08-24    Tillage   2011
# 11: 11 B     2010-09-29    Seeding   2011
# 12: 12 B     2011-05-01 Fertilizer   2011
# 13: 13 B     2011-07-12    Harvest   2011
# 14:  6 B     2012-09-01    Tillage   2013
# 15:  7 B     2012-09-05    Seeding   2013
# 16:  8 B     2013-04-05   Spraying   2013
# 17:  9 B     2013-07-28    Harvest   2013

PS:我认为这应该滚动连接也可以(而且效率更高),但无法立即看到如何做到这一点。

PS: I think this should also be possible (and more efficient) with a rolling join, but couldn't see immediately how to do that.

这篇关于R根据事件更新值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆