Extracting rows from a data frame according to a criterion based on whether values change across rows
Problem description
I have unsuccessfully tried to do the task described below, so any help will be much appreciated.
The largest table below contains data on fishers' quota ownership (and another variable, 'cpue') over time. I categorized fishers according to the number of quotas they own ('category'). Fishers may increase or reduce the number of quotas they own; therefore, their ownership category may also change. I need to extract the information every time a fisher changes their ownership: specifically, the row for the year just before the number of quotas increased or decreased. For instance, if the number of quotas was 20 in 2000 and 45 in 2001, I need the row for the year 2000. Additionally, I need a new column with a category indicating between which ownership levels the fisher is moving. The second table below shows the new data frame that I need to create from the extracted rows.
My data:
ID fisher year qtty category      cpue
 1      1 1998   13        1 0.5994452
 2      1 1999   13        1 0.6176183
 3      1 2000   13        1 0.6871764
 4      1 2001   20        2 0.3228005
 5      1 2002   20        2 0.6505336
 6      1 2003   20        2 0.8615834
 7      1 2004   20        2 0.6871764
 8      1 2005   20        2 0.7469739
 9      1 2006   20        2 0.7380952
10      1 2007   45        3 0.7516396
11      1 2008   45        3 0.6808454
12      1 2009   45        3 0.6734158
13      1 2010   45        3   0.70367
14      1 2011   45        3 0.5434572
15      1 2012   45        3 0.6181238
16      2 2000   50        3 0.5191856
17      2 2001   50        3 0.6098226
18      2 2002   50        3 1.0018519
19      2 2003   50        3 1.2049724
20      2 2004   50        3 0.5857708
21      2 2005   10        1 0.6744186
22      2 2006   10        1 0.8123333
23      2 2007   10        1 0.3228005
24      2 2008   10        1 0.6505336
25      2 2009   10        1 0.8615834
26      2 2010    0        4         0
27      3 1998   25        2 0.7469739
28      3 1999   25        2 0.7380952
29      3 2000   25        2 0.7516396
30      3 2001   25        2 0.6808454
31      3 2002   10        1 0.6734158
32      3 2003   10        1   0.70367
33      3 2004   10        1 0.5434572
34      3 2005   10        1 0.6181238
35      3 2006   45        3 0.4698849
36      3 2007   45        3 1.0714286
37      3 2008   45        3  1.242439
38      3 2009   45        3 1.0614261
39      3 2010   45        3 0.9761391
40      3 2011   45        3 1.0041898
41      3 2012   45        3 0.9429851
42      4 2005   45        3 0.9310958
43      4 2006   50        3 0.8932985
44      4 2007   50        3 0.7867613
45      4 2008   20        2 0.7994713
46      4 2009   20        2 0.9368927
47      4 2010   10        1 0.8123333
48      4 2011    0        4         0
49      5 1998   20        2 0.4698849
50      5 1999   20        2 1.0714286
51      5 2000   20        2  1.242439
52      5 2001   20        2 1.0614261
53      5 2002   20        2 0.9761391
54      5 2003   20        2 1.0041898
55      5 2004   20        2 0.7469739
56      5 2005    0        4 0.7380952
57      6 2000   55        3 0.7516396
58      6 2001   55        3 0.6808454
59      6 2002   55        3 0.6734158
60      6 2003   55        3 0.6505336
61      6 2004   55        3 0.8615834
62      6 2005   55        3 0.6871764
63      6 2006   55        3 0.6181238
64      6 2007    0        4         0
This is what I need:
ID fisher year qtty category      cpue category2
 3      1 2000   13        1 0.6871764         1
25      2 2009   10        1 0.8615834         1
34      3 2005   10        1 0.6181238         1
47      4 2010   10        1 0.8123333         1
 9      1 2006   20        2 0.7380952         2
30      3 2001   25        2 0.6808454         3
46      4 2009   20        2 0.9368927         3
44      4 2007   50        3 0.7867613         4
20      2 2004   50        3 0.5857708         5
25      2 2009   10        1 0.8615834         6
47      4 2010   10        1 0.8123333         6
55      5 2004   20        2 0.7469739         7
63      6 2006   55        3 0.6181238         8
The ownership categories are 1 (1-15 quotas), 2 (16-40 quotas), 3 (>40 quotas) and 4 (0 quotas, those who exited the fishery). The new category that I need should show the transition between the different ownership categories (e.g. category 1 is the transition from ownership level 1 to ownership level 2). Full details are in the following table:
From to category2
   1  2         1
   2  3         2
   2  1         3
   3  2         4
   3  1         5
   1  0         6
   2  0         7
   3  0         8
Thanks!!
Solution
With data as your first data frame and cats as the category table:
> w <- which(diff(data$fisher) == 0 & diff(data$category) != 0)
> merge(data.frame(data[w, ], From = data$category[w], to = data$category[w + 1]), cats, all.x = TRUE)[, -(1:2)]
   ID fisher year qtty category      cpue category2
1   3      1 2000   13        1 0.6871764         1
2  34      3 2005   10        1 0.6181238        NA
3  25      2 2009   10        1 0.8615834         6
4  47      4 2010   10        1 0.8123333         6
5  46      4 2009   20        2 0.9368927         3
6  30      3 2001   25        2 0.6808454         3
7   9      1 2006   20        2 0.7380952         2
8  55      5 2004   20        2 0.7469739         7
9  20      2 2004   50        3 0.5857708         5
10 44      4 2007   50        3 0.7867613         4
11 63      6 2006   55        3 0.6181238         8
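To make the steps easier to follow, here is a minimal self-contained sketch of the same idea on a small subset of the question's rows (fishers 2 and 3 only, with cpue left out). It rests on a few assumptions: the rows are sorted by fisher and then by year, and the lookup table cats codes an exit as 4 (as data$category does) rather than 0 as written in the question's table, since otherwise exit transitions would presumably not match. The helper names same_fisher, category_changed, before and result are only illustrative.

```r
# Illustrative subset of the question's data (fishers 2 and 3, cpue omitted).
data <- data.frame(
  ID       = c(20, 21, 25, 26, 30, 31, 34, 35),
  fisher   = c(2, 2, 2, 2, 3, 3, 3, 3),
  year     = c(2004, 2005, 2009, 2010, 2001, 2002, 2005, 2006),
  qtty     = c(50, 10, 10, 0, 25, 10, 10, 45),
  category = c(3, 1, 1, 4, 2, 1, 1, 3)
)

# Transition lookup table. Assumption: exits are coded as 4 (as in
# data$category), not 0 as written in the question's table.
cats <- data.frame(
  From      = c(1, 2, 2, 3, 3, 1, 2, 3),
  to        = c(2, 3, 1, 2, 1, 4, 4, 4),
  category2 = 1:8
)

# diff() compares each row with the next one, so element i describes the
# step from row i to row i + 1. This assumes rows are sorted by fisher, year.
same_fisher      <- diff(data$fisher) == 0    # FALSE at a fisher boundary
category_changed <- diff(data$category) != 0
w <- which(same_fisher & category_changed)    # index of the row BEFORE a change

# Attach the old (From) and new (to) category to each "before" row.
before <- data.frame(data[w, ], From = data$category[w], to = data$category[w + 1])

# Left join on From/to; transitions missing from cats get NA.
# The first two columns of the merged result are the keys, which are dropped.
result <- merge(before, cats, all.x = TRUE)[, -(1:2)]
result <- result[order(result$fisher, result$year), ]
print(result)
```

On this subset the rows with ID 20, 25, 30 and 34 are kept. The step from ID 26 to ID 30 is skipped even though the category changes there, because it crosses from one fisher to the next. ID 34 gets NA in category2 because the transition 1 to 3 is not listed in the category table, which is the same reason for the NA in the output above.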