Extract rows for the first occurrence of a variable in a data frame
Question
I have a data frame with two variables, Date and Taxa, and I want to get the date on which each taxon first occurs. The data frame has 172 rows containing 9 different dates and 40 different taxa, so my answer should have only 40 rows.
Taxa is a factor and Date is a date.
For example, my data frame (called 'species') is set up like this:
Date Taxa
2013-07-12 A
2011-08-31 B
2012-09-06 C
2012-05-17 A
2013-07-12 C
2012-09-07 B
I am looking for an answer like this:
Date Taxa
2012-05-17 A
2011-08-31 B
2012-09-06 C
I tried using:
t.first <- species[unique(species$Taxa),]
and it gave me the correct number of rows, but some taxa were repeated. If I just use unique(species$Taxa) it appears to give me the right set of taxa, but then I don't know the date when each one first occurred.
Thanks for your help.
Recommended answer
t.first <- species[match(unique(species$Taxa), species$Taxa),]
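should give you what you're looking for. match returns, for each unique taxon, the index of its first occurrence in species$Taxa, and those indices are the rows you need. Note that "first" here means first in row order, so if the data frame is not already sorted by Date, sort it before applying match to get the earliest date for each taxon.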
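As a minimal sketch of the approach, here it is applied to the sample data from the question. The sample rows are not in date order, so the sketch sorts by Date first. The names by_date and first_row are illustrative only and do not come from the original post.

```r
# Sample data from the question; note the rows are NOT in date order.
species <- data.frame(
  Date = as.Date(c("2013-07-12", "2011-08-31", "2012-09-06",
                   "2012-05-17", "2013-07-12", "2012-09-07")),
  Taxa = factor(c("A", "B", "C", "A", "C", "B"))
)

# match() alone picks the first ROW for each taxon, which for taxon A
# is the 2013-07-12 row rather than the earlier 2012-05-17 row.
first_row <- species[match(unique(species$Taxa), species$Taxa), ]

# Sort by date so that "first row" also means "earliest date".
by_date <- species[order(species$Date), ]
t.first <- by_date[match(unique(by_date$Taxa), by_date$Taxa), ]

# Optional: present the result ordered by taxon, as in the question.
t.first <- t.first[order(t.first$Taxa), ]
print(t.first)
```

With the sample data, t.first holds one row per taxon: A on 2012-05-17, B on 2011-08-31, and C on 2012-09-06, which matches the result requested in the question. After sorting, by_date[!duplicated(by_date$Taxa), ] selects the same rows and may read more directly.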