Extract original and duplicate result(s) from a data frame in R
Question
I use duplicate results to estimate the measurement uncertainty for chemical analyses. When I extract data from the laboratory database it consists largely of single results but with some samples tested twice, some more than twice (I have seen up to 12). I want to discard all the single analyses and just retain the duplicated results, but including the original result.
The samples are identified by a sample number that is common to the duplicate samples.
I can pull out the duplicates using duplicated(), but how do I retain the first result as well?
Thanks.
Recommended answer
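> # Example data: 10 rows with ids drawn at random from 1 to 5
> dat <- data.frame(
+   id = sample(1:5, 10, replace = TRUE),
+   x = rnorm(10)
+ )
> dat
##    id          x
## 1   1  0.7060512
## 2   4  0.6804117
## 3   2  0.2395902
## 4   2  1.5352574
## 5   1  0.2376593
## 6   4  0.8019506
## 7   1 -1.0506505
## 8   5  1.0554555
## 9   3  0.3637685
## 10  5 -0.8404215
> # duplicated() flags every occurrence after the first;
> # fromLast = TRUE flags every occurrence before the last.
> # Combining both with | keeps all rows whose id appears more than once.
> dat[duplicated(dat$id) | duplicated(dat$id, fromLast = TRUE),]
##    id          x
## 1   1  0.7060512
## 2   4  0.6804117
## 3   2  0.2395902
## 4   2  1.5352574
## 5   1  0.2376593
## 6   4  0.8019506
## 7   1 -1.0506505
## 8   5  1.0554555
## 10  5 -0.8404215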
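
The filter above can be wrapped in a small helper and cross-checked by counting rows per id. This is only a minimal sketch: the function name keep_replicated is hypothetical, and the data are fixed illustrative values (the ids mirror the example above, the x values are rounded) so that the result does not depend on random sampling.

```r
# Hypothetical helper: keep every row whose key value occurs more than once,
# including the first (original) occurrence.
keep_replicated <- function(df, key) {
  k <- df[[key]]
  df[duplicated(k) | duplicated(k, fromLast = TRUE), , drop = FALSE]
}

# Fixed illustrative data (not taken from a real laboratory database).
dat <- data.frame(
  id = c(1, 4, 2, 2, 1, 4, 1, 5, 3, 5),
  x  = c(0.71, 0.68, 0.24, 1.54, 0.24, 0.80, -1.05, 1.06, 0.36, -0.84)
)

reps <- keep_replicated(dat, "id")

# Cross-check: count the rows per id with ave() and keep ids seen more than once.
n_per_id <- ave(seq_along(dat$id), dat$id, FUN = length)
alt <- dat[n_per_id > 1, , drop = FALSE]
```

Both reps and alt drop only the row with id 3, the single sample that was analysed once. The counting variant may be handy if a different threshold is needed later, for example n_per_id > 2 to keep only samples tested more than twice.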