Remove rows from a panel with duplicate index values in R
Question
Here is some sample data:
date <- c("2012-01-01","2012-01-02","2012-01-02","2012-01-04","2012-01-05","2012-01-03","2012-01-05","2012-01-05","2012-01-01","2012-01-01")
company <- c("A","A","A","A","A","B","B","B","C","C")
var1 <- c(-0.01, -0.013, 0.02, 0.032, -0.002, 0.022, 0.012, 0.031, -0.018, -0.034)
var2 <- c(43, 12, 34, 53, 45, 42, 23, 56, 87, 54)
mydf1 <- data.frame(date, company, var1, var2)
mydf1
# date company var1 var2
# 1 2012-01-01 A -0.010 43
# 2 2012-01-02 A -0.013 12
# 3 2012-01-02 A 0.020 34
# 4 2012-01-04 A 0.032 53
# 5 2012-01-05 A -0.002 45
# 6 2012-01-03 B 0.022 42
# 7 2012-01-05 B 0.012 23
# 8 2012-01-05 B 0.031 56
# 9 2012-01-01 C -0.018 87
# 10 2012-01-01 C -0.034 54
I want to run a regression with the plm package, which doesn't seem to work if there are duplicates. I get this error: duplicate couples (time-id) Error in pdim.default(index[[1]], index[[2]]).
That's why I want to keep only the first row whenever a company has duplicate dates.
The data.frame should look like this:
# date company var1 var2
# 1 2012-01-01 A -0.010 43
# 2 2012-01-02 A -0.013 12
# 3 2012-01-04 A 0.032 53
# 4 2012-01-05 A -0.002 45
# 5 2012-01-03 B 0.022 42
# 6 2012-01-05 B 0.012 23
# 7 2012-01-01 C -0.018 87
How can I do this? Thanks!
Recommended answer
mydf1[!duplicated(mydf1[c("date", "company")]),]
## date company var1 var2
## 1 2012-01-01 A -0.010 43
## 2 2012-01-02 A -0.013 12
## 4 2012-01-04 A 0.032 53
## 5 2012-01-05 A -0.002 45
## 6 2012-01-03 B 0.022 42
## 7 2012-01-05 B 0.012 23
## 9 2012-01-01 C -0.018 87
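To make it clearer why the one-liner above works, here is a minimal base R sketch built on the sample data from the question. duplicated() flags the second and later occurrences of each (date, company) pair, so negating it keeps the first row per pair. The names key, dup, first_rows and last_rows are made up for illustration, and the fromLast = TRUE variant (keep the last row instead of the first) is an optional alternative, not something the question asked for.

```r
# Rebuild the sample data from the question
date <- c("2012-01-01", "2012-01-02", "2012-01-02", "2012-01-04", "2012-01-05",
          "2012-01-03", "2012-01-05", "2012-01-05", "2012-01-01", "2012-01-01")
company <- c("A", "A", "A", "A", "A", "B", "B", "B", "C", "C")
var1 <- c(-0.01, -0.013, 0.02, 0.032, -0.002, 0.022, 0.012, 0.031, -0.018, -0.034)
var2 <- c(43, 12, 34, 53, 45, 42, 23, 56, 87, 54)
mydf1 <- data.frame(date, company, var1, var2)

# The two columns that together form the panel index (illustrative name)
key <- mydf1[c("date", "company")]

# TRUE for every row whose (date, company) pair already appeared earlier
dup <- duplicated(key)
which(dup)  # row positions of the repeats that will be dropped

# Keep the first row of each pair (same result as the answer above)
first_rows <- mydf1[!dup, ]

# Optional variant (an assumption, not requested): keep the last row instead
last_rows <- mydf1[!duplicated(key, fromLast = TRUE), ]

# Sanity check: anyDuplicated() returns 0 when no pair is repeated
anyDuplicated(first_rows[c("date", "company")])

# The original row names (1, 2, 4, ...) are kept; reset them if you want 1..n
rownames(first_rows) <- NULL
```

Resetting the row names is only cosmetic, but it makes the result match the numbering in the desired output shown in the question.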