How to find balanced panel data in R (aka, how to find which entries in a panel are complete over a given window)

Question

I have a big panel of data from Compustat. To it I am adding some hand-collected data (seriously, hand-collected from a stack of old books). But I don't want to hand-collect for the entire panel, only for a randomly selected subset. To find the larger set (from which I'm randomly selecting), I would like to start with the balanced panel from Compustat.

I see the plm library for working with unbalanced panels, but I would like to keep it balanced. Is there a clean way to do this, short of searching for and throwing out firms (individuals, in panel-speak) that don't run the full sample period? Thanks!

Recommended answer

On second thought, there is a much easier way to do this. Take a look:

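# Keep only the subjects that have exactly the expected number of observations.
data.with.only.complete.subjects.data <- function(xx, subject.column, number.of.observation.a.subject.should.have)
{
    subjects <- xx[[subject.column]]
    # Count the rows available for each subject
    num.of.observations.per.subject <- table(subjects)
    # Names of the subjects that have a full set of observations
    subjects.to.keep <- names(num.of.observations.per.subject)[num.of.observations.per.subject == number.of.observation.a.subject.should.have]

    # Logical index of the rows that belong to complete subjects
    subset.by.me <- subjects %in% subjects.to.keep

    new.xx <- xx[subset.by.me, , drop = FALSE]

    return(new.xx)
}

# Example: 4 subjects with 3 observations each
xx <- data.frame(subject = rep(1:4, each = 3),
                 observation.per.subject = rep(1:3, 4))
# Drop rows 2 and 5, so subjects 1 and 2 become incomplete
xx.mis <- xx[-c(2, 5), ]

# Returns only the rows for subjects 3 and 4
data.with.only.complete.subjects.data(xx.mis, 1, 3)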
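The function above counts rows per subject, so it assumes the data are already restricted to the sample window and contain no duplicated subject-period rows. Since the question asks about completeness over a given window, here is a minimal base-R sketch of a window-aware variant. The helper name `balanced_in_window`, the column names `gvkey` and `fyear`, and the toy data are assumptions made for illustration and are not part of the original answer.

```r
# Hypothetical helper: keep only the firms observed in every period of a window.
# The column names and the example data below are illustrative assumptions.
balanced_in_window <- function(df, id_col, time_col, window) {
  # Restrict to rows whose period falls inside the window
  in_window <- df[df[[time_col]] %in% window, , drop = FALSE]
  # Drop duplicated firm-period rows so that each period counts once
  keys <- unique(in_window[, c(id_col, time_col), drop = FALSE])
  # Number of distinct window periods observed per firm
  periods_per_id <- table(keys[[id_col]])
  complete_ids <- names(periods_per_id)[periods_per_id == length(unique(window))]
  in_window[as.character(in_window[[id_col]]) %in% complete_ids, , drop = FALSE]
}

panel <- data.frame(
  gvkey = rep(c("A", "B", "C"), each = 4),
  fyear = rep(2000:2003, times = 3),
  stringsAsFactors = FALSE
)
# Firm B is missing 2001; firm C is missing 2003, which lies outside the window
panel <- panel[!(panel$gvkey == "B" & panel$fyear == 2001), ]
panel <- panel[!(panel$gvkey == "C" & panel$fyear == 2003), ]

balanced <- balanced_in_window(panel, "gvkey", "fyear", 2000:2002)
unique(balanced$gvkey)
```

In this toy panel, firm B is dropped because it lacks 2001, while firm C is kept because its only gap falls outside the 2000 to 2002 window. Counting distinct periods rather than raw rows is a design choice that guards against duplicated firm-year records inflating the count.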
