Python Panel Data
Question
I usually use Stata, but I now want to use Python and am desperately trying to create a panel data set. I tried pandas.Panel but could not get it to work. I have the following dataset:
date id1 id2
2000 100 50
2001 101 48
Now I want to make it look like this:
date id variable
2000 1 100
2000 2 101
2001 1 50
2001 2 48
Next, I want to identify a time variable and an id variable so I can run some panel functions. I also tried dataframe.stack(), but that does not sort by id. How do I do this, or am I missing some nice time-series functionality in pandas?
Sorry for the question. I am sure this has been answered somewhere, but I have been trying for several hours and cannot figure it out.
Recommended answer
Given the input data:
data = [
{"date": 2000, "id1": 100, "id2": 50},
{"date": 2001, "id1": 101, "id2": 48}
]
or
data = {
"date": [2000, 2001],
"id1": [100, 101],
"id2": [50, 48],
}
build the DataFrame:
import pandas as pd
df = pd.DataFrame(data)
df
"Melt" the pandas DataFrame from wide to long format:
melted = pd.melt(df, id_vars="date", var_name="id", value_name="variable")
# Optional amendments: strip the "id" prefix, sort by date then id, renumber the rows
melted["id"] = melted["id"].str.replace("id", "")
melted = melted.sort_values(by=["date", "id"]).reset_index(drop=True)
melted
Output:
   date id  variable
0  2000  1       100
1  2000  2        50
2  2001  1       101
3  2001  2        48
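The question also asks how to declare a time and an id variable. pandas has no direct equivalent of Stata's xtset, but one common approach is to put both columns into a sorted two-level index and group by the id level for per-entity operations. The following is a minimal sketch under that assumption; the names `long`, `panel`, and `d_variable` are illustrative and not part of the original answer.

```python
import pandas as pd

# Same toy data as in the answer
df = pd.DataFrame({"date": [2000, 2001], "id1": [100, 101], "id2": [50, 48]})

# Wide -> long, then turn "id1"/"id2" into the integers 1/2
long = pd.melt(df, id_vars="date", var_name="id", value_name="variable")
long["id"] = long["id"].str.replace("id", "", regex=False).astype(int)

# Declare entity and time as a two-level index, sorted by id and then date
panel = long.set_index(["id", "date"]).sort_index()

# Example panel operation: first difference of "variable" within each id
panel["d_variable"] = panel.groupby(level="id")["variable"].diff()
print(panel)
```

Grouping on the index level keeps each id's series separate, so the first observation of every id has a missing difference rather than borrowing a value from the previous id.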
Additional reference: Wickham, H. Tidy Data. Journal of Statistical Software, 59(10), 2014.
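For comparison, the stack() route mentioned in the question can produce the same long layout if the levels are swapped and the index is sorted afterwards. This is only a sketch of one possible chain, not the answer's method; the name `stacked` and the column-renaming lambda are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({"date": [2000, 2001], "id1": [100, 101], "id2": [50, 48]})

stacked = (
    df.set_index("date")
      .rename(columns=lambda c: int(c.replace("id", "")))  # "id1" -> 1, "id2" -> 2
      .rename_axis(columns="id")   # name the column level so stack() labels it "id"
      .stack()                     # Series indexed by (date, id)
      .rename("variable")
      .swaplevel()                 # index becomes (id, date)
      .sort_index()                # sort by id, then date
)
print(stacked)
```

stack() on its own orders rows by the original row index (date) first, which is why the result did not appear sorted by id; the explicit swaplevel() and sort_index() calls address that.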