What's the most efficient way to convert time-series data into cross-sectional data?

Problem description

I have the dataset below, where date is the index:

date            value
2020-01-01      100
2020-02-01      140
2020-03-01      156
2020-04-01      161
2020-05-01      170
.
.
.

I want to transform it into this other dataset:

value_t0    value_t1    value_t2    value_t3    value_t4 ...
100         NaN         NaN         NaN         NaN      ...
140         100         NaN         NaN         NaN      ...
156         140         100         NaN         NaN      ...
161         156         140         100         NaN      ...
170         161         156         140         100      ...

First I thought about using pandas.pivot_table, but that would just provide a different layout grouped by some column, which is not what I want. Then I thought about using pandasql with 'case when' expressions, but that wouldn't be practical because I would have to type dozens of lines of code. So I'm stuck.

Recommended answer

Try this:

new_df = pd.DataFrame({f"value_t{i}": df['value'].shift(i) for i in range(len(df))})

The Series .shift(n) method gives you a single column of the desired output: it shifts every value down by n rows and fills the vacated rows at the top with NaN. The dictionary comprehension runs i over range(len(df)) and produces a dictionary of the form {column name: column data, ...}, with one shifted copy of the value column per lag, which pd.DataFrame then assembles into the new dataframe.
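For reference, here is a minimal, self-contained sketch of the same idea run on the five sample rows from the question. The helper name lag_matrix and its max_lag parameter are hypothetical additions for illustration, not part of the original answer. Capping the number of lags may be worth considering on long series, since one column per row makes the result grow quadratically.

```python
import pandas as pd

# Sample data mirroring the question; `date` is the index.
df = pd.DataFrame(
    {"value": [100, 140, 156, 161, 170]},
    index=pd.to_datetime(
        ["2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01", "2020-05-01"]
    ),
)
df.index.name = "date"


def lag_matrix(series, max_lag=None):
    """Build a DataFrame whose column `<name>_t{i}` is `series` shifted down by i rows.

    `max_lag` is a hypothetical convenience parameter: when omitted, every
    possible lag (0 .. len(series) - 1) is produced, as in the answer above.
    """
    if max_lag is None:
        max_lag = len(series) - 1
    return pd.DataFrame(
        {f"{series.name}_t{i}": series.shift(i) for i in range(max_lag + 1)}
    )


new_df = lag_matrix(df["value"])
print(new_df)
```

The result keeps the original date index, so each row still identifies the period it belongs to. Calling lag_matrix(df["value"], max_lag=2) would keep only value_t0, value_t1 and value_t2.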
