Stacking and shaping slices of DataFrame (pandas) without looping
Problem description
I have a DataFrame of the following form:
           var1         var2      var3  day
0  -0.001284819   0.00138089  1.022781    1
1  -0.001310201  0.001377473  1.022626    1
2  -0.001330947  0.001374873  1.022477    2
3    -0.0013596  0.001430423  1.022385    2
4  -0.001361913   0.00144389   1.02228    3
5  -0.001371761  0.001440607  1.022161    3
6  -0.001393955   0.00143729  1.022017    4
7  -0.001431099   0.00143434  1.021908    4
8  -0.001466792   0.00143334  1.021749    5
9     -0.001491   0.00143159  1.021602    5
Each variable var1 to var3 is a time series and two samples are recorded every day. I'm trying to transform this DataFrame into something like this:
   day              1             2             3             4             5
0  var1  -0.001284819  -0.001330947  -0.001361913  -0.001393955  -0.001466792
1        -0.001310201    -0.0013596  -0.001371761  -0.001431099     -0.001491
2  var2    0.00138089   0.001374873    0.00144389    0.00143729    0.00143334
3         0.001377473   0.001430423   0.001440607    0.00143434    0.00143159
4  var3      1.022781      1.022477       1.02228      1.022017      1.021749
5            1.022626      1.022385      1.022161      1.021908      1.021602
Now each column is a day, and within each day, each variable is still in the same order as in the original time series.
I've been trying to find a way to achieve this using pivoting, stacking, and similar operations, to avoid looping, but so far I haven't been able to manage it.
Any suggestions/hints/ideas on how to solve this would be very much appreciated :)
Recommended answer
It's not necessarily the prettiest, but in the past I've done things like
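import pandas as pd

df = pd.read_csv("vd.csv", sep=r"\s+")
# Long format: one row per (day, variable, value)
d2 = pd.melt(df, id_vars="day")
# Number the samples within each (variable, day) group, in original order
d2["sample"] = d2.groupby(["variable", "day"])["day"].rank(method="first")
d3 = d2.pivot_table(index=["variable", "sample"], columns="day")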
which gives
>>> d3
                     value
day                      1         2         3         4         5
variable sample
var1     1       -0.001285 -0.001331 -0.001362 -0.001394 -0.001467
         2       -0.001310 -0.001360 -0.001372 -0.001431 -0.001491
var2     1        0.001381  0.001375  0.001444  0.001437  0.001433
         2        0.001377  0.001430  0.001441  0.001434  0.001432
var3     1        1.022781  1.022477  1.022280  1.022017  1.021749
         2        1.022626  1.022385  1.022161  1.021908  1.021602
[6 rows x 5 columns]
(Although to be honest, I think Andy's way is slicker. I'll leave this here though because the melt-modify-pivot pattern has proved pretty useful for me in the past in harder cases.)
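For anyone who wants to try the melt-modify-pivot pattern without the vd.csv file, here is a minimal self-contained sketch. It makes a few assumptions that are not in the snippet above: the sample data is pasted inline as a string (`raw`), the names `long_df` and `wide` are just illustrative, and `groupby(...).cumcount() + 1` is swapped in for `rank` as the "modify" step to number the samples within each (variable, day) group. Passing `values="value"` also drops the extra "value" column level, so the result differs slightly in layout from the output shown above.

```python
import io

import pandas as pd

# Inline copy of the question's data (assumption: whitespace-separated, no index column).
raw = """var1 var2 var3 day
-0.001284819 0.00138089 1.022781 1
-0.001310201 0.001377473 1.022626 1
-0.001330947 0.001374873 1.022477 2
-0.0013596 0.001430423 1.022385 2
-0.001361913 0.00144389 1.02228 3
-0.001371761 0.001440607 1.022161 3
-0.001393955 0.00143729 1.022017 4
-0.001431099 0.00143434 1.021908 4
-0.001466792 0.00143334 1.021749 5
-0.001491 0.00143159 1.021602 5
"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Melt: one row per (day, variable, value).
long_df = df.melt(id_vars="day")

# Modify: number the samples within each (variable, day) group in their
# original order. cumcount starts at 0, so add 1 to get 1, 2, ...
long_df["sample"] = long_df.groupby(["variable", "day"]).cumcount() + 1

# Pivot: days become columns, (variable, sample) becomes the row index.
# Each cell holds exactly one value, so the default mean aggregation
# leaves the values unchanged.
wide = long_df.pivot_table(index=["variable", "sample"], columns="day", values="value")

print(wide)
```

The "modify" step is what makes the pivot possible: without the `sample` column, the two readings for each day would share the same (variable, day) key and `pivot_table` would average them together.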