Pandas: how to plot yearly data on top of each other
Question
I have a series of data indexed by time values (floats), and I want to take chunks of the series and plot them on top of each other. For example, let's say I have stock prices taken about every 10 minutes over a period of 20 weeks, and I want to see the weekly pattern by plotting 20 lines of stock prices. My X axis would span one week, and I would have 20 lines (one for the prices during each week).
Update
The index is not uniformly spaced, and its values are floating-point numbers. It looks something like this:
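import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

t = np.arange(0, 12e-9, 12e-9 / 1000.0)
# Size the noise from t: np.arange with a float step may not yield exactly 1000 points.
noise = np.random.randn(len(t)) / 1e12
cn = noise.cumsum()
t_noise = t + cn  # jittered, non-uniformly spaced time stamps
y = np.sin(2 * np.pi * 36e7 * t_noise) + noise
df = pd.DataFrame(y, index=t_noise, columns=["A"])
df.plot(marker='.')
plt.axis([0, 0.2e-8, 0, 1])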
So the index is not uniformly spaced. I'm dealing with voltage vs. time data from a simulator. I would like to know how to define a time window, T, split df into chunks of length T, and plot them on top of each other. If the data were 20*T long, I would then have 20 lines in the same plot.
Sorry for the confusion; I used the stock analogy thinking it might help.
Recommended answer
Assuming a pandas.TimeSeries object (in current pandas, a Series with a DatetimeIndex) as the starting point, you can group elements by ISO week number and ISO weekday with datetime.date.isocalendar(). The following statement, which ignores the ISO year, aggregates the last sample of each day.
In [95]: daily = ts.groupby(lambda x: x.isocalendar()[1:]).agg(lambda s: s[-1])
In [96]: daily
Out[96]:
key_0
(1, 1) 63
(1, 2) 91
(1, 3) 73
...
(20, 5) 82
(20, 6) 53
(20, 7) 63
Length: 140
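As a self-contained sketch of the grouping step, the snippet below builds a hypothetical stand-in for `ts` (the answer does not show how `ts` was created) and applies the same grouping. The start date, the 10-minute frequency, and the random values are assumptions made only for illustration, and `s.iloc[-1]` is used in place of `s[-1]` because recent pandas versions deprecate positional `[]` access on a label-indexed Series.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `ts`: one sample every 10 minutes for 20 weeks,
# starting on Monday 2024-01-01 (ISO week 1, ISO weekday 1).
rng = np.random.default_rng(0)
index = pd.date_range("2024-01-01", periods=20 * 7 * 24 * 6, freq="10min")
ts = pd.Series(rng.integers(50, 100, size=len(index)), index=index)

# Key every timestamp by (ISO week, ISO weekday), dropping the ISO year,
# then keep the last sample that falls into each (week, weekday) group.
daily = ts.groupby(lambda x: tuple(x.isocalendar()[1:])).agg(lambda s: s.iloc[-1])

print(len(daily))  # one entry per day covered by the data
```

Because the ISO year is dropped, data spanning more than one year would merge the same week number from different years into one group.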
There may be a cleaner way to perform the next step, but the goal is to change the index from an array of tuples to a MultiIndex object.
In [97]: daily.index = pandas.MultiIndex.from_tuples(daily.index, names=['W', 'D'])
In [98]: daily
Out[98]:
W D
1 1 63
2 91
3 73
4 88
5 84
6 95
7 72
...
20 1 81
2 53
3 78
4 64
5 82
6 53
7 63
Length: 140
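The index conversion can be tried in isolation on a tiny example. The four-row `daily` below is a hypothetical miniature, not the answer's data, and `tupleize_cols=False` is used only to force a flat index of tuples so that the conversion has something to do.

```python
import pandas as pd

# Hypothetical miniature of `daily`: a flat index whose labels are
# (week, weekday) tuples rather than a two-level MultiIndex.
keys = [(1, 1), (1, 2), (2, 1), (2, 2)]
daily = pd.Series([63, 91, 66, 77], index=pd.Index(keys, tupleize_cols=False))

# Promote the tuples to a MultiIndex with named levels W (week) and D (weekday).
daily.index = pd.MultiIndex.from_tuples(daily.index, names=["W", "D"])

print(daily.index.names)
print(daily.loc[(2, 1)])  # look up week 2, weekday 1
```

Naming the levels is what lets later steps refer to them as 'W' and 'D' instead of by position.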
The final step is to "unstack" the weekday level from the MultiIndex, creating one column per weekday, and to replace the weekday numbers with abbreviations to improve readability.
In [102]: dofw = "Mon Tue Wed Thu Fri Sat Sun".split()
In [103]: grid = daily.unstack('D').rename(columns=lambda x: dofw[x-1])
In [104]: grid
Out[104]:
Mon Tue Wed Thu Fri Sat Sun
W
1 63 91 73 88 84 95 72
2 66 77 96 72 56 80 66
...
19 56 69 89 69 96 73 80
20 81 53 78 64 82 53 63
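A minimal sketch of the unstack-and-rename step on a hypothetical two-week, three-day `daily` (the values are placeholders). The last statement unstacks the week level instead, which gives the transposed layout, weeks as columns, directly.

```python
import pandas as pd

# Hypothetical miniature of `daily`: two weeks (W) by three weekdays (D).
index = pd.MultiIndex.from_product([[1, 2], [1, 2, 3]], names=["W", "D"])
daily = pd.Series([63, 91, 73, 66, 77, 96], index=index)

dofw = "Mon Tue Wed Thu Fri Sat Sun".split()

# Move the weekday level into the columns, then map 1..7 to Mon..Sun.
grid = daily.unstack("D").rename(columns=lambda x: dofw[x - 1])

# Alternative layout: unstack the week level, so each column is one week.
by_week = daily.unstack("W").rename(index=lambda x: dofw[x - 1])

print(grid)
print(by_week)
```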
To create a line plot for each week, transpose the dataframe so that the columns are week numbers and the rows are weekdays, then call plot. (This step can be avoided by unstacking the week number, instead of the weekday, in the previous step.)
grid.T.plot()
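The answer works on calendar data, while the question's index is a non-uniform float. One possible adaptation, offered as a sketch rather than as the answer author's method, is to label every sample with a window number and its offset inside that window, then draw one line per window. The helpers `fold_by_window` and `plot_overlaid` are hypothetical names, the window length `T` (one period of the question's sine) is an assumed choice, and `np.linspace` replaces `np.arange` so that the sample count is exact.

```python
import numpy as np
import pandas as pd


def fold_by_window(series, T):
    """Label each sample with its window number and its offset inside that window."""
    t = series.index.to_numpy(dtype=float)
    elapsed = t - t.min()
    # divmod gives the window count and the remainder in one consistent step.
    window, offset = np.divmod(elapsed, T)
    return pd.DataFrame(
        {"window": window.astype(int), "offset": offset, "value": series.to_numpy()}
    )


def plot_overlaid(folded, ax=None):
    """Draw one line per window on shared axes (requires matplotlib)."""
    import matplotlib.pyplot as plt

    if ax is None:
        ax = plt.gca()
    for number, chunk in folded.groupby("window"):
        ax.plot(chunk["offset"], chunk["value"], marker=".", label=f"window {number}")
    ax.set_xlabel("time within window")
    ax.legend()
    return ax


# Signal modelled on the question's example: jittered time stamps, 36e7 Hz sine.
rng = np.random.default_rng(0)
t = np.linspace(0, 12e-9, 1000, endpoint=False)
noise = rng.standard_normal(t.size) / 1e12
t_noise = t + noise.cumsum()
series = pd.Series(np.sin(2 * np.pi * 36e7 * t_noise) + noise, index=t_noise)

T = 1 / 36e7  # assumed window length: one period of the sine
folded = fold_by_window(series.sort_index(), T)
```

Calling `plot_overlaid(folded)` then draws the windows on top of each other. Plotting each window separately avoids pivoting to a wide table, which would be mostly NaN because the offsets of non-uniformly spaced samples rarely coincide across windows.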