Plotting results of Pandas GroupBy

Views: 169 | Published: 2018/5/30 13:37:28 | Tags: python, matplotlib, group-by, pandas, data-analysis


Question


I'm starting to learn Pandas and am trying to find the most Pythonic (or panda-thonic?) ways to do certain tasks.

Suppose we have a DataFrame with columns A, B, and C.

Column A contains boolean values: each row's A value is either true or false.

Column B has some important values we want to plot.

What we want to discover is the subtle distinctions between B values for rows where A is false and B values for rows where A is true.

In other words, how can I group by the value of column A (either true or false), then plot the values of column B for both groups on the same graph? The two datasets should be colored differently to be able to distinguish the points.

Next, let's add another feature to this program: before graphing, we want to compute another value for each row and store it in column D. This value is the mean of all data stored in B for the entire five minutes before a record - but we only include rows that have the same boolean value stored in A.

In other words, if I have a row where A=True and time=t, I want to compute a value for column D that is the mean of B for all records from time t-5 to t that have the same A=True.

In this case, how can we execute the groupby on values of A, then apply this computation to each individual group, and finally plot the D values for the two groups?
Answer
I think @herrfz hit all the high points. I'll just flesh out the details:
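    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    N = 100
    x = np.linspace(0, np.pi, N)
    a = np.sin(x)
    b = np.cos(x)

    # A is True for the first N rows (sine values) and False for the rest (cosine values).
    df = pd.DataFrame({
        'A': [True] * N + [False] * N,
        'B': np.hstack((a, b)),
    })

    # Group on the column name (not a one-element list) so each key is a plain bool.
    for key, grp in df.groupby('A'):
        plt.plot(grp['B'], label=str(key))
        # pd.rolling_mean has been removed from pandas; Series.rolling(...).mean() replaces it.
        # assign() returns a new frame, which avoids writing into the group slice.
        grp = grp.assign(D=grp['B'].rolling(window=5).mean())
        plt.plot(grp['D'], label='rolling ({k})'.format(k=key))

    plt.legend(loc='best')
    plt.show()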
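
Note that `window=5` in the code above counts rows, not minutes. If the records carry a timestamp column (called `time` here, a hypothetical name, since the question does not show one), a time-based window may be closer to the "previous five minutes" the question asks for. A minimal sketch on made-up data, assuming the rows are sorted by time:

```python
import numpy as np
import pandas as pd

# Made-up data: one record per minute, with A alternating between True and False.
N = 12
df = pd.DataFrame({
    'time': pd.date_range('2018-01-01 00:00', periods=N, freq='min'),  # assumed column name
    'A': [True, False] * (N // 2),
    'B': np.arange(N, dtype=float),
})

# For each value of A, average B over the 5 minutes ending at each record.
# With an offset window the default interval is (t - 5min, t]: the current row
# is included and a row exactly 5 minutes earlier is not.
per_group = [
    grp.rolling('5min', on='time')['B'].mean()
    for _, grp in df.groupby('A')
]

# Each piece keeps the original row labels, so the assignment aligns on the index.
df['D'] = pd.concat(per_group)

print(df)
```

Column `D` can then be plotted per group with the same `groupby` loop as in the answer, using `grp['D']` directly.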

