Seaborn groupby pandas Series


Question

I want to visualize my data as box plots grouped by another variable, as shown in my rough drawing:

To do that, I use a pandas Series to indicate which variable the data is grouped by. This is what I do:

import pandas as pd
import seaborn as sns
# Example data for reproducibility
a = pd.DataFrame(
[
[2, 1],
[4, 2],
[5, 1],
[10, 2],
[9, 2],
[3, 1]
])

# Converting the second column to a Series
a.ix[:,1] = pd.Series(a.ix[:,1])
# Plotting with seaborn
sns.boxplot(a, groupby=a.ix[:,1])

This is what I get:

However, I expected two box plots, each describing only the first column and grouped by the corresponding value in the second column (the one I converted to a Series). Instead, the plot above shows each column separately, which is not what I want.

Recommended answer

A column in a DataFrame is already a Series, so your conversion is not necessary. Furthermore, if you only want to use the first column for both box plots, you should pass only that column to seaborn.

So:

# Example data for reproducibility
df = pd.DataFrame(
[
[2, 1],
[4, 2],
[5, 1],
[10, 2],
[9, 2],
[3, 1]
], columns=['a', 'b'])

# Plotting with seaborn
sns.boxplot(df.a, groupby=df.b)

I changed your example a little; giving the columns labels makes it a bit clearer, in my opinion.
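In newer seaborn releases, boxplot no longer accepts a groupby argument (to my knowledge this changed in version 0.6); the grouping variable is passed as x and the values as y instead. The following is a minimal sketch assuming a current seaborn and pandas. The helper name plot_grouped_box and the groups dictionary are only there for illustration:

```python
import pandas as pd

# Same example data as above
df = pd.DataFrame(
    [[2, 1], [4, 2], [5, 1], [10, 2], [9, 2], [3, 1]],
    columns=['a', 'b'],
)


def plot_grouped_box(frame):
    """Draw one box of column 'a' per distinct value of column 'b'."""
    # Imported here so the data setup runs even without seaborn installed
    import seaborn as sns
    return sns.boxplot(x='b', y='a', data=frame)


# The values of 'a' that end up in each box, keyed by the value of 'b'
groups = df.groupby('b')['a'].apply(list).to_dict()
```

Calling plot_grouped_box(df) should return the matplotlib Axes holding the two boxes, one for each value of b.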

If you want to plot all columns separately, you (I think) basically want all combinations of the values in your groupby column and every other column. So if your DataFrame looks like this:

    a   b  grouper
0   2   5        1
1   4   9        2
2   5   3        1
3  10   6        2
4   9   7        2
5   3  11        1

And you want box plots for columns a and b, grouped by the column grouper. You should flatten the columns and change the groupby column so that it contains values like a1, a2, b1, etc.

Here is a crude way that I think should work, given the DataFrame shown above:

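# Spread each column over the grouper values (columns become a MultiIndex)
dfpiv = df.pivot(index=df.index, columns='grouper')

# Flatten the MultiIndex into labels such as 'a1', 'a2', 'b1', 'b2'
# (MultiIndex.labels was renamed to MultiIndex.codes in pandas 0.24)
cols_flat = [
    dfpiv.columns.levels[0][i] + str(dfpiv.columns.levels[1][j])
    for i, j in zip(dfpiv.columns.labels[0], dfpiv.columns.labels[1])
]
dfpiv.columns = cols_flat
# Stack into long form; index level 1 now holds the flattened labels
dfpiv = dfpiv.stack(0)

sns.boxplot(dfpiv, groupby=dfpiv.index.get_level_values(1))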

Perhaps there are fancier ways of restructuring the DataFrame. In particular, the flattening of the hierarchy after pivoting is hard to read; I don't like it.
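One possibly tidier way to do the restructuring is DataFrame.melt, which produces a long-form table directly and avoids the manual flattening. This is a sketch assuming a current pandas and seaborn, not the approach used above. The column names variable and value and the helper plot_all_columns are my own choices for illustration:

```python
import pandas as pd

# The three-column example DataFrame shown earlier
df = pd.DataFrame({
    'a': [2, 4, 5, 10, 9, 3],
    'b': [5, 9, 3, 6, 7, 11],
    'grouper': [1, 2, 1, 2, 2, 1],
})

# One row per (original row, column) pair; 'variable' holds 'a' or 'b'
long_df = df.melt(id_vars='grouper', var_name='variable', value_name='value')


def plot_all_columns(frame):
    """One box per combination of column ('a', 'b') and grouper value."""
    # Imported lazily; seaborn is only needed for the plot itself
    import seaborn as sns
    return sns.boxplot(x='variable', y='value', hue='grouper', data=frame)
```

Here hue draws the grouper values side by side within each column, which plays the same role as the a1, a2, b1, b2 labels built by hand.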
