Plotting Pandas DataFrame from Pivot table


Problem description


I am trying to plot a line graph comparing the Murder Rates of particular States through the years 1960-1962 using Pandas in a Jupyter Notebook.

A little context about where I am now, and how I arrived here:

I'm using a crime csv file, which looks like this:

I'm only interested in 3 columns for the time being: State, Year, and Murder Rate. Specifically, I'm interested in only 5 states: Alaska, Michigan, Minnesota, Maine, and Wisconsin.

So to produce the desired table, I did this (only showing top 5 row entries):

# Keep only the rows for the five states of interest
states = ['Alaska', 'Michigan', 'Minnesota', 'Maine', 'Wisconsin']
al_mi_mn_me_wi = crimes[crimes['State'].isin(states)]
# Keep only the three columns of interest
control_df = al_mi_mn_me_wi[['State', 'Year', 'Murder Rate']]

From here I used the pivot function:

df = control_1960_to_1962.pivot(index='Year', columns='State', values='Murder Rate')

And this is where I get stuck. I get a KeyError (the missing key is 'Year') when I run:

df.plot(x='Year', y='Murder Rate', kind='line')

and when attempting just

df.plot()

I get this wonky graph.

How do I get my desired graph?

Solution

Setup

import numpy as np
import pandas as pd

control_1960_to_1962 = pd.DataFrame({
    'State': np.repeat(['Alaska', 'Maine', 'Michigan', 'Minnesota', 'Wisconsin'], 3),
    'Year': [1960, 1961, 1962]*5,
    'Murder Rate': [10.2, 11.5, 4.5, 1.7, 1.6, 1.4, 4.5, 4.1, 3.4, 1.2, 1.0, .9, 1.3, 1.6, .9]
})

df = control_1960_to_1962.pivot(index='Year', columns='State', values='Murder Rate')
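
It may help to look at what the pivot produced before plotting. The sketch below rebuilds the same setup and inspects the result: after the pivot, 'Year' is the index rather than a column, and each state has its own column, so neither 'Year' nor 'Murder Rate' is a valid column label for `df.plot(x=..., y=...)`. The name `flat` is only an illustrative choice, not something from the question.

```python
import numpy as np
import pandas as pd

control_1960_to_1962 = pd.DataFrame({
    'State': np.repeat(['Alaska', 'Maine', 'Michigan', 'Minnesota', 'Wisconsin'], 3),
    'Year': [1960, 1961, 1962] * 5,
    'Murder Rate': [10.2, 11.5, 4.5, 1.7, 1.6, 1.4, 4.5, 4.1, 3.4,
                    1.2, 1.0, .9, 1.3, 1.6, .9],
})
df = control_1960_to_1962.pivot(index='Year', columns='State', values='Murder Rate')

# After the pivot, the years label the rows and each state is a column.
print(df.index.name)                 # Year
print(list(df.columns))              # the five state names
print('Year' in df.columns)          # False
print('Murder Rate' in df.columns)   # False

# Moving the index back into an ordinary column makes 'Year' a valid
# column label again ('flat' is a hypothetical name for illustration).
flat = df.reset_index()
print('Year' in flat.columns)        # True
```

Because the index is already the x axis that `df.plot()` uses by default, the pivoted frame can be plotted directly without passing `x` or `y`.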

The plots

You can tell Pandas (and through it the matplotlib package that actually does the plotting) what xticks you want explicitly:

ax = df.plot(xticks=df.index)
ylab = ax.set_ylabel('Murder Rate')

Output:

ax is a matplotlib.axes.Axes object, and there are many, many customizations you can make to your plot through it.
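
As a rough illustration of those customizations, here is a minimal sketch that labels the axes, adds a title, and adjusts the legend and grid through the returned Axes object. The title text, marker style, and legend placement are arbitrary choices made for this example, not part of the original answer.

```python
import numpy as np
import pandas as pd

control_1960_to_1962 = pd.DataFrame({
    'State': np.repeat(['Alaska', 'Maine', 'Michigan', 'Minnesota', 'Wisconsin'], 3),
    'Year': [1960, 1961, 1962] * 5,
    'Murder Rate': [10.2, 11.5, 4.5, 1.7, 1.6, 1.4, 4.5, 4.1, 3.4,
                    1.2, 1.0, .9, 1.3, 1.6, .9],
})
df = control_1960_to_1962.pivot(index='Year', columns='State', values='Murder Rate')

# One line per state, with a tick at each year in the index.
ax = df.plot(xticks=df.index, marker='o')

# Everything below goes through the matplotlib Axes that pandas returned.
ax.set_xlabel('Year')
ax.set_ylabel('Murder Rate')
ax.set_title('Murder Rate by State, 1960-1962')  # illustrative title
ax.legend(title='State', loc='upper right')      # illustrative placement
ax.grid(True, alpha=0.3)                         # light background grid
```

In a Jupyter Notebook the figure is displayed when the cell finishes; in a plain script you would typically call `matplotlib.pyplot.show()` or `ax.figure.savefig(...)` afterwards.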

Here's how to plot with the States on the x axis:

ax = df.T.plot(kind='bar')
ylab = ax.set_ylabel('Murder Rate')
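
The transpose works because `df.T` swaps rows and columns, so the states become the index (the x axis) and each year becomes one group of bars. If you prefer not to transpose, a sketch of an equivalent approach is to pivot the other way round from the start. The name `by_state` and the `rot=0` setting (which keeps the state labels horizontal) are illustrative choices, not part of the original answer.

```python
import numpy as np
import pandas as pd

control_1960_to_1962 = pd.DataFrame({
    'State': np.repeat(['Alaska', 'Maine', 'Michigan', 'Minnesota', 'Wisconsin'], 3),
    'Year': [1960, 1961, 1962] * 5,
    'Murder Rate': [10.2, 11.5, 4.5, 1.7, 1.6, 1.4, 4.5, 4.1, 3.4,
                    1.2, 1.0, .9, 1.3, 1.6, .9],
})
df = control_1960_to_1962.pivot(index='Year', columns='State', values='Murder Rate')

# Pivot with the roles swapped: states as rows, years as columns.
# This is intended to hold the same table as df.T.
by_state = control_1960_to_1962.pivot(index='State', columns='Year',
                                      values='Murder Rate')

# One group of bars per state, one bar per year within each group.
ax = by_state.plot(kind='bar', rot=0)
ax.set_ylabel('Murder Rate')
```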

Output:
