Plotting Pandas DataFrame from Pivot table
Problem description
I am trying to plot a line graph comparing the Murder Rates of particular States through the years 1960-1962 using Pandas in a Jupyter Notebook.
A little context about where I am now, and how I arrived here:
I'm using a crime csv file, which looks like this:
I'm only interested in 3 columns for the time being: State, Year, and Murder Rate. Specifically, I'm only interested in 5 states: Alaska, Michigan, Minnesota, Maine, and Wisconsin.
So to produce the desired table, I did this (only showing top 5 row entries):
al_mi_mn_me_wi = crimes[(crimes['State'] == 'Alaska') | (crimes['State'] =='Michigan') | (crimes['State'] =='Minnesota') | (crimes['State'] =='Maine') | (crimes['State'] =='Wisconsin')]
control_df = al_mi_mn_me_wi[['State', 'Year', 'Murder Rate']]
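As an aside, the chain of == comparisons above can be written more compactly with Series.isin. The following is only a sketch: the crimes frame here is a hypothetical stand-in for the CSV (which is not reproduced in this post), and the Ohio and Texas rows hold placeholder values, not real statistics.

```python
import pandas as pd

# Hypothetical stand-in for the crimes CSV; Ohio and Texas values are
# placeholders for illustration only, not real statistics.
crimes = pd.DataFrame({
    'State': ['Alaska', 'Ohio', 'Maine', 'Texas', 'Wisconsin'],
    'Year': [1960, 1960, 1960, 1960, 1960],
    'Murder Rate': [10.2, 0.0, 1.7, 0.0, 1.3],
})

states = ['Alaska', 'Michigan', 'Minnesota', 'Maine', 'Wisconsin']

# isin() builds the same boolean mask as the chained == / | expression,
# and .loc selects the rows and the three columns in one step.
control_df = crimes.loc[crimes['State'].isin(states),
                        ['State', 'Year', 'Murder Rate']]
print(control_df)
```

The result should match what the two-step filter produces; the only difference is that the list of states lives in one place.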
From here I used the pivot function
df = control_1960_to_1962.pivot(index='Year', columns='State', values='Murder Rate')
And this is where I get stuck. I get a KeyError (on 'Year') when I run:
df.plot(x='Year', y='Murder Rate', kind='line')
and when attempting just
df.plot()
I get this wonky graph.
How do I get my desired graph?
Setup
import numpy as np
import pandas as pd
control_1960_to_1962 = pd.DataFrame({
'State': np.repeat(['Alaska', 'Maine', 'Michigan', 'Minnesota', 'Wisconsin'], 3),
'Year': [1960, 1961, 1962]*5,
'Murder Rate': [10.2, 11.5, 4.5, 1.7, 1.6, 1.4, 4.5, 4.1, 3.4, 1.2, 1.0, .9, 1.3, 1.6, .9]
})
df = control_1960_to_1962.pivot(index='Year', columns='State', values='Murder Rate')
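Before plotting, it can help to look at what pivot actually returned, since that explains the KeyError from df.plot(x='Year', y='Murder Rate'). A small sketch using the same setup data: after the pivot, Year is the index rather than a column, and the columns are the state names, so neither 'Year' nor 'Murder Rate' exists as a column label.

```python
import numpy as np
import pandas as pd

# Same setup data as above, repeated so this snippet runs on its own.
control_1960_to_1962 = pd.DataFrame({
    'State': np.repeat(['Alaska', 'Maine', 'Michigan', 'Minnesota', 'Wisconsin'], 3),
    'Year': [1960, 1961, 1962] * 5,
    'Murder Rate': [10.2, 11.5, 4.5, 1.7, 1.6, 1.4, 4.5, 4.1, 3.4,
                    1.2, 1.0, .9, 1.3, 1.6, .9],
})
df = control_1960_to_1962.pivot(index='Year', columns='State', values='Murder Rate')

# Year is now the row index, not a column.
print(df.index.name)
# The columns are the five states; 'Murder Rate' supplied the cell values.
print(list(df.columns))
# Both lookups that df.plot(x=..., y=...) would attempt fail:
print('Year' in df.columns, 'Murder Rate' in df.columns)
```

Because the index already holds the years and each column is one state, a plain df.plot() draws one line per state against Year without any x or y arguments.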
The plots
You can tell Pandas (and through it the matplotlib package that actually does the plotting) what xticks you want explicitly:
ax = df.plot(xticks=df.index)
ylab = ax.set_ylabel('Murder Rate')
Output:
ax is a matplotlib.axes.Axes object, and there are many, many customizations you can make to your plot through it.
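As a rough illustration of the kind of customization the Axes object allows, here is a sketch that adds axis labels, a title, a legend title, and a grid. The title text and the marker style are arbitrary choices for this example, and the Agg backend line is only there so the snippet also runs outside a notebook.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend for scripts; omit this line in Jupyter
import numpy as np
import pandas as pd

# Same setup data as above, repeated so this snippet runs on its own.
control_1960_to_1962 = pd.DataFrame({
    'State': np.repeat(['Alaska', 'Maine', 'Michigan', 'Minnesota', 'Wisconsin'], 3),
    'Year': [1960, 1961, 1962] * 5,
    'Murder Rate': [10.2, 11.5, 4.5, 1.7, 1.6, 1.4, 4.5, 4.1, 3.4,
                    1.2, 1.0, .9, 1.3, 1.6, .9],
})
df = control_1960_to_1962.pivot(index='Year', columns='State', values='Murder Rate')

# One line per state, with a tick at each year in the index.
ax = df.plot(xticks=df.index, marker='o')

# Everything below goes through the matplotlib Axes returned by df.plot().
ax.set_xlabel('Year')
ax.set_ylabel('Murder Rate')
ax.set_title('Murder Rate by State, 1960-1962')  # illustrative title
ax.legend(title='State')
ax.grid(True)
```

In a notebook the figure renders after the cell runs; in a script, ax.figure.savefig(...) with a filename of your choice would write it to disk.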
Here's how to plot with the States on the x axis:
ax = df.T.plot(kind='bar')
ylab = ax.set_ylabel('Murder Rate')
Output: