How to create histograms in pandas (Python) using specific rows and columns of a data frame

Question

I have the data frame shown in the picture, and I want to plot a histogram showing the distribution across all countries in the world for any given year (e.g. 2010).

The table is what I get after running the following cleaning code:

import pandas as pd

# Recent pandas versions spell this keyword sheet_name (older releases used sheetname)
dataSheet = pd.read_excel("http://api.worldbank.org/v2/en/indicator/EN.ATM.CO2E.PC?downloadformat=excel", sheet_name="Data")
dataSheet = dataSheet.transpose()
dataSheet = dataSheet.drop(dataSheet.columns[[0,1]], axis=1)
dataSheet = dataSheet.drop(['World Development Indicators', 'Unnamed: 2','Unnamed: 3'])

Recommended answer

To plot a histogram of all countries for any given year (e.g. 2010), I would do the following. After your code:

dataSheet = pd.read_excel("http://api.worldbank.org/v2/en/indicator/EN.ATM.CO2E.PC?downloadformat=excel", sheet_name="Data")
dataSheet = dataSheet.transpose()
dataSheet = dataSheet.drop(dataSheet.columns[[0,1]], axis=1)
dataSheet = dataSheet.drop(['World Development Indicators', 'Unnamed: 2','Unnamed: 3'])

I would reorganise the column names by assigning the actual country names as column names:

dataSheet.columns = dataSheet.iloc[1] # here I'm assigning the column names
dataSheet = dataSheet.reindex(dataSheet.index.drop('Data Source')) # here I'm re-indexing and getting rid of the duplicate row
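
The "promote a row to column names, then drop that row" step can be tried in isolation on a small frame. This is only a sketch: the toy data and the names `raw` and `tidy` are made up for illustration, and it selects the header row by label with `.loc` instead of by position with `.iloc[1]` as above.

```python
import pandas as pd

# Hypothetical toy data standing in for the transposed sheet:
# the row labelled "Data Source" holds the real column names.
raw = pd.DataFrame(
    [["Aruba", "Brazil", "Chad"], [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    index=["Data Source", 2009, 2010],
)

# Use the values of the header row as the column labels ...
raw.columns = raw.loc["Data Source"]
# ... then drop that row so only data rows remain.
tidy = raw.drop(index="Data Source")
tidy.columns.name = None  # clear the leftover "Data Source" axis name

print(tidy)
```

Selecting the row by label makes it harder to pick the wrong row by accident if the sheet layout shifts.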

Then I would transpose the data frame again (to be safe, I'm assigning it to a new variable):

df = dataSheet.transpose()

Then I'd assign new column names the same way as before, so we get a decent data frame (although still not optimal) with country names as the index.

df.columns = df.iloc[0]
df = df.reindex(df.index.drop('Country Name'))
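
One thing worth checking at this point: transposing a frame that mixes text and numbers generally leaves every column with `object` dtype, so it may help to convert the year columns to numbers before plotting. A minimal sketch, assuming a hypothetical stand-in frame `df_demo` (not the real World Bank data) and the placeholder string "n/a" for a missing value:

```python
import pandas as pd

# Hypothetical stand-in for df: countries as index, years as columns,
# values stored as generic Python objects (some numbers arrive as text).
df_demo = pd.DataFrame(
    {2009: ["1.5", 2.0, 3.0], 2010: [1.7, "n/a", 3.1]},
    index=["Aruba", "Brazil", "Chad"],
    dtype=object,
)

# Convert every column to a numeric dtype; anything that cannot be
# parsed as a number becomes NaN instead of raising an error.
numeric = df_demo.apply(pd.to_numeric, errors="coerce")

print(numeric.dtypes)
print(numeric)
```

With `errors="coerce"`, unparseable entries silently become NaN, so it is worth counting them afterwards (for example with `numeric.isna().sum()`).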

Now you can finally plot the data for e.g. the year 2010 (note that kind='bar' draws one bar per country rather than a binned histogram):

import matplotlib.pyplot as plt
# One bar per country for the 2010 column
df[2010].plot(kind='bar', figsize=[30,10])
plt.show()  # display the figure when running as a script
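
If the goal is a histogram in the strict sense (how many countries fall into each value range), pandas also accepts `kind='hist'`. The following is a sketch on made-up numbers; `values_2010`, the bin count, and the axis labels are assumptions for illustration, and the `Agg` backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical per-country values for one year (not real World Bank data)
values_2010 = pd.Series(
    [0.1, 0.4, 0.9, 1.8, 2.5, 4.7, 6.2, 9.8, 15.3, 17.6], name=2010
)

# kind="hist" bins the values and counts how many fall into each bin
ax = values_2010.plot(kind="hist", bins=5, figsize=[8, 4])
ax.set_xlabel("CO2 emissions (metric tons per capita)")  # assumed label
ax.set_ylabel("Number of countries")

# In an interactive session, call plt.show() to display the figure.
```

With the real data, the equivalent call would presumably be `df[2010].plot(kind='hist')`, after dropping or coercing any non-numeric entries.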
