How to filter CSV data by applying conditions on certain columns in Python
Question
I am new to Python data analysis and am having trouble getting the data I need in a specific format.
My data is in the following format. (Please check the attached link for the data in CSV format, as the data set is quite large.)
I used the following commands to load the CSV data in the above format:
import pandas as pd
address = r'C:\Barchatdata.csv'  # raw string, so the backslash is not read as an escape
data_c = pd.read_csv(address)
Now I want to apply the condition Energy_Supply_per_capita > 280 and then print the index together with the Country_Area, Energy_Supply_per_capita, and Avg_GDP columns.
I tried the following command:
data_c.loc[data_c['Energy_Supply_per_capita'] > 280, 'Energy_Supply_per_capita']
but I got only the index and the Energy_Supply_per_capita column.
How can I get the required result?
Thanks.
Recommended answer
You can use query:
cols = ['Country_Area', 'Energy_Supply_per_capita', 'Avg_GDP']
data_c.query('Energy_Supply_per_capita > 280')[cols]
Or, equivalently, with a boolean Series and loc:
cols = ['Country_Area', 'Energy_Supply_per_capita', 'Avg_GDP']
data_c.loc[data_c.Energy_Supply_per_capita > 280, cols]
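To try both approaches without the original file, here is a minimal sketch that runs them on a small in-memory table. The sample rows are made up for illustration and only stand in for Barchatdata.csv; the column names are the ones used above. It also shows why the attempt in the question returned a single column: the second argument to loc selects columns, so passing one label yields one column, while passing a list of labels yields all of them.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for Barchatdata.csv (values are made up).
csv_text = """Country_Area,Energy_Supply_per_capita,Avg_GDP
A,300,1000.0
B,120,500.0
C,450,2500.0
D,280,800.0
"""
data_c = pd.read_csv(io.StringIO(csv_text))

cols = ['Country_Area', 'Energy_Supply_per_capita', 'Avg_GDP']

# Approach 1: query() filters the rows, then [cols] selects the columns.
via_query = data_c.query('Energy_Supply_per_capita > 280')[cols]

# Approach 2: loc takes a boolean row mask and a list of column labels.
mask = data_c['Energy_Supply_per_capita'] > 280
via_loc = data_c.loc[mask, cols]

# The original attempt: a single label as the column argument returns one Series.
single_column = data_c.loc[mask, 'Energy_Supply_per_capita']

print(via_loc)
```

Both results keep the original row index, so it is printed alongside the three selected columns. Note that the comparison is strict, so a row with a value of exactly 280 is excluded.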