How do you filter pandas dataframes by multiple columns
Problem description
To filter a dataframe (df) by a single column, for example to keep only the males in data that contains both males and females, we might write:
    males = df[df['Gender'] == 'Male']
Question 1 - But what if the data spanned multiple years and I wanted to see only the males for 2014?
In other languages I might do something like:
if A = "Male" and if B = "2014" then
(except I want to do this and get a subset of the original dataframe in a new dataframe object)
Question 2 - How do I do this in a loop, and create a dataframe object for each unique combination of year and gender (i.e. a df for each of 2013-Male, 2013-Female, 2014-Male, and 2014-Female)?
    for y in year:
        for g in gender:
            df = .....
Use the & operator, and don't forget to wrap each sub-statement in ():
    males = df[(df['Gender'] == 'Male') & (df['Year'] == 2014)]
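The parentheses matter because Python's & binds more tightly than ==, so without them the expression is grouped differently from what was intended. Below is a minimal, self-contained sketch of the same filter. The sample data and the Score column are made up for illustration; only Gender and Year come from the question. The DataFrame.query line is an alternative spelling of the same condition, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data; only the Gender and Year columns come from the question.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Male", "Female", "Male"],
    "Year": [2013, 2013, 2014, 2014, 2014],
    "Score": [10, 20, 30, 40, 50],
})

# Each comparison yields a boolean Series; & combines them element-wise.
mask = (df["Gender"] == "Male") & (df["Year"] == 2014)
males_2014 = df[mask]

# The same condition expressed with DataFrame.query.
males_2014_q = df.query("Gender == 'Male' and Year == 2014")

print(males_2014)
```

Both forms return a new dataframe object holding a subset of the original rows, which is what Question 1 asks for.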
To store your dataframes in a dict using a for loop:
    dic = {}
    # The values must match the entries in the Gender column exactly, including case.
    for g in ['Male', 'Female']:
        dic[g] = {}  # a plain dict is enough for the inner level
        for y in [2013, 2014]:
            # store the DataFrames in a dict of dicts, keyed by gender, then year
            dic[g][y] = df[(df['Gender'] == g) & (df['Year'] == y)]
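If the list of years and genders is not known in advance, one possible alternative (not part of the original answer) is to let groupby find the combinations present in the data. The sketch below assumes the same hypothetical sample data as before. It produces a flat dict keyed by (gender, year) tuples rather than a dict of dicts.

```python
import pandas as pd

# Hypothetical sample data; only the Gender and Year columns come from the question.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Male", "Female", "Male"],
    "Year": [2013, 2013, 2014, 2014, 2014],
    "Score": [10, 20, 30, 40, 50],
})

# Iterating over a groupby on two columns yields ((gender, year), sub_frame) pairs,
# one for each combination that actually occurs in the data.
frames = {key: sub for key, sub in df.groupby(["Gender", "Year"])}

males_2014 = frames[("Male", 2014)]
print(males_2014)
```

Combinations that never occur in the data get no entry, whereas the explicit loop stores an empty dataframe for them.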
EDIT:
A demo for your getDF:
    def getDF(dic, gender, year):
        # look up the sub-dataframe stored for this gender and year
        return dic[gender][year]

    print(getDF(dic, 'Male', 2014))