Python: Create dummy variables based on day of week in a double index


Problem description


I have a dataframe with a double index (date, time) and would like to create new columns 'Monday', 'Tuesday', 'Wednesday', etc., each equal to one if the date in the index falls on that day of the week.

My original dataframe:

                       Visitor  
Date       Time                                                              
2017-09-11 4:45           0         
           5:00           1        
           5:15          26       
....
2017-09-12 4:45           0       
           5:00           1         
           5:15          26     
....

What I would like to have:

                       Visitor      Monday    Tuesday
Date       Time                                                              
2017-09-11 4:45           0           1          0
           5:00           1           1          0
           5:15          26           1          0
....
2017-09-12 4:45           0           0          1
           5:00           1           0          1
           5:15          26           0          1
....

Here is what I tried:

df['Monday'] = (df.index.get_level_values(0).weekday() == 0)

However, I get an error saying "'Int64Index' object is not callable".

Thanks in advance!

Solution

You need to remove the () after weekday, because weekday is an attribute of the index, not a method:

df['Monday'] = (df.index.get_level_values(0).weekday == 0).astype(int)

print (df)
                 Visitor  Monday
Date       Time                 
2017-09-11 4:45        0       1
           5:00        1       1
           5:15       26       1
2017-09-12 4:45        0       0
           5:00        1       0
           5:15       26       0
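To make the cause of the error concrete, here is a minimal, self-contained sketch. The sample frame is rebuilt from the values shown in the question, and the variable names dates and error_message are only illustrative. It assumes the first index level holds real datetimes rather than strings.

```python
import pandas as pd

# Rebuild the sample frame from the question (values taken from the post).
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2017-09-11", "2017-09-12"]), ["4:45", "5:00", "5:15"]],
    names=["Date", "Time"],
)
df = pd.DataFrame({"Visitor": [0, 1, 26, 0, 1, 26]}, index=index)

# Level 0 of the MultiIndex comes back as a DatetimeIndex.
dates = df.index.get_level_values(0)

# weekday is a property that already returns an index of integers
# (Monday == 0), so "calling" it tries to call that integer index.
try:
    dates.weekday()
except TypeError as exc:
    error_message = str(exc)  # "... object is not callable"

# Without the parentheses the comparison yields a boolean array,
# which astype(int) turns into 0/1 dummy values.
df["Monday"] = (dates.weekday == 0).astype(int)
print(df)
```

The exact class name in the error text depends on the pandas version, but the fix is the same in each case.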


names = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

for i, x in enumerate(names):
    df[x] = (df.index.get_level_values(0).weekday == i).astype(int)
print (df)
                 Visitor  Monday  Tuesday  Wednesday  Thursday  Friday  \
Date       Time                                                          
2017-09-11 4:45        0       1        0          0         0       0   
           5:00        1       1        0          0         0       0   
           5:15       26       1        0          0         0       0   
2017-09-12 4:45        0       0        1          0         0       0   
           5:00        1       0        1          0         0       0   
           5:15       26       0        1          0         0       0   

                 Saturday  Sunday  
Date       Time                    
2017-09-11 4:45         0       0  
           5:00         0       0  
           5:15         0       0  
2017-09-12 4:45         0       0  
           5:00         0       0  
           5:15         0       0  
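As an alternative to the loop, all seven columns can be built in one step with NumPy broadcasting. This is only a sketch of a different technique, not part of the original answer; the names weekday, dummies and result are illustrative, and the sample frame is rebuilt from the question.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question (values taken from the post).
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2017-09-11", "2017-09-12"]), ["4:45", "5:00", "5:15"]],
    names=["Date", "Time"],
)
df = pd.DataFrame({"Visitor": [0, 1, 26, 0, 1, 26]}, index=index)

names = ["Monday", "Tuesday", "Wednesday", "Thursday",
         "Friday", "Saturday", "Sunday"]

# Weekday number for every row (Monday == 0), as a plain NumPy array.
weekday = df.index.get_level_values(0).weekday.to_numpy()

# Compare each row's weekday against 0..6 at once: the result has one row
# per observation and one column per day of the week.
dummies = pd.DataFrame(
    (weekday[:, None] == np.arange(7)).astype(int),
    index=df.index,
    columns=names,
)

result = df.join(dummies)
print(result)
```

This computes the weekday numbers once instead of seven times, which may matter for large frames; for small ones the loop is just as good and arguably easier to read.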

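Another solution, an improved version of another answer, uses DatetimeIndex.weekday_name with get_dummies, then set_index to restore the original index and, if necessary, reindex to add the missing day names (on pandas 1.0 and later, where weekday_name was removed, call DatetimeIndex.day_name() in its place):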

names = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

df1 = df.join(pd.get_dummies(df.index.get_level_values(0).weekday_name)
                .set_index(df.index).reindex(columns=names, fill_value=0))
print (df1)
                 Visitor  Monday  Tuesday  Wednesday  Thursday  Friday  \
Date       Time                                                          
2017-09-11 4:45        0       1        0          0         0       0   
           5:00        1       1        0          0         0       0   
           5:15       26       1        0          0         0       0   
2017-09-12 4:45        0       0        1          0         0       0   
           5:00        1       0        1          0         0       0   
           5:15       26       0        1          0         0       0   

                 Saturday  Sunday  
Date       Time                    
2017-09-11 4:45         0       0  
           5:00         0       0  
           5:15         0       0  
2017-09-12 4:45         0       0  
           5:00         0       0  
           5:15         0       0  
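On current pandas versions the weekday_name attribute no longer exists, so the get_dummies approach needs day_name(). The following sketch shows that variant on the sample data from the question. The dtype=int argument is an addition here so the dummies come out as 0/1 integers rather than booleans, and the names day_names and dummies are illustrative.

```python
import pandas as pd

# Rebuild the sample frame from the question (values taken from the post).
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2017-09-11", "2017-09-12"]), ["4:45", "5:00", "5:15"]],
    names=["Date", "Time"],
)
df = pd.DataFrame({"Visitor": [0, 1, 26, 0, 1, 26]}, index=index)

names = ["Monday", "Tuesday", "Wednesday", "Thursday",
         "Friday", "Saturday", "Sunday"]

# day_name() is a method (note the parentheses) returning English day names.
day_names = df.index.get_level_values(0).day_name()

dummies = (
    pd.get_dummies(day_names, dtype=int)    # one 0/1 column per day present
    .set_index(df.index)                    # restore the (Date, Time) index
    .reindex(columns=names, fill_value=0)   # add absent days, fix the order
)

df1 = df.join(dummies)
print(df1)
```

The reindex step matters whenever the data does not cover all seven weekdays, as in this two-day sample: without it only the Monday and Tuesday columns would exist.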
