How to filter a crosstab created in pandas by a specific column

Views: 205 | Published: 2018/5/30 14:10:35 | Tags: python, pandas, group-by, crosstab


Problem description

I have created a cross tabulation in pandas using:

    grouped_missing_analysis = pd.crosstab(clean_sessions.action_type, clean_sessions.action, margins=True).unstack()
    print(grouped_missing_analysis[:20])

Which leads to displaying:

    action  action_type
    10      Missing                0
            Unknown                0
            booking_request        0
            booking_response       0
            click                  0
            data                   0
            message_post        3215
            modify                 0
            partner_callback       0
            submit                 0
            view                   0
            All                 3215
    11      Missing                0
            Unknown                0
            booking_request        0
            booking_response       0
            click                  0
            data                   0
            message_post         716
            modify                 0
    dtype: int64

I want to only show the action_type which is either 'Unknown', 'Missing' or 'Other', and ignore other action_type for each action. I have a feeling the answer is to do with:

    .where(clean_sessions.action_type.isin(('Missing', 'Unknown')), 'Other')

From a previous snippet I have, but I can't get it to work. Maybe pivot_table is easier, this exercise is just for me to learn about how to do data analysis in python with the different functions.

Raw data for clean_sessions looks like:

          user_id          action action_type            action_detail  \
    0  d1mm9tcy42          lookup     Missing                  Missing
    1  d1mm9tcy42  search_results       click      view_search_results
    2  d1mm9tcy42          lookup     Missing                  Missing
    3  d1mm9tcy42  search_results       click      view_search_results
    4  d1mm9tcy42          lookup     Missing                  Missing
    5  d1mm9tcy42  search_results       click      view_search_results
    6  d1mm9tcy42          lookup     Missing                  Missing
    7  d1mm9tcy42     personalize        data  wishlist_content_update
    8  d1mm9tcy42           index        view      view_search_results
    9  d1mm9tcy42          lookup     Missing                  Missing

           device_type  secs_elapsed
    0  Windows Desktop           319
    1  Windows Desktop         67753
    2  Windows Desktop           301
    3  Windows Desktop         22141
    4  Windows Desktop           435
    5  Windows Desktop          7703
    6  Windows Desktop           115
    7  Windows Desktop           831
    8  Windows Desktop         20842
    9  Windows Desktop           683

Solution
Those are your indices, not columns, so you need to pass labels to select the rows of interest.
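
To check this for yourself, here is a minimal sketch. It assumes a small `clean_sessions` frame rebuilt from the ten sample rows in the question, keeping only the `action` and `action_type` columns; the variable name `grouped` is just for illustration. It inspects what `unstack()` returns:

```python
import pandas as pd

# Assumption: only the two columns used by the crosstab, copied from the
# ten sample rows shown in the question.
clean_sessions = pd.DataFrame({
    "action": ["lookup", "search_results", "lookup", "search_results", "lookup",
               "search_results", "lookup", "personalize", "index", "lookup"],
    "action_type": ["Missing", "click", "Missing", "click", "Missing",
                    "click", "Missing", "data", "view", "Missing"],
})

grouped = pd.crosstab(
    clean_sessions.action_type, clean_sessions.action, margins=True
).unstack()

# unstack() on the crosstab DataFrame gives a Series whose index has two
# levels: the former column labels (action) and the former row labels
# (action_type).
print(type(grouped).__name__)
print(list(grouped.index.names))
print(grouped.index.nlevels)
```

Because `action_type` lives in the index, filters have to target index labels rather than a column.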

You can pass slice(None) for the first level and then a list for the second level:

    In [102]: grouped_missing_analysis.loc[slice(None), ['Missing', 'Unknown', 'Other']]
    Out[102]:
    action          action_type
    index           Missing        0
    lookup          Missing        5
    personalize     Missing        0
    search_results  Missing        0
    All             Missing        5
    dtype: int64

The docs give more detail on this style of indexing.
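
As a hedged alternative sketch (not the answer's own method), the same selection can be written as a boolean mask over the `action_type` index level. This does not depend on every requested label being present, which matters here because no 'Other' label exists in the sample. The second half shows one possible way to get an actual 'Other' bucket, by recoding `action_type` with `Series.where` before building the crosstab, along the lines the question hints at. The sample frame and the names `wanted`, `filtered`, `recoded` and `by_bucket` are assumptions made for illustration:

```python
import pandas as pd

# Assumption: two columns copied from the ten sample rows in the question.
clean_sessions = pd.DataFrame({
    "action": ["lookup", "search_results", "lookup", "search_results", "lookup",
               "search_results", "lookup", "personalize", "index", "lookup"],
    "action_type": ["Missing", "click", "Missing", "click", "Missing",
                    "click", "Missing", "data", "view", "Missing"],
})

grouped = pd.crosstab(
    clean_sessions.action_type, clean_sessions.action, margins=True
).unstack()

# 1) Keep only the rows whose action_type level is one of the wanted labels.
#    Labels absent from the data (here 'Unknown' and 'Other') match nothing.
wanted = ["Missing", "Unknown", "Other"]
mask = grouped.index.get_level_values("action_type").isin(wanted)
filtered = grouped[mask]
print(filtered)

# 2) Recode every other action_type to 'Other' first, then cross-tabulate.
#    Series.where keeps values where the condition holds and replaces the rest.
recoded = clean_sessions.action_type.where(
    clean_sessions.action_type.isin(["Missing", "Unknown"]), "Other"
)
by_bucket = pd.crosstab(recoded, clean_sessions.action, margins=True).unstack()
print(by_bucket)
```

The first form only hides rows, while the second changes the counts by pooling the remaining types, so which one fits depends on whether 'Other' should be a real category in the table.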
