Converting a pandas crosstab into a stacked dataframe (a regular table)
Problem description
Given a pandas crosstab, how do you convert it into a stacked dataframe?
Assume you have a stacked dataframe. First, we convert it into a crosstab. Now I would like to revert to the original stacked dataframe. I searched for a question that addresses this requirement but could not find one that matches exactly. If I have missed any, please leave a note in the comment section.
I would like to document the best practice here, so thank you for your support.
I know that pandas.DataFrame.stack() would be the best approach, but one needs to be careful about the level that stacking is applied to.
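To make the remark about levels concrete, here is a minimal sketch on a tiny made-up table (the data, the extra `Group` column, and the variable names are illustrative assumptions, not part of the question). With single-level columns, `stack()` moves the column labels into a new innermost index level. When the columns have more than one level, the `level` argument selects which one is moved.

```python
import pandas as pd

# Hypothetical toy data, only for illustration.
toy = pd.DataFrame({
    "ID": [1, 2, 2],
    "Label": ["b", "a", "b"],
    "Group": ["x", "x", "y"],
})

# Single-level columns: stack() moves Label into the index.
ct = pd.crosstab(toy["ID"], toy["Label"])
stacked = ct.stack()
print(list(stacked.index.names))  # ['ID', 'Label']

# Two column levels (Group, Label): `level` picks which one is moved.
ct2 = pd.crosstab(toy["ID"], [toy["Group"], toy["Label"]])
by_label = ct2.stack(level="Label")
print(list(by_label.index.names))  # ['ID', 'Label']
```

In the second case the `Group` level stays in the columns, so stacking the wrong level would not give back plain (ID, Label) pairs.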
Input: crosstab:
Label a b c d r
ID
1 0 1 0 0 0
2 1 1 0 1 1
3 1 0 0 0 1
4 1 0 0 1 0
6 1 0 0 0 0
7 0 0 1 0 0
8 1 0 1 0 0
9 0 1 0 0 0
Output: stacked DataFrame:
ID Label
0 1 b
1 2 a
2 2 b
3 2 d
4 2 r
5 3 a
6 3 r
7 4 a
8 4 d
9 6 a
10 7 c
11 8 a
12 8 c
13 9 b
Step-by-step explanation:
First, let's write a function that creates our data. Note that it generates the stacked dataframe randomly, so the final output may differ from the example given above.
Helper function: make the stacked and crosstab dataframes
import numpy as np
import pandas as pd

# Make stacked dataframe
def _create_df():
    """
    Randomly generate the stacked dataframe used to build the crosstab.
    """
    B = np.array(list('abracadabra'))
    A = np.arange(len(B))
    # Draw 20 random (ID position, Label position) pairs
    AB = list()
    for i in range(20):
        a = np.random.randint(1,10)
        b = np.random.randint(1,10)
        AB += [(a,b)]
    # Drop duplicate position pairs
    AB = np.unique(np.array(AB), axis=0)
    # Map positions to IDs and letters, then drop duplicate (ID, Label) pairs.
    # Mixing ints and letters in one array stores the IDs as strings.
    AB = np.unique(np.array(list(zip(A[AB[:,0]], B[AB[:,1]]))), axis=0)
    AB_df = pd.DataFrame({'ID': AB[:,0], 'Label': AB[:,1]})
    return AB_df

original_stacked_df = _create_df()

# Make crosstab (the trailing .reindex() with no arguments was a no-op)
crosstab_df = pd.crosstab(original_stacked_df['ID'],
                          original_stacked_df['Label'])
What to expect?
You would expect a function that regenerates the stacked dataframe from the crosstab. I will provide my own solution in the answer section. If you can suggest something better, that would be great.
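One way to judge a candidate solution is a round-trip check: build the crosstab from a stacked frame, convert it back, and compare the (ID, Label) pairs. The sketch below is only an illustration. `unstack_crosstab` and `roundtrips` are hypothetical names, the candidate shown is a placeholder for whatever solution is being tested, and the check assumes each (ID, Label) pair occurs at most once.

```python
import pandas as pd

def unstack_crosstab(ct):
    """Hypothetical candidate: keep the (ID, Label) pairs with a nonzero count."""
    counts = ct.stack()
    return counts[counts > 0].index.to_frame(index=False)

def roundtrips(stacked_df, convert):
    """Return True if convert(crosstab) reproduces stacked_df, ignoring row order."""
    ct = pd.crosstab(stacked_df["ID"], stacked_df["Label"])
    restored = convert(ct)

    def pairs(df):
        # Sorted list of (ID, Label) tuples, independent of row order and dtype.
        return sorted(df[["ID", "Label"]].itertuples(index=False, name=None))

    return pairs(restored) == pairs(stacked_df)

sample = pd.DataFrame({"ID": [1, 2, 2, 3], "Label": ["b", "a", "b", "a"]})
print(roundtrips(sample, unstack_crosstab))  # True
```

Comparing sorted tuples rather than whole frames keeps the check insensitive to row order and to index or dtype differences introduced along the way.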
- Closest Stack Overflow discussion: pandas stacking a dataframe
- Misleading Stack Overflow question topic: change pandas crossstab dataframe into plain table format:
Recommended answer
You can do this with stack (here df is the crosstab):
df[df.astype(bool)].stack().dropna().reset_index().drop(columns=0)
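To see why this works, the one-liner can be unrolled on the crosstab from the question. The sketch below rebuilds that table by hand (the variable names are my own). It uses `drop(columns=0)` because the positional `axis` argument of `drop` is no longer accepted in pandas 2.x, and the explicit `dropna()` guards against newer pandas releases where, as far as I know, `stack()` may keep NaN rows rather than drop them.

```python
import pandas as pd

# The crosstab from the question, rebuilt by hand.
crosstab_df = pd.DataFrame(
    [[0, 1, 0, 0, 0],
     [1, 1, 0, 1, 1],
     [1, 0, 0, 0, 1],
     [1, 0, 0, 1, 0],
     [1, 0, 0, 0, 0],
     [0, 0, 1, 0, 0],
     [1, 0, 1, 0, 0],
     [0, 1, 0, 0, 0]],
    index=pd.Index([1, 2, 3, 4, 6, 7, 8, 9], name="ID"),
    columns=pd.Index(list("abcdr"), name="Label"),
)

# Step 1: indexing with a boolean mask turns every 0 into NaN and keeps the 1s.
masked = crosstab_df[crosstab_df.astype(bool)]

# Step 2: stack() moves Label into the index; dropna() removes the NaN cells.
pairs = masked.stack().dropna()

# Step 3: turn the (ID, Label) index back into columns and drop the count column,
# which reset_index() names 0 because the stacked Series has no name.
stacked_df = pairs.reset_index().drop(columns=0)
print(stacked_df)
```

Note that dropping the count column discards multiplicity, so this only recovers the original frame when each (ID, Label) pair occurs at most once, as in the example.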