Combination of Pivot and Transpose in Python

Published: 2021/6/26 19:13:19 · Tags: python-2.7, pandas

Question

I am doing some text analysis and have data that looks roughly like this:

**TABLE 1**
C1  C2      C3

A1  TEXT1   ANOTHER_TEXT1
A2  TEXT1   ANOTHER_TEXT1
B1  TEXT2   ANOTHER_TEXT1
B2  TEXT2   ANOTHER_TEXT1
B3  TEXT2   ANOTHER_TEXT1
D1  TEXT3   ANOTHER_TEXT2
D2  TEXT3   ANOTHER_TEXT2
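
For anyone who wants to reproduce the problem, Table 1 can be rebuilt as a small DataFrame. This is only a sketch: the variable name `df` is an assumption, and the values are copied from the table above.

```python
import pandas as pd

# Sample data copied from Table 1. The name `df` is an assumption
# made for illustration; the question does not show how the data is loaded.
df = pd.DataFrame({
    "C1": ["A1", "A2", "B1", "B2", "B3", "D1", "D2"],
    "C2": ["TEXT1", "TEXT1", "TEXT2", "TEXT2", "TEXT2", "TEXT3", "TEXT3"],
    "C3": ["ANOTHER_TEXT1"] * 5 + ["ANOTHER_TEXT2"] * 2,
})

print(df)
```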

What I really need is a dataset aggregated over C2, with the contents of C1 spread across separate columns. This is essentially what I expected df.transpose to do, but the problem is that a plain transpose does not aggregate over C2 and C3.

Essentially, this is the structure I am looking for:

**TABLE 2**
C3              C2      CT1  CT2  CT3

ANOTHER_TEXT1   TEXT1   A1   A2   NA
ANOTHER_TEXT1   TEXT2   B1   B2   B3
ANOTHER_TEXT2   TEXT3   D1   D2   NA

I am trying df.pivot_table(index=['C2','C3'], aggfunc='count'), which gives me the count of occurrences per group. That count is correct (shown below), but it is not the layout I need.

**TABLE 3**
C3              C2      CT1
ANOTHER_TEXT1   TEXT1   2
                TEXT2   3
ANOTHER_TEXT2   TEXT3   2
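
The counting step can be reproduced as below. This is a sketch under assumptions: `df` is rebuilt from Table 1, and `values="C1"` is passed explicitly, which the original call does not do. The largest count is worth noting because it determines how many CT columns the target layout in Table 2 needs.

```python
import pandas as pd

# Sample data copied from Table 1 (the name `df` is an assumption).
df = pd.DataFrame({
    "C1": ["A1", "A2", "B1", "B2", "B3", "D1", "D2"],
    "C2": ["TEXT1", "TEXT1", "TEXT2", "TEXT2", "TEXT2", "TEXT3", "TEXT3"],
    "C3": ["ANOTHER_TEXT1"] * 5 + ["ANOTHER_TEXT2"] * 2,
})

# Count the C1 entries in each (C2, C3) group.
counts = df.pivot_table(index=["C2", "C3"], values="C1", aggfunc="count")
print(counts)

# The widest group fixes the number of CT columns needed for Table 2.
n_columns = int(counts["C1"].max())
print(n_columns)
```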

So, how do I get the data into the structure I want (Table 2)? Is it possible at all?

If not, what alternatives do I have? That is, which structure would be closest to the one I want?

Recommended answer

You can use cumcount to number the rows within each group, reshape with set_index and unstack, and finally rename the new columns with add_prefix:

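# Number the rows within each (C2, C3) group: 1, 2, 3, ...
df['g'] = df.groupby(['C2','C3']).cumcount() + 1
# Move the keys and the counter into the index, spread the counter across
# columns, then prefix the new column labels with 'CT'
df = df.set_index(['C2','C3', 'g'])['C1'].unstack().add_prefix('CT').reset_index()
print(df)
      C2             C3 CT1 CT2   CT3
0  TEXT1  ANOTHER_TEXT1  A1  A2  None
1  TEXT2  ANOTHER_TEXT1  B1  B2    B3
2  TEXT3  ANOTHER_TEXT2  D1  D2  None

(Depending on the pandas version, the missing cells may print as NaN rather than None.)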
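
To make the chain above easier to follow, here is the same idea split into separate steps on the Table 1 data. It is a sketch, not the answer's exact code: the intermediate names `indexed` and `wide` are assumptions, and `rename_axis(columns=None)` is an extra step added here to drop the leftover `g` label from the column header.

```python
import pandas as pd

# Sample data copied from Table 1 (the name `df` is an assumption).
df = pd.DataFrame({
    "C1": ["A1", "A2", "B1", "B2", "B3", "D1", "D2"],
    "C2": ["TEXT1", "TEXT1", "TEXT2", "TEXT2", "TEXT2", "TEXT3", "TEXT3"],
    "C3": ["ANOTHER_TEXT1"] * 5 + ["ANOTHER_TEXT2"] * 2,
})

# Step 1: position of each row inside its (C2, C3) group, starting at 1.
df["g"] = df.groupby(["C2", "C3"]).cumcount() + 1

# Step 2: keys and counter become the index; only C1 is kept as values.
indexed = df.set_index(["C2", "C3", "g"])["C1"]

# Step 3: the innermost index level (g) becomes the columns 1, 2, 3.
wide = indexed.unstack()

# Step 4: label the columns CT1, CT2, CT3 and restore C2/C3 as columns.
wide = wide.add_prefix("CT").rename_axis(columns=None).reset_index()

print(wide)
```

Groups with fewer rows than the widest group are padded with missing values, which corresponds to the NA cells in Table 2.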

Another solution, starting again from the original df, uses groupby and builds the new columns with the Series constructor:

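import pandas as pd

# Turn each group's C1 values into a Series indexed 0, 1, 2, ...,
# then spread that inner index across columns and label them CT1, CT2, ...
df = (df.groupby(['C2','C3'])['C1']
        .apply(lambda x: pd.Series(x.values))
        .unstack()
        .rename(columns=lambda x: 'CT{}'.format(x+1))
        .reset_index())
print(df)
      C2             C3 CT1 CT2   CT3
0  TEXT1  ANOTHER_TEXT1  A1  A2  None
1  TEXT2  ANOTHER_TEXT1  B1  B2    B3
2  TEXT3  ANOTHER_TEXT2  D1  D2  None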
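
As for which structure comes closest if fixed columns are not required: one option, not part of the answer above and offered only as a sketch, is to keep each group's C1 values as a list. The same lists can also be expanded into the wide layout. The names `lists` and `wide` are assumptions.

```python
import pandas as pd

# Sample data copied from Table 1 (the name `df` is an assumption).
df = pd.DataFrame({
    "C1": ["A1", "A2", "B1", "B2", "B3", "D1", "D2"],
    "C2": ["TEXT1", "TEXT1", "TEXT2", "TEXT2", "TEXT2", "TEXT3", "TEXT3"],
    "C3": ["ANOTHER_TEXT1"] * 5 + ["ANOTHER_TEXT2"] * 2,
})

# One row per (C2, C3) group, with the C1 values collected into a list.
lists = df.groupby(["C2", "C3"])["C1"].agg(list)
print(lists)

# Optional: expand the lists into columns; shorter lists are padded
# with missing values.
wide = pd.DataFrame(lists.tolist(), index=lists.index)
wide.columns = ["CT{}".format(i + 1) for i in wide.columns]
wide = wide.reset_index()
print(wide)
```

The list form avoids padding entirely, which may be more convenient when group sizes vary a lot.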
