Concatenate several rows into one row by column value, and split the resulting dataframe into several dataframes based on the number of concatenated rows
Problem description
I have a pandas dataframe in which several rows share the same value in a particular column. I want to concatenate each such set of rows into a single row. Since the number of rows sharing a value varies, I also want to split the result into separate dataframes, one for each distinct number of shared rows.
Here is an example of what I want.
import pandas as pd
data = [['tom', 2], ['ni2ck', 2], ['j3uli', 4] , ['nic4k', 4], ['jul5i', 4] , ['nic6k', 7], ['ju7li', 7] , ['nic8k', 7], ['ju9li', 7] , ['nic1k', 8], ['car', 8]]
df = pd.DataFrame(data, columns = ['Name', 'Age'])
df
The code above generates what the original dataframe looks like.
The result is:
Name Age
0 tom 2
1 ni2ck 2
2 j3uli 4
3 nic4k 4
4 jul5i 4
5 nic6k 7
6 ju7li 7
7 nic8k 7
8 ju9li 7
9 nic1k 8
10 car 8
I want to put all rows that share the same Age value into one row, and then split the dataframe according to the number of Name columns each combined row ends up with. The result would look like this:
First result dataframe. It has two rows because two Age values (2 and 8) each have exactly two names.
Name Name Age
0 tom ni2ck 2
1 nic1k car 8
Second result dataframe
Name Name Name Age
0 j3uli nic4k jul5i 4
Third result dataframe
Name Name Name Name Age
0 nic6k ju7li nic8k ju9li 7
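Recommended answer
Here is one way: number the rows within each Age group, pivot to one row per Age, then group the pivoted rows by how many missing values they contain.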
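# Number each row within its Age group: 0, 1, 2, ...
df['New'] = df.groupby('Age').cumcount()
# One row per Age, one column per position; shorter groups are padded with NaN
s = df.pivot(index='Age', columns='New', values='Name')
# Group rows by their NaN count, then drop the padding columns (axis passed by keyword for pandas 2.x)
l = [y.dropna(axis=1) for _, y in s.groupby(s.isnull().sum(axis=1))]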
l[0]
New 0 1 2 3
Age
7 nic6k ju7li nic8k ju9li
l[1]
New 0 1 2
Age
4 j3uli nic4k jul5i
l[2]
New 0 1
Age
2 tom ni2ck
8 nic1k car
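The list above is ordered by NaN count, so `l[0]` happens to hold the largest groups. A minimal sketch of a variant that keys each piece by its group size instead may be easier to use. The function name `split_by_group_size`, its parameters, and the helper column `_pos` are hypothetical names chosen for illustration, and the sketch assumes the value column contains no missing values to begin with.

```python
import pandas as pd

def split_by_group_size(df, key="Age", value="Name"):
    """Return {group size: wide dataframe} (hypothetical helper, not a pandas API)."""
    # Number each row within its group: 0, 1, 2, ...
    pos = df.groupby(key).cumcount()
    # One row per key, one column per position; shorter groups are padded with NaN
    wide = df.assign(_pos=pos).pivot(index=key, columns="_pos", values=value)
    # Non-missing cells per row equal the original group size
    # (assumes the value column had no NaN in the input)
    sizes = wide.notna().sum(axis=1)
    # Split the wide rows by size and drop the all-NaN padding columns
    return {int(n): part.dropna(axis=1, how="all") for n, part in wide.groupby(sizes)}

data = [['tom', 2], ['ni2ck', 2], ['j3uli', 4], ['nic4k', 4], ['jul5i', 4],
        ['nic6k', 7], ['ju7li', 7], ['nic8k', 7], ['ju9li', 7],
        ['nic1k', 8], ['car', 8]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

frames = split_by_group_size(df)
print(frames[2])  # the Age values that have exactly two names
```

Unlike the answer's snippet, this version does not add a column to the original `df`. Calling `reset_index()` on any of the pieces should move Age from the index back into an ordinary column, which is closer to the layout shown in the question.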