Python Pandas Melt Groups of Initial Columns Into Multiple Target Columns
Problem description
I need to melt groups of initial columns into multiple target columns in a dataset that is not well normalized. Here is an example (from the question "pandas dataframe reshaping/stacking of multiple value variables into seperate columns"):
      des1 des2 des3 interval1 interval2 interval3
value
aaa      a    b    c       ##1       ##2       ##3
bbb      d    e    f       ##4       ##5       ##6
ccc      g    h    i       ##7       ##8       ##9
I am trying to melt this into something like the following orientation:
      des interval
value
aaa     a      ##1
aaa     b      ##2
aaa     c      ##3
bbb     d      ##4
bbb     e      ##5
bbb     f      ##6
ccc     g      ##7
ccc     h      ##8
ccc     i      ##9
I was hoping to use melt rather than stack, to avoid manually subsetting a lot of data. Here is what I have so far:
import pandas as pd
import numpy as np
import fnmatch

column_list = list(df_initial.columns.values)
question_sources = fnmatch.filter(column_list, "measure*question*source")
question_ranks = fnmatch.filter(column_list, "measure*rank")
question_targets = fnmatch.filter(column_list, "measure*targeted")
question_statuses = fnmatch.filter(column_list, "measure*status")
place = fnmatch.filter(column_list, "place")
measure_statuses = fnmatch.filter(column_list, "measureInfo_status")
starter_list = place + measure_statuses

df_gpro_melt_1 = pd.melt(df_initial, id_vars=starter_list,
                         value_vars=question_sources,
                         var_name="question_sources",
                         value_name="question_sources_values")
Is it possible to melt groups of initial columns into multiple target columns? Any advice is much appreciated.
Recommended answer
This should work for your example, provided your columns follow the pattern in your example data frame (the three des columns first, then the three interval columns, with value as the index):
pd.concat(pd.DataFrame({'des': df.iloc[:, i],
                        'interval': df.iloc[:, i + 3]})
          for i in range(3))
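To make the snippet above easier to try, here is a minimal self-contained sketch. It rebuilds the sample frame from the question, assuming that `value` is the index and that the cells are plain strings (neither is confirmed by the question). The `pieces` and `result` names and the final `sort_index` call are additions for illustration: `concat` on its own returns all of the `des1` rows first, then `des2`, then `des3`, so a stable sort on the index is one way to get the grouped orientation the question asks for.

```python
import pandas as pd

# Sample frame from the question; 'value' is assumed to be the index.
df = pd.DataFrame(
    {
        "des1": ["a", "d", "g"],
        "des2": ["b", "e", "h"],
        "des3": ["c", "f", "i"],
        "interval1": ["##1", "##4", "##7"],
        "interval2": ["##2", "##5", "##8"],
        "interval3": ["##3", "##6", "##9"],
    },
    index=pd.Index(["aaa", "bbb", "ccc"], name="value"),
)

# Pair column i (des) with column i + 3 (interval), one small frame per pair.
pieces = [
    pd.DataFrame({"des": df.iloc[:, i], "interval": df.iloc[:, i + 3]})
    for i in range(3)
]

# Stack the pairs, then group rows by index label. Mergesort is stable, so
# within each label the des1/des2/des3 order is preserved.
result = pd.concat(pieces).sort_index(kind="mergesort")
print(result)
```

If row order does not matter, the `sort_index` step can be dropped.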
If the column pairs are positioned differently, you can use the same pattern but iterate over a list of (des position, interval position) tuples:
tuples = [(0, 3), (1, 4), (2, 5)]
pd.concat(pd.DataFrame({'des': df.iloc[:, i],
                        'interval': df.iloc[:, j]})
          for i, j in tuples)
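As an alternative to pairing columns by position, pandas also provides `pd.wide_to_long`, which pairs columns by name when they share a stub plus a numeric suffix (`des1`, `interval1`, ...). This is not part of the original answer. The sketch below assumes the sample frame from the question with `value` as a unique index, and the `n` level name and `long_df` variable are made up for illustration.

```python
import pandas as pd

# Sample frame from the question; 'value' is assumed to be a unique index.
df = pd.DataFrame(
    {
        "des1": ["a", "d", "g"],
        "des2": ["b", "e", "h"],
        "des3": ["c", "f", "i"],
        "interval1": ["##1", "##4", "##7"],
        "interval2": ["##2", "##5", "##8"],
        "interval3": ["##3", "##6", "##9"],
    },
    index=pd.Index(["aaa", "bbb", "ccc"], name="value"),
)

# wide_to_long needs the id column as a regular column, hence reset_index().
# 'n' is a hypothetical name for the numeric suffix (1, 2, 3).
long_df = (
    pd.wide_to_long(df.reset_index(), stubnames=["des", "interval"],
                    i="value", j="n")
    .sort_index()      # order rows by (value, n)
    .droplevel("n")    # keep only 'value' as the index
)
print(long_df[["des", "interval"]])
```

This only applies when the column names follow a consistent stub-and-suffix pattern and the `i` column uniquely identifies each row. For irregular names such as those in the question's real data, the explicit list of pairs is likely the more practical route.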