Pandas wide_to_long behaves very differently from the previous version in Python 2.7


Question

I upgraded my laptop and installed Python 2.7 with Pandas 0.23. None of my previously working scripts run anymore, because wide_to_long now gives different results.

The data frame df1 looks like this:

Index    ID     Text_column     abc_A   abc_B   abc_C   abc_D
0       123     some text       True    False   False   True
1       124     another text    False   True    False   False
2       125     different topic False   True    True    False
3       126     set of words    False   False   False   False

The following code:

    df2=pd.wide_to_long(df1,['abc_'], i='ID', j='Concept').reset_index()

used to give me the following data frame df2:

Index    ID     Concept Text_column     abc_
0        123     A      some text       True
1        123     B      some text       False
2        123     C      some text       False
3        123     D      some text       True
4        124     A      another text    False
5        124     B      another text    True
6        124     C      another text    False
7        124     D      another text    False
8        125     A      different topic False
9        125     B      different topic True
10       125     C      different topic True
11       125     D      different topic False
12       126     A      set of words    False
13       126     B      set of words    False
14       126     C      set of words    False
15       126     D      set of words    False

With version 0.23 I get a completely empty data frame, with only the column headers:

Index ID  Concept Text_column abc_A abc_B abc_C abc_D

I tried melt, but I don't want to list every column (abc_A, abc_B, etc.) in value_vars, because the variable names differ a lot between projects and I have many scripts like this one.

What would be the best fix for this issue?

Thank you very much!

Recommended answer

Add suffix='\w+'. The default is suffix='\d+', which matches only numeric suffixes, so the letter suffixes A, B, C and D are not picked up and the result comes back empty.

pd.wide_to_long(df, ['abc_'], i='ID', j='Concept', suffix=r'\w+').reset_index()
Out[243]: 
     ID Concept  Index     Text_column   abc_
0   123       A      0        sometext   True
1   124       A      1     anothertext  False
2   125       A      2  differenttopic  False
3   126       A      3      setofwords  False
4   123       B      0        sometext  False
5   124       B      1     anothertext   True
6   125       B      2  differenttopic   True
7   126       B      3      setofwords  False
8   123       C      0        sometext  False
9   124       C      1     anothertext  False
10  125       C      2  differenttopic   True
11  126       C      3      setofwords  False
12  123       D      0        sometext   True
13  124       D      1     anothertext  False
14  125       D      2  differenttopic  False
15  126       D      3      setofwords  False
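
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes df1 can be rebuilt by hand from the table in the question, with the unnamed Index column being the default index. The names `stub`, `value_cols` and `melted` are illustrative and do not come from the original post. The second half shows one possible way to use melt without typing out every column name, by selecting the columns on their shared prefix.

```python
import pandas as pd

# Sample data rebuilt from the question's table (assumption: "Index" is the default index).
df1 = pd.DataFrame({
    "ID": [123, 124, 125, 126],
    "Text_column": ["some text", "another text", "different topic", "set of words"],
    "abc_A": [True, False, False, False],
    "abc_B": [False, True, True, False],
    "abc_C": [False, False, True, False],
    "abc_D": [True, False, False, False],
})

stub = "abc_"  # illustrative name for the shared column prefix

# suffix=r"\w+" lets the letter suffixes A-D match; the default r"\d+" matches digits only.
df2 = (
    pd.wide_to_long(df1, [stub], i="ID", j="Concept", suffix=r"\w+")
    .reset_index()
    .sort_values(["ID", "Concept"])
    .reset_index(drop=True)
)
print(df2)

# Alternative with melt: collect the value columns by prefix instead of listing them.
value_cols = [c for c in df1.columns if c.startswith(stub)]
melted = df1.melt(
    id_vars=["ID", "Text_column"],
    value_vars=value_cols,
    var_name="Concept",
    value_name=stub,
)
# Strip the prefix so that Concept holds A, B, C, D rather than abc_A, abc_B, ...
melted["Concept"] = melted["Concept"].str[len(stub):]
melted = melted.sort_values(["ID", "Concept"]).reset_index(drop=True)
print(melted)
```

Both results are sorted by ID and Concept here so that they line up with the layout the question expected. The raw-string prefix on the pattern is optional in Python 2.7, but it avoids invalid escape sequence warnings on newer Python versions.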
