Split (explode) pandas dataframe string entry to separate rows

Views: 90 Published: 2020/5/18 18:30:43 Tags: python pandas numpy dataframe


Problem description

I have a pandas dataframe in which one column of text strings contains comma-separated values. I want to split each CSV field and create a new row per entry (assume the CSV is clean and only needs to be split on ','). For example, a should become b:

In [7]: a
Out[7]: 
    var1  var2
0  a,b,c     1
1  d,e,f     2

In [8]: b
Out[8]: 
  var1  var2
0    a     1
1    b     1
2    c     1
3    d     2
4    e     2
5    f     2

So far I have tried various simple functions, but the .apply method seems to accept only one row as the return value when it is used along an axis, and I can't get .transform to work. Any suggestions would be much appreciated!

Example data:

from pandas import DataFrame
import numpy as np
a = DataFrame([{'var1': 'a,b,c', 'var2': 1},
               {'var1': 'd,e,f', 'var2': 2}])
b = DataFrame([{'var1': 'a', 'var2': 1},
               {'var1': 'b', 'var2': 1},
               {'var1': 'c', 'var2': 1},
               {'var1': 'd', 'var2': 2},
               {'var1': 'e', 'var2': 2},
               {'var1': 'f', 'var2': 2}])

I know this won't work because we lose the DataFrame metadata by going through numpy, but it should give you a sense of what I tried to do:

def fun(row):
    letters = row['var1']
    letters = letters.split(',')
    out = np.array([row] * len(letters))
    out['var1'] = letters
a['idx'] = range(a.shape[0])
z = a.groupby('idx')
z.transform(fun)
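
For reference, here is a minimal sketch of one way to get from a to b. It is not taken from the question itself and assumes pandas 0.25 or newer, where DataFrame.explode is available. The idea is to turn each string in var1 into a list with Series.str.split and then let explode emit one row per list element, repeating var2 alongside it.

```python
import pandas as pd

a = pd.DataFrame([{'var1': 'a,b,c', 'var2': 1},
                  {'var1': 'd,e,f', 'var2': 2}])

# Step 1: replace each comma-separated string with a list of its parts.
# Step 2: explode the list column so every element gets its own row;
#         the other columns (here var2) are repeated for each element.
# Step 3: explode keeps the original index (0, 0, 0, 1, 1, 1), so reset it
#         to get the 0..5 index shown in the desired output.
b = (a.assign(var1=a['var1'].str.split(','))
      .explode('var1')
      .reset_index(drop=True))

print(b)
```

Because this stays inside pandas the whole time, the column names and the var2 values are carried along, which is the part that is lost when each row is wrapped in a numpy array.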
