How to remove empty values from a list-type column in a pandas DataFrame

Problem description

I'm looking for a way to remove empty values from a column whose values are lists. More precisely, the column holds string representations of lists, and some of the strings inside those lists have already been replaced with empty strings.

In df.color we are just replacing every entry that matches \w+_Blue with an empty string:

Example DataFrame:

import pandas as pd

df = pd.DataFrame({
    'Bird': ["parrot", "Eagle", "Seagull"],
    'color': [
        "['Light_Blue','Green','Dark_Blue']",
        "['Sky_Blue','Black','White', 'Yellow','Gray']",
        "['White','Jet_Blue','Pink', 'Tan','Brown', 'Purple']",
    ],
})

>>> df
      Bird                                              color
0   parrot                 ['Light_Blue','Green','Dark_Blue']
1    Eagle      ['Sky_Blue','Black','White', 'Yellow','Gray']
2  Seagull  ['White','Jet_Blue','Pink', 'Tan','Brown', 'Pu...

Result of the replacement on the DataFrame above:

>>> df['color'].str.replace(r'\w+_Blue\b', '', regex=True)
0                                 ['','Green','']
1           ['','Black','White', 'Yellow','Gray']
2    ['White','','Pink', 'Tan','Brown', 'Purple']
Name: color, dtype: object

In plain Python this is easily done as follows:

>>> lst = ['','Green','']
>>> [x for x in lst if x]
['Green']

I wonder whether something like the following can be done:

df.color.mask(df == ' ')

Solution

You don't have a column of lists; you have a column that contains string representations of lists. You can do this all in a single step using ast.literal_eval and str.endswith. I would use a list comprehension here, which should be faster than apply:


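import ast

# Parse each string into a real list, then drop every element ending in "Blue".
fixed = [
    [el for el in lst if not el.endswith("Blue")]
    for lst in df['color'].apply(ast.literal_eval)
]

# assign returns a new DataFrame; df itself is left unchanged.
df.assign(color=fixed)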

      Bird                              color
0   parrot                            [Green]
1    Eagle       [Black, White, Yellow, Gray]
2  Seagull  [White, Pink, Tan, Brown, Purple]
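
If the str.replace step from the question has already been applied and the column now holds empty strings, the same pattern can drop them by truthiness instead of by suffix. The following is only a minimal sketch under that assumption. The names replaced, cleaned and result are illustrative and do not appear in the original post, and regex=True is passed explicitly because recent pandas versions treat the pattern as a literal string by default.

```python
import ast

import pandas as pd

df = pd.DataFrame({
    'Bird': ["parrot", "Eagle", "Seagull"],
    'color': [
        "['Light_Blue','Green','Dark_Blue']",
        "['Sky_Blue','Black','White', 'Yellow','Gray']",
        "['White','Jet_Blue','Pink', 'Tan','Brown', 'Purple']",
    ],
})

# Step from the question: blank out every "<something>_Blue" entry.
# The values are still strings at this point, not lists.
replaced = df['color'].str.replace(r'\w+_Blue\b', '', regex=True)

# Parse each string into a real list, then keep only the non-empty items.
cleaned = [
    [el for el in ast.literal_eval(s) if el]
    for s in replaced
]

# assign returns a new DataFrame; df keeps its original string column.
result = df.assign(color=cleaned)
print(result)
```

Unlike the endswith("Blue") filter, this version removes only the entries that the regular expression actually blanked, so a plain 'Blue' entry (with no prefix) would be kept.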
