Getting an error while using apply on a list in Python


Question

I have a data frame in which the txt column contains a list. I want to clean the txt column using the function clean_text().

import re
import pandas

data = {'value': ['abc.txt', 'cda.txt'],
        'txt': [['2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart'],
                ['2019/02/01-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart']]}
df = pandas.DataFrame(data=data)

df
 value    txt
 abc.txt  ['2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart']
 cda.txt  ['2019/02/01-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart']
def clean_text(text):
    """
    :param text:  it is the plain text
    :return: cleaned text
    """
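    patterns = [r"^.{53}",  # the first 53 characters (timestamp and ID columns)
                r"[A-Za-z]+[\d]+[\w]*|[\d]+[A-Za-z]+[\w]*",  # tokens that mix letters and digits
                r"[-=/':,?${}\[\]-_()>.~" ";+]"]  # punctuation; the two adjacent literals are concatenated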

    for p in patterns:
        text = re.sub(p, '', text)

    return text

My solution:

df['txt'] = df['txt'].apply(lambda x: clean_text(x))

But I am getting the errors below:

df['txt'] = df['txt'].apply(lambda x: clean_text(x))
AttributeError: 'list' object has no attribute 'apply'



clean_text(df['txt'][1])
TypeError: expected string or bytes-like object

I am not sure how to use numpy.where in this problem.

Recommended answer

Based on the revision to your question and the discussion in the comments, I believe you need the following line:

df['txt'] = df['txt'].apply(lambda x: [clean_text(z) for z in x])

In this approach, apply is used with a lambda to loop over each element of the txt series, while a simple for-loop (expressed as a Python list comprehension) iterates over each item in the txt sub-list.
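To make the mechanics concrete, here is a minimal sketch that assumes the clean_text function and sample data from the question. The names clean_list, direct_call_failed, via_loop and via_lambda are hypothetical and introduced only for illustration. It shows that calling clean_text directly on a cell fails because the cell is a list, and that the one-liner is equivalent to an explicit loop over each sub-list.

```python
import re
import pandas as pd


def clean_text(text):
    # Same patterns as in the question.
    patterns = [r"^.{53}",
                r"[A-Za-z]+[\d]+[\w]*|[\d]+[A-Za-z]+[\w]*",
                r"[-=/':,?${}\[\]-_()>.~" ";+]"]
    for p in patterns:
        text = re.sub(p, '', text)
    return text


df = pd.DataFrame({
    'value': ['abc.txt', 'cda.txt'],
    'txt': [
        ['2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart'],
        ['2019/02/01-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart'],
    ],
})

# Each cell of df['txt'] is a list, and re.sub only accepts strings or bytes,
# so passing a cell straight to clean_text raises TypeError.
try:
    clean_text(df['txt'][1])
    direct_call_failed = False
except TypeError:
    direct_call_failed = True


def clean_list(items):
    # Hypothetical helper: the explicit-loop form of the list comprehension.
    cleaned = []
    for item in items:
        cleaned.append(clean_text(item))
    return cleaned


via_loop = df['txt'].apply(clean_list)
via_lambda = df['txt'].apply(lambda x: [clean_text(z) for z in x])
```

Both via_loop and via_lambda hold one cleaned list per row; the lambda form is simply the more compact spelling.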

I have tested that snippet with the following value for data:

data = {
    'value': [
        'abc.txt',
        'cda.txt',
    ],
    'txt':[
        [
            '2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart',
        ],
        [
            '2019/02/01-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart',
        ],
    ]
}

Here is a snippet of console output showing the dataframe before and after the transformation:

>>> df

     value                                                txt
0  abc.txt  [2019/01/31-11:56:23.288258 1886     7F0ED4CDC...
1  cda.txt  [2019/02/01-11:56:23.288258 1886     7F0ED4CDC...

>>> df['txt'] = df['txt'].apply(lambda x: [clean_text(z) for z in x])

>>> df

     value                         txt
0  abc.txt  [asfasnfs remove datepart]
1  cda.txt  [asfasnfs remove datepart]
>>> 
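If some cells might hold a plain string instead of a list (an assumption; the data in the question only shows lists), a small wrapper can dispatch on the cell type before cleaning. The helper name clean_cell below is hypothetical, and the sketch reuses clean_text from the question unchanged.

```python
import re


def clean_text(text):
    # Same patterns as in the question.
    patterns = [r"^.{53}",
                r"[A-Za-z]+[\d]+[\w]*|[\d]+[A-Za-z]+[\w]*",
                r"[-=/':,?${}\[\]-_()>.~" ";+]"]
    for p in patterns:
        text = re.sub(p, '', text)
    return text


def clean_cell(cell):
    """Hypothetical helper: clean a cell that is a string or a list of strings."""
    if isinstance(cell, str):
        return clean_text(cell)
    return [clean_text(item) for item in cell]


line = '2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart'
cleaned_str = clean_cell(line)           # a single string is cleaned directly
cleaned_list = clean_cell([line, line])  # a list is cleaned item by item
```

With a dataframe like the one above, this could be applied as df['txt'] = df['txt'].apply(clean_cell).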
