Applying a function to a pandas DataFrame


Question

I'm trying to perform some text analysis on a pandas DataFrame, but I'm having some trouble with the flow. Or maybe I'm just not getting it. PS: I'm a Python beginner-ish.

Example dataframe:

import pandas as pd

df = pd.DataFrame({'Document' : ['a','1','a', '6','7','N'], 'Type' : ['7', 'E', 'Y', '6', 'C', '9']})


     Document   Type
0    a          7
1    1          E
2    a          Y
3    6          6
4    7          C
5    N          9

I'm trying to build a flow that does one thing if 'Document' or 'Type' is a number, and something else if it is not.

Here is a simple function meant to return whether 'Document' is a number (edited to show the if/then flow I am trying to run on the field):

def fn(dfname):
    if dfname['Document'].apply(str.isdigit):
        dfname['Check'] = 'Y'
    else:
        dfname['Check'] = 'N'

Now, I apply it to the dataframe:

df.apply(fn(df), axis=0)

I get this error:

TypeError: ("'NoneType' object is not callable", u'occurred at index Document')

From the error message, it looks like I am not handling the index correctly. Can anyone see where I am going wrong?

Lastly, this may or may not be related to the issue, but I am really struggling with how indexes work in pandas. I think I have run into more problems with the index than with anything else.

Recommended answer

You're close.

The thing to realize about apply on a column is that you need to write a function that operates on a single scalar value and returns the result you want for that value. With that in mind:

import pandas as pd

df = pd.DataFrame({'Document' : ['a','1','a', '6','7','N'], 'Type' : ['7', 'E', 'Y', '6', 'C', '9']})

def fn(val):
    if str(val).isdigit():
        return 'Y'
    else:
        return 'N'

df['check'] = df['Document'].apply(fn)

This gives me:

  Document Type check
0        a    7     N
1        1    E     Y
2        a    Y     N
3        6    6     Y
4        7    C     Y
5        N    9     N
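
As a side note on the original call: `df.apply(fn(df), axis=0)` evaluates `fn(df)` first and hands its return value to `apply`, rather than handing over the function itself. The sketch below illustrates that difference. It assumes a hypothetical helper named `is_digit_flag` (not from the question) standing in for `fn`, and it is only meant to show the mechanics, not to reproduce the exact traceback.

```python
import pandas as pd

df = pd.DataFrame({'Document': ['a', '1', 'a', '6', '7', 'N'],
                   'Type': ['7', 'E', 'Y', '6', 'C', '9']})

def is_digit_flag(val):
    # Hypothetical helper: expects one cell value at a time.
    return 'Y' if str(val).isdigit() else 'N'

# Passing the function object: pandas calls it once per element.
passed = df['Document'].apply(is_digit_flag)

# Calling the function yourself: the whole column is treated as one value,
# and the result is a plain string, so apply() would have nothing to call.
called = is_digit_flag(df['Document'])

# An `if` on a whole Series is ambiguous in pandas, which is why if/else
# logic belongs inside a per-element function.
try:
    bool(df['Document'] == 'a')
    ambiguous = False
except ValueError:
    ambiguous = True

print(passed.tolist(), called, ambiguous)
```

Note that the working line passes `fn` without parentheses: `df['Document'].apply(fn)`.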

To clarify: when you use apply on a Series, write a function that accepts a scalar value. When you use apply on a DataFrame, the function should instead accept a full column (axis=0, the default) or a full row (axis=1).
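
The axis distinction can be sketched as follows. The function names `all_digits` and `row_flag` are hypothetical and chosen for illustration. The row-wise version is one possible way to check both 'Document' and 'Type', which the question mentions, assuming every cell is a string.

```python
import pandas as pd

df = pd.DataFrame({'Document': ['a', '1', 'a', '6', '7', 'N'],
                   'Type': ['7', 'E', 'Y', '6', 'C', '9']})

def all_digits(col):
    # axis=0 (default): `col` is an entire column, passed as a Series.
    return col.str.isdigit().all()

def row_flag(row):
    # axis=1: `row` is a single row, passed as a Series indexed by column name.
    return 'Y' if row['Document'].isdigit() or row['Type'].isdigit() else 'N'

# One result per column.
per_column = df.apply(all_digits)

# One result per row, stored as a new column.
df['check'] = df.apply(row_flag, axis=1)

print(per_column)
print(df)
```

Row-wise apply calls the Python function once per row, so for large frames a vectorized expression such as `df['Document'].str.isdigit()` is usually the faster choice.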
