Dataframe apply doesn't accept axis argument


Problem description

I have two dataframes, data and rules:

>>>data                            >>>rules
   vendor                             rule
0  googel                           0 google
1  google                           1 dell
2  googly                           2 macbook

I am trying to add two new columns into the data dataframe after computing the Levenshtein similarity between each vendor and rule. So my dataframe should ideally contain columns looking like this:

>>>data
  vendor   rule    similarity
0 googel   google    0.8

So far I am trying to perform an apply function that will return me this structure, but the dataframe apply is not accepting the axis argument.

>>> for index,r in rules.iterrows():
...     data[['rule','similarity']]=data['vendor'].apply(lambda row:[r[0],ratio(row[0],r[0])],axis=1)
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/home/mnnr/test/env/test-1.0/runtime/lib/python3.4/site-packages/pandas/core/series.py", line 2220, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas/src/inference.pyx", line 1088, in pandas.lib.map_infer (pandas/lib.c:62658)
  File "/home/mnnr/test/env/test-1.0/runtime/lib/python3.4/site-packages/pandas/core/series.py", line 2209, in <lambda>
    f = lambda x: func(x, *args, **kwds)
TypeError: <lambda>() got an unexpected keyword argument 'axis'

Could someone please help me figure out what I am doing wrong? Any change I make just creates new errors. Thank you.

Solution

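You're calling the Series version of apply, for which an axis arg doesn't make sense because a Series has only one axis. Series.apply forwards any extra keyword arguments to the function being applied (see f = lambda x: func(x, *args, **kwds) in the traceback), so axis=1 is handed to your lambda, hence the error.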
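A minimal sketch of the difference, assuming a recent pandas release. The `vendors` Series is made-up sample data that mirrors the question's vendor column, and `str.upper` stands in for the real similarity function:

```python
import pandas as pd

# Hypothetical sample data, mirroring the question's `vendor` column.
vendors = pd.Series(["googel", "google", "googly"], name="vendor")

# Series.apply passes unknown keyword arguments through to the function,
# so `axis=1` reaches the lambda, which does not accept it.
try:
    vendors.apply(lambda v: v.upper(), axis=1)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# DataFrame.apply does understand `axis`; with axis=1 the function
# receives each row as a Series, so the value is read by column label.
frame = vendors.to_frame()
upper = frame.apply(lambda row: row["vendor"].upper(), axis=1)
```

With the Series call the lambda gets a plain string, while with the single-column DataFrame call it gets a one-element row, which is why the two variants index their argument differently.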

If you did:

data[['rule','similarity']]=data[['vendor']].apply(lambda row:[r[0],ratio(row[0],r[0])],axis=1)

then data[['vendor']] is a single-column DataFrame, and the DataFrame version of apply does accept axis, so this would work.

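Or just remove the axis arg. The lambda then receives each vendor string directly, so pass it as is rather than indexing it with [0] (which would take only its first character):

data[['rule','similarity']]=data['vendor'].apply(lambda vendor:[r[0],ratio(vendor,r[0])])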

Update

Looking at what you're doing, you need to calculate the Levenshtein ratio for each rule against every vendor.

You can do this by:

data['vendor'].apply(lambda row: rules['rule'].apply(lambda x: ratio(x, row)))

I think this should calculate the ratio for each vendor against every rule.
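To get from those pairwise ratios to the vendor / rule / similarity layout asked for in the question, one option is to build the vendor-by-rule score table and keep the best rule per vendor. The sketch below is only an illustration under stated assumptions: it builds the table with a plain comprehension instead of the nested apply, and `levenshtein` and `similarity` are hypothetical pure-Python stand-ins for the `ratio` function used above, whose exact normalisation may differ, so the scores need not match it.

```python
import pandas as pd


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insert, delete and substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalisation: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


# Sample frames taken from the question.
data = pd.DataFrame({"vendor": ["googel", "google", "googly"]})
rules = pd.DataFrame({"rule": ["google", "dell", "macbook"]})

# One row per vendor, one column per rule, each cell a similarity score.
scores = pd.DataFrame(
    [[similarity(vendor, rule) for rule in rules["rule"]] for vendor in data["vendor"]],
    index=data.index,
    columns=rules["rule"].tolist(),
)

# Keep the best-scoring rule and its score for every vendor.
data["rule"] = scores.idxmax(axis=1)
data["similarity"] = scores.max(axis=1)
```

Because the table is assembled in one step, no loop over rules.iterrows() is needed, and each vendor ends up with exactly one rule and one score.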
