AttributeError: 'PandasExprVisitor' object has no attribute 'visit_Ellipsis', using pandas eval

Question

I have a Series of the form:

s

0    [133, 115, 3, 1]
1    [114, 115, 2, 3]
2      [51, 59, 1, 1]
dtype: object

Note that its elements are strings:

s[0]
'[133, 115, 3, 1]'

I'm trying to use pd.eval to parse these strings into a column of lists. This works for the sample data above.

pd.eval(s)

array([[133, 115, 3, 1],
       [114, 115, 2, 3],
       [51, 59, 1, 1]], dtype=object)

However, on much larger data (order of 10K rows and more), this fails miserably!

len(s)
300000

pd.eval(s)
AttributeError: 'PandasExprVisitor' object has no attribute 'visit_Ellipsis'

What am I missing here? Is there something wrong with the function or with my data?

Recommended answer

Your data is fine, and pandas.eval is buggy, but not in the way you think. A hint on the relevant GitHub issue page urged me to take a closer look at the documentation:

pandas.eval(expr, parser='pandas', engine=None, truediv=True, local_dict=None,
            global_dict=None, resolvers=(), level=0, target=None, inplace=False)

    Evaluate a Python expression as a string using various backends.

    Parameters:
        expr: str or unicode
            The expression to evaluate. This string cannot contain any Python
            statements, only Python expressions.
        [...]

As you can see, the documented behaviour is to pass strings to pd.eval, in line with the general (and expected) behaviour of the eval/exec class of functions. You pass a string and end up with an arbitrary object.

As I see it, pandas.eval is buggy because it doesn't reject the Series input for expr up front, which leads it to guess in the face of ambiguity. The best proof of this is that the default shortening of the Series' __repr__, which is designed for pretty printing, can drastically affect your result.
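
To make the shortening concrete, here is a minimal sketch. It only shows that the printed form of a long Series is abbreviated with "..." and that Python parses "..." as an Ellipsis node; it does not trace what pandas.eval does internally, which may differ between pandas versions. The sample data and variable names are made up for illustration.

```python
import ast
import pandas as pd

# Hypothetical sample data mirroring the question: strings that look like lists.
short = pd.Series(["[133, 115, 3, 1]"] * 3)
long = pd.Series(["[133, 115, 3, 1]"] * 1000)

# Pin the display limit so the result does not depend on local settings.
with pd.option_context("display.max_rows", 60):
    short_text = repr(short)
    long_text = repr(long)

# The short Series prints every row; the long one is abbreviated with "...".
print("..." in short_text)
print("..." in long_text)

# In Python source, "..." is the Ellipsis literal, so it parses to an Ellipsis node.
ellipsis_dump = ast.dump(ast.parse("...", mode="eval"))
print(ellipsis_dump)
```

Any parser that is handed such an abbreviated text sees an Ellipsis where rows used to be, which would fit the visit_Ellipsis error in the question.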

The solution is then to step back from the XY problem, use the right tool to convert your data, and preferably stop using pandas.eval for this purpose entirely. Even in the working cases where the Series is small, you can't really be sure that future pandas versions won't break this "feature" completely.
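
The text above does not name a specific replacement tool. One option, offered here as an assumption rather than as the answer's own recommendation, is to parse each string on its own with ast.literal_eval from the standard library. The helper name parse_list_strings is hypothetical.

```python
import ast
import pandas as pd


def parse_list_strings(series):
    """Parse each string element into a Python object, one element at a time.

    Hypothetical helper: ast.literal_eval accepts only Python literals
    (lists, numbers, strings, ...), so it never executes arbitrary code.
    """
    parsed = [ast.literal_eval(text) for text in series]
    return pd.Series(parsed, index=series.index)


# Sample data taken from the question.
s = pd.Series(["[133, 115, 3, 1]", "[114, 115, 2, 3]", "[51, 59, 1, 1]"])
result = parse_list_strings(s)
print(result[0])

# Because every element is parsed separately, the length of the Series
# plays no role in whether parsing succeeds.
big = pd.Series(["[1, 2]"] * 1000)
big_result = parse_list_strings(big)
print(len(big_result))
```

If the strings are also valid JSON, as they are in this sample, json.loads could be used in the same way.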
