Why is pandas map slower than a list comprehension?


Question

Does anyone know why pandas/numpy map is slower than a list comprehension? I thought I could optimize my code by replacing the list comprehensions with map, since map doesn't need the list append operation.

Here is a test:

import pandas as pd

df = pd.DataFrame(range(100000))

List comprehension:

%timeit -n 10 df["A"] = [x for x in df[0]]

#10 loops, best of 3: 550 ms per loop

pandas map:

%timeit -n 10 df["A"] = df[0].map(lambda x: x)

#10 loops, best of 3: 797 ms per loop

Update based on the comment below: with the list comprehension and map both calling the same function f, the list comprehension is still faster.

def f(x):
    return x

%timeit -n 100 df["A"] = df[0].map(f)

#100 loops, best of 3: 475 ms per loop

%timeit -n 100 df["A"] = [f(x) for x in df[0]]

#100 loops, best of 3: 399 ms per loop
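For readers who want to repeat the comparison outside IPython, here is a minimal sketch using the standard-library timeit module. The helper names with_list_comprehension and with_map are hypothetical and not part of the original question; the repeat counts are arbitrary, and absolute timings will differ by machine and pandas version.

```python
import timeit

import pandas as pd

df = pd.DataFrame(range(100000))


def f(x):
    return x


def with_list_comprehension():
    # Iterate over the column in Python and call f on each element.
    return [f(x) for x in df[0]]


def with_map():
    # Let Series.map call f on each element.
    return df[0].map(f)


# Both approaches produce the same values, so the timings are comparable.
assert list(with_map()) == with_list_comprehension()

# Report the best of several repeats, similar to what %timeit prints.
for fn in (with_list_comprehension, with_map):
    best = min(timeit.repeat(fn, number=5, repeat=3)) / 5
    print(f"{fn.__name__}: {best * 1e3:.1f} ms per loop")
```

In both variants f is still called once per element from Python, so neither avoids the per-element call overhead.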

Recommended answer

Here are my results:

List comprehension:

In [33]: %timeit df["A"] = [x for x in df[0]]
10 loops, best of 3: 72.6 ms per loop

Plain column assignment:

In [34]: %timeit df["A"] = df[0]
The slowest run took 5.75 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 661 µs per loop

Using the .map() method:

In [35]: map_df = pd.Series(np.random.randint(0, 10**6, 100000))

In [36]: %timeit df["A"] = df[0].map(map_df)
10 loops, best of 3: 19.8 ms per loop
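Note that the .map() timing above passes a Series (map_df) rather than a function, so it measures a different operation from the map(lambda x: x) call in the question. The small sketch below illustrates the difference. The names values and lookup are made up for illustration, and the explanation in the comments reflects the documented behavior of Series.map rather than anything stated in the answer.

```python
import pandas as pd

values = pd.Series([2, 0, 1, 2])
lookup = pd.Series([10, 20, 30])  # index 0, 1, 2 maps to 10, 20, 30

# Passing a Series: each value is looked up in the index of `lookup`.
mapped_by_series = values.map(lookup)

# Passing a callable: the function is called once per element from Python.
mapped_by_function = values.map(lambda x: lookup.loc[x])

# Both give the same result here; only the mechanism differs.
assert mapped_by_series.tolist() == [30, 10, 20, 30]
assert mapped_by_function.tolist() == [30, 10, 20, 30]
```

If the transformation can be expressed as a lookup table or a whole-column operation like the plain assignment above, that is likely to be faster than calling a Python function per element.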
