Difference between map, applymap and apply methods in Pandas


Question

Can you tell me when to use these vectorization methods with basic examples?

I see that map is a Series method whereas the rest are DataFrame methods. I got confused about apply and applymap methods though. Why do we have two methods for applying a function to a DataFrame? Again, simple examples which illustrate the usage would be great!

Solution

Comparing map, applymap and apply: Context Matters

First major difference: DEFINITION

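  • map is defined on Series ONLY (from pandas 2.1 there is also a DataFrame.map, which is the renamed applymap)
  • applymap is defined on DataFrames ONLY (deprecated in pandas 2.1 in favour of DataFrame.map)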
  • apply is defined on BOTH

Second major difference: INPUT ARGUMENT

  • map accepts dicts, Series, or callable
  • applymap and apply accept callables only
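As a rough illustration of the input types listed above, the sketch below passes a dict, a Series and a callable to Series.map, and a callable to Series.apply. The sample data and variable names are made up for the example.

```python
import pandas as pd

s = pd.Series([1, 2, 3])

# map with a dict: each value is looked up as a key
by_dict = s.map({1: "a", 2: "b", 3: "c"})

# map with a Series: each value is looked up in the mapper's index
by_series = s.map(pd.Series(["a", "b", "c"], index=[1, 2, 3]))

# map with a callable: the function is called once per element
by_func = s.map(lambda x: x * 10)

# apply with a callable gives the same elementwise result here
by_apply = s.apply(lambda x: x * 10)

print(by_dict.tolist())    # ['a', 'b', 'c']
print(by_series.tolist())  # ['a', 'b', 'c']
print(by_func.tolist())    # [10, 20, 30]
print(by_apply.tolist())   # [10, 20, 30]
```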

Third major difference: BEHAVIOR

  • map is elementwise for Series
  • applymap is elementwise for DataFrames
  • apply also works elementwise on a Series, but is suited to more complex operations and aggregation; on a DataFrame it passes each column (or row) to the function. The behaviour and return value depend on the function.
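A minimal sketch of the three behaviours, using a made-up DataFrame. One assumption to note: since pandas 2.1 the elementwise DataFrame method is called DataFrame.map and applymap is deprecated, so the example picks whichever name the installed version provides.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})

# Series.map: the function sees one element at a time
doubled = df["A"].map(lambda x: x * 2)

# Elementwise over a whole DataFrame: applymap, or DataFrame.map on pandas >= 2.1
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap
squared = elementwise(lambda x: x ** 2)

# DataFrame.apply: the function sees a whole column (axis=0) or row (axis=1)
col_range = df.apply(lambda col: col.max() - col.min())
row_sum = df.apply(lambda row: row["A"] + row["B"], axis=1)

print(doubled.tolist())     # [2, 4, 6]
print(squared["B"].tolist())  # [100, 400, 900]
print(col_range.to_dict())  # {'A': 2, 'B': 20}
print(row_sum.tolist())     # [11, 22, 33]
```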

Fourth major difference (the most important one): USE CASE

  • map is meant for mapping values from one domain to another, so is optimised for performance (e.g., df['A'].map({1:'a', 2:'b', 3:'c'}))
  • applymap is good for elementwise transformations across multiple rows/columns (e.g., df[['A', 'B', 'C']].applymap(str.strip))
  • apply is for applying any function that cannot be vectorised (e.g., df['sentences'].apply(nltk.sent_tokenize)).
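The three use cases above might look like this on a small, invented DataFrame. The split_sentences helper is a hypothetical stand-in for nltk.sent_tokenize so that the example runs without NLTK; it is not how NLTK tokenises text. As before, the elementwise call falls back from DataFrame.map to applymap depending on the pandas version.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [" x ", "y ", " z"],
    "C": ["p ", " q", " r "],
    "sentences": ["Hi. Bye.", "One.", "A. B. C."],
})

# map: translate values from one domain to another
letters = df["A"].map({1: "a", 2: "b", 3: "c"})

# applymap (DataFrame.map on pandas >= 2.1): same transformation on every cell
subset = df[["B", "C"]]
elementwise = subset.map if hasattr(pd.DataFrame, "map") else subset.applymap
stripped = elementwise(str.strip)


def split_sentences(text):
    """Hypothetical stand-in for nltk.sent_tokenize: naive split on full stops."""
    return [part.strip() + "." for part in text.split(".") if part.strip()]


# apply: arbitrary Python function that has no vectorised equivalent
tokens = df["sentences"].apply(split_sentences)

print(letters.tolist())        # ['a', 'b', 'c']
print(stripped["B"].tolist())  # ['x', 'y', 'z']
print(tokens.tolist())         # [['Hi.', 'Bye.'], ['One.'], ['A.', 'B.', 'C.']]
```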

Also see "When should I (not) want to use pandas apply() in my code?" for a writeup I made a while back on the most appropriate scenarios for using apply (note that there aren't many, but there are a few; apply is generally slow).


Summarising

Footnotes

  1. map when passed a dictionary/Series will map elements based on the keys in that dictionary/Series. Missing values will be recorded as NaN in the output.

  2. applymap in more recent versions has been optimised for some operations. You will find applymap slightly faster than apply in some cases. My suggestion is to test them both and use whatever works better.

  3. map is optimised for elementwise mappings and transformation. Operations that involve dictionaries or Series will enable pandas to use faster code paths for better performance.

  4. Series.apply returns a scalar for aggregating operations, Series otherwise. Similarly for DataFrame.apply. Note that apply also has fastpaths when called with certain NumPy functions such as mean, sum, etc.
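Footnotes 1 and 4 can be checked with a short sketch. The data is invented, and only the DataFrame.apply half of footnote 4 is shown: an aggregating function yields a Series with one value per column, while a non-aggregating one yields a DataFrame.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])

# Footnote 1: 4 has no key in the dict, so it becomes NaN
mapped = s.map({1: "a", 2: "b", 3: "c"})
print(mapped.isna().tolist())  # [False, False, False, True]

df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})

# Footnote 4: an aggregating function collapses each column to one value
totals = df.apply(lambda col: col.sum())
print(type(totals).__name__, totals.to_dict())  # Series {'A': 6, 'B': 60}

# A function that returns a same-length Series keeps the DataFrame shape
scaled = df.apply(lambda col: col * 2)
print(type(scaled).__name__, scaled.shape)  # DataFrame (3, 2)
```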
