loc function in pandas

Question

Can anybody explain why loc is used in Python pandas, with examples like the one shown below?

for i in range(0, 2):
  for j in range(0, 3):
    df.loc[(df.Age.isnull()) & (df.Gender == i) & (df.Pclass == j+1),
            'AgeFill'] = median_ages[i,j]

Recommended answer

Using .loc is recommended here because selecting rows with conditions such as df.Age.isnull(), df.Gender == i and df.Pclass == j+1 may return either a view of a slice of the DataFrame or a copy. When the result is a copy, assigning to it does not update the original DataFrame, and pandas cannot always tell which case applies.

If you don't use .loc, you end up selecting the column and applying the three conditions in separate indexing steps, which leads to a problem called chained indexing. With .loc, the row conditions and the column label are passed in a single indexing operation, so pandas knows the assignment targets the original DataFrame.
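
As a rough illustration of the one-step assignment, here is a minimal sketch. The toy DataFrame and the way median_ages is built are assumptions made up for this example; only the column names and the final .loc line follow the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names follow the question.
df = pd.DataFrame({
    "Age":    [22.0, np.nan, 30.0, np.nan, 40.0, 50.0],
    "Gender": [0, 0, 0, 1, 1, 1],
    "Pclass": [1, 1, 1, 2, 2, 2],
})

# Start AgeFill as a copy of Age, then fill only the missing entries.
df["AgeFill"] = df["Age"]

# median_ages[i, j] holds the median age for Gender == i and Pclass == j + 1.
median_ages = np.zeros((2, 3))
for i in range(2):
    for j in range(3):
        group = df.loc[(df["Gender"] == i) & (df["Pclass"] == j + 1), "Age"].dropna()
        if not group.empty:
            median_ages[i, j] = group.median()

for i in range(2):
    for j in range(3):
        mask = df["Age"].isnull() & (df["Gender"] == i) & (df["Pclass"] == j + 1)
        # One indexing operation: rows chosen by mask, column chosen by label.
        df.loc[mask, "AgeFill"] = median_ages[i, j]
```

Because the row mask and the column label go into the same .loc call, the assignment writes directly into df rather than into an intermediate selection.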

The simple answer is that while you can often get away with not using .loc and simply typing (for example)

df['Age_fill'][(df.Age.isnull()) & (df.Gender == i) & (df.Pclass == j+1)] \
                                                          = median_ages[i,j]

you will typically get a SettingWithCopyWarning, and your code will be a little messier for it.

In my experience, .loc took me a while to get my head around, and updating my code was a bit annoying. But it is really simple and intuitive: df.loc[row_indexer, col_indexer].
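
To make the df.loc[row_indexer, col_indexer] form concrete, here is a small sketch. The DataFrame, its index labels, and the variable names are invented for illustration.

```python
import pandas as pd

# Hypothetical data with string row labels.
df = pd.DataFrame(
    {"Age": [22, 35, 58], "Pclass": [3, 1, 2]},
    index=["a", "b", "c"],
)

one_cell = df.loc["b", "Age"]          # single row label, single column label
label_slice = df.loc["a":"b", "Age"]   # label slices include both endpoints
older = df.loc[df["Age"] > 30, ["Age", "Pclass"]]  # boolean mask + column list

# Assignment uses the same two-part form.
df.loc[df["Age"] > 30, "Pclass"] = 0
```

The row indexer can be a label, a label slice, a list of labels, or a boolean mask; the column indexer accepts the same kinds of values.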

For more information, see the pandas documentation on Indexing and Selecting Data.
