Updating a pandas DataFrame row with a dictionary


Question

I've found a behavior in pandas DataFrames that I don't understand. Assigning a Series to a row works the way I expect:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(1, 10, (3, 3)), index=['one', 'one', 'two'], columns=['col1', 'col2', 'col3'])
new_data = pd.Series({'col1': 'new', 'col2': 'new', 'col3': 'new'})
df.iloc[0] = new_data
# resulting df looks like:

#       col1    col2    col3
#one    new     new     new
#one    9       6       1
#two    8       3       7

But if I assign a dictionary instead, I get this:

new_data = {'col1': 'new', 'col2': 'new', 'col3': 'new'}
df.iloc[0] = new_data
# resulting df looks like:
#         col1  col2    col3
#one      col2  col3    col1
#one      2     1       7
#two      5     8       6

Why is this happening? While writing up this question, I realized that df.iloc is most likely taking only the keys from new_data, which would also explain why the values in the row are out of order. But, again, why is this the case? If I create a DataFrame from a dictionary, the keys are treated as column names:

pd.DataFrame([new_data])

#    col1   col2    col3
#0  new     new     new

Why is that not the default behavior when assigning to a row with df.iloc?

Recommended answer

The difference comes from how a dictionary iterates versus how pandas treats a Series.

A pandas Series matches its index to the columns when it is assigned to a row, and to the DataFrame's index when it is assigned to a column. It then assigns each value to the slot with the matching label.
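The alignment described here can be sketched as follows. This is a minimal illustration rather than the asker's data: the frame, its labels `a`, `b`, `c`, and the values are made up, and it uses label-based `.loc` and plain column assignment.

```python
import pandas as pd

# Hypothetical frame with a unique index, for illustration only.
df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=['a', 'b', 'c'],
    columns=['col1', 'col2', 'col3'],
)

# Row assignment: the Series index is matched against the column labels,
# so the order in which the Series was built does not matter.
row = pd.Series({'col3': 30, 'col1': 10, 'col2': 20})
df.loc['a'] = row

# Column assignment: the Series index is matched against the row labels.
col = pd.Series({'c': 300, 'a': 100, 'b': 200})
df['col2'] = col

row_a = df.loc['a'].tolist()
col_2 = df['col2'].tolist()
```

In both assignments the values land by label, not by the order in which they were written.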

When the object is not a pandas object with an index to align on, pandas iterates through it. Iterating a dictionary yields its keys, which is why you see the dictionary keys in that row's slots. Dictionaries did not guarantee any ordering in Python versions before 3.7, which is why the keys appear shuffled in that row.
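To make the iteration point concrete, the sketch below shows what iterating a dictionary yields, followed by two ways to get the values into a row by column label. The frame and values are hypothetical. How a raw dict assigned directly to a row is handled may depend on the pandas version, so the sketch deliberately avoids that and converts the dict first.

```python
import pandas as pd

# Hypothetical frame and data, for illustration only.
df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    index=['a', 'b'],
    columns=['col1', 'col2', 'col3'],
)
new_data = {'col3': 30, 'col1': 10, 'col2': 20}

# Iterating a dict yields its keys, not its values.
iterated = list(new_data)

# Option 1: wrap the dict in a Series so pandas can align on column labels.
df.loc['a'] = pd.Series(new_data)

# Option 2: build a plain list in column order; lists are assigned by position.
df.loc['b'] = [new_data[c] for c in df.columns]

row_a = df.loc['a'].tolist()
row_b = df.loc['b'].tolist()
```

Either option makes the label matching explicit instead of relying on how pandas iterates the object it is given.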
