Pandas update multiple columns at once

Views: 92 | Published: 2020/5/24 1:50:13 | Tags: python, pandas, dataframe


Question

I'm trying to update a couple of fields at once. I have two data sources and I'm trying to reconcile them. I know I could do some ugly merging and then delete columns, but I was expecting the code below to work:

import numpy as np
import pandas as pd

df = pd.DataFrame([['A','B','C',np.nan,np.nan,np.nan],
                   ['D','E','F',np.nan,np.nan,np.nan],
                   [np.nan,np.nan,np.nan,'a','b','d'],
                   [np.nan,np.nan,np.nan,'d','e','f']],
                  columns=['Col1','Col2','Col3','col1_v2','col2_v2','col3_v2'])

print(df)

  Col1 Col2 Col3 col1_v2 col2_v2 col3_v2
0    A    B    C     NaN     NaN     NaN
1    D    E    F     NaN     NaN     NaN
2  NaN  NaN  NaN       a       b       d
3  NaN  NaN  NaN       d       e       f

# update: fill Col1..Col3 from the *_v2 columns where Col1 is missing
df.loc[df['Col1'].isnull(),['Col1','Col2', 'Col3']] = df[['col1_v2','col2_v2','col3_v2']]

print(df)

  Col1 Col2 Col3 col1_v2 col2_v2 col3_v2
0    A    B    C     NaN     NaN     NaN
1    D    E    F     NaN     NaN     NaN
2  NaN  NaN  NaN       a       b       d
3  NaN  NaN  NaN       d       e       f

The output I want is:

  Col1 Col2 Col3 col1_v2 col2_v2 col3_v2
0    A    B    C     NaN     NaN     NaN
1    D    E    F     NaN     NaN     NaN
2    a    b    d       a       b       d
3    d    e    f       d       e       f

I'm betting it has to do with updating/setting on a slice, but I always use .loc to update values, just not on multiple columns at once.

I feel like there's an easy way to do this that I'm just missing. Any thoughts or suggestions would be welcome!

Edit (to reflect the solution below): Thanks for the comment on the indexes. However, I have a question about this as it relates to Series. If I wanted to update an individual Series in a similar manner, I could do something like this:

df.loc[df['Col1'].isnull(),['Col1']] = df['col1_v2']

print(df)

  Col1 Col2 Col3 col1_v2 col2_v2 col3_v2
0    A    B    C     NaN     NaN     NaN
1    D    E    F     NaN     NaN     NaN
2    a  NaN  NaN       a       b       d
3    d  NaN  NaN       d       e       f

Note that I didn't account for the indexes here: I filtered down to a 2x1 selection and set it equal to a 4x1 Series, yet it was handled correctly. Thoughts? I'm trying to better understand something I've used for a while, but I guess I don't have a full grasp of the underlying mechanism/rule.
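A note on the single-column case above: a Series carries only a row index, so there are no column labels that could fail to match. The following is a minimal sketch, assuming a cut-down version of the question's df (the names mask and shuffled are illustrative, not from the original post). It suggests that the right-hand side is matched by row label rather than by position, which would explain why a 4-row Series can be assigned to a 2-row selection.

```python
import numpy as np
import pandas as pd

# Cut-down version of the question's frame (assumption: two columns suffice).
df = pd.DataFrame({
    'Col1': ['A', 'D', np.nan, np.nan],
    'col1_v2': [np.nan, np.nan, 'a', 'd'],
})

mask = df['Col1'].isnull()           # True for rows 2 and 3

# Reverse the right-hand side so positions no longer line up with df.
shuffled = df['col1_v2'].iloc[::-1]  # index is now 3, 2, 1, 0

# Assignment matches on row labels, so rows 2 and 3 still receive 'a' and 'd'.
df.loc[mask, 'Col1'] = shuffled

print(df)
```

Only the labels selected on the left (2 and 3) are looked up in the Series; the other rows of the right-hand side are ignored.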

Recommended answer

You want to replace this:

print(df.loc[df['Col1'].isnull(),['Col1','Col2', 'Col3']])

  Col1 Col2 Col3
2  NaN  NaN  NaN
3  NaN  NaN  NaN

with this:

replace_with_this = df.loc[df['Col1'].isnull(),['col1_v2','col2_v2', 'col3_v2']]
print(replace_with_this)

  col1_v2 col2_v2 col3_v2
2       a       b       d
3       d       e       f

That seems reasonable. However, when you do the assignment, you need to account for index alignment, which includes the columns. The column names on the two sides differ, so nothing lines up.
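The alignment described above can be sketched as follows. This is an illustration under assumptions, not part of the original answer: the names mask, target_cols, source_cols, misaligned and aligned are hypothetical, and renaming the source columns is an alternative way to make the labels match, swapped in here to show the mechanism.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [['A', 'B', 'C', np.nan, np.nan, np.nan],
     ['D', 'E', 'F', np.nan, np.nan, np.nan],
     [np.nan, np.nan, np.nan, 'a', 'b', 'd'],
     [np.nan, np.nan, np.nan, 'd', 'e', 'f']],
    columns=['Col1', 'Col2', 'Col3', 'col1_v2', 'col2_v2', 'col3_v2'])

mask = df['Col1'].isnull()
target_cols = ['Col1', 'Col2', 'Col3']
source_cols = ['col1_v2', 'col2_v2', 'col3_v2']

replace_with_this = df.loc[mask, source_cols]

# Looking up the target labels in the source frame finds no matching columns,
# so every cell comes back as NaN. This mirrors why the original assignment
# appeared to do nothing.
misaligned = replace_with_this.reindex(columns=target_cols)
print(misaligned)

# Renaming the source columns to the target labels lets both axes line up.
aligned = replace_with_this.rename(columns=dict(zip(source_cols, target_cols)))
df.loc[mask, target_cols] = aligned

print(df)
```

Renaming keeps the label-based safety of pandas: if the rows or columns were in a different order, they would still be matched correctly.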

So this should work:

df.loc[df['Col1'].isnull(),['Col1','Col2', 'Col3']] = replace_with_this.values

print(df)

  Col1 Col2 Col3 col1_v2 col2_v2 col3_v2
0    A    B    C     NaN     NaN     NaN
1    D    E    F     NaN     NaN     NaN
2    a    b    d       a       b       d
3    d    e    f       d       e       f

I accounted for the columns by using .values at the end. This strips the column information from the replace_with_this dataframe, so the values are simply placed in the corresponding positions.
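For newer pandas versions, a sketch of the same idea using DataFrame.to_numpy(), which the pandas documentation recommends over .values. The variable names here (mask, target_cols, source_cols, values_only) are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [['A', 'B', 'C', np.nan, np.nan, np.nan],
     ['D', 'E', 'F', np.nan, np.nan, np.nan],
     [np.nan, np.nan, np.nan, 'a', 'b', 'd'],
     [np.nan, np.nan, np.nan, 'd', 'e', 'f']],
    columns=['Col1', 'Col2', 'Col3', 'col1_v2', 'col2_v2', 'col3_v2'])

mask = df['Col1'].isnull()
target_cols = ['Col1', 'Col2', 'Col3']
source_cols = ['col1_v2', 'col2_v2', 'col3_v2']

# A plain NumPy array has no row or column labels, so nothing is aligned.
values_only = df.loc[mask, source_cols].to_numpy()
print(values_only.shape)  # (2, 3): two masked rows, three columns

# The array is written by position into the selected block.
df.loc[mask, target_cols] = values_only

print(df)
```

Because the assignment is positional, the array's shape has to match the selection, and the order of source_cols has to correspond to the order of target_cols.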
