Pandas overwrite values in column selectively based on condition from another column


Question

I have a dataframe in pandas with four columns. The data consists of strings. Sample:

          A                  B                C      D
0         2          asicdsada          v:cVccv      u
1         4     ascccaiiidncll     v:cVccv:ccvc      u
2         9                sca              V:c      u
3        11               lkss             v:cv      u
4        13              lcoao            v:ccv      u
5        14           wuduakkk         V:ccvcv:      u

I want to replace the string 'u' in Col D with the string 'a' if Col C in that row contains the substring 'V' (case sensitive). Desired outcome:

          A                  B                C      D
0         2          asicdsada          v:cVccv      a
1         4     ascccaiiidncll     v:cVccv:ccvc      a
2         9                sca              V:c      a
3        11               lkss             v:cv      u
4        13              lcoao            v:ccv      u
5        14           wuduakkk         V:ccvcv:      a

I prefer to overwrite the value already in Column D, rather than assign two different values, because I'd like to selectively overwrite some of these values again later, under different conditions.

It seems like this should have a simple solution, but I cannot figure it out, and haven't been able to find a fully applicable solution in other answered questions.

df.ix[1]["D"] = "a"

changes an individual value.

df.ix[:]["C"].str.contains("V")

returns a series of booleans, but I am not sure what to do with it. I have tried many many combinations of .loc, apply, contains, re.search, and for loops, and I get either errors or replace every value in column D. I'm a novice with pandas/python so it's hard to know whether my syntax, methods, or conceptualization of what I even need to do are off (probably all of the above).

Recommended answer

As you've already tried, use str.contains to get a boolean Series, and then use .loc to say "change these rows and the D column". For example:

In [5]: df.loc[df["C"].str.contains("V"), "D"] = "a"

In [6]: df
Out[6]: 
    A               B             C  D
0   2       asicdsada       v:cVccv  a
1   4  ascccaiiidncll  v:cVccv:ccvc  a
2   9             sca           V:c  a
3  11            lkss          v:cv  u
4  13           lcoao         v:ccv  u
5  14        wuduakkk      V:ccvcv:  a
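
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The frame is rebuilt from the sample in the question. Passing regex=False and na=False is my own addition (plain substring match, and missing values in C count as "no match"), not part of the original answer. The second overwrite rule is purely hypothetical and is only there to show that a later pass with a different condition works the same way.

```python
import pandas as pd

# Sample data rebuilt from the question (all values kept as strings).
df = pd.DataFrame({
    "A": ["2", "4", "9", "11", "13", "14"],
    "B": ["asicdsada", "ascccaiiidncll", "sca", "lkss", "lcoao", "wuduakkk"],
    "C": ["v:cVccv", "v:cVccv:ccvc", "V:c", "v:cv", "v:ccv", "V:ccvcv:"],
    "D": ["u"] * 6,
})

# Boolean Series: True where C contains an uppercase "V".
# str.contains is case sensitive by default; regex=False treats "V" as a
# literal substring and na=False maps missing values to False (assumptions
# added for robustness, not required by the sample data).
has_upper_v = df["C"].str.contains("V", regex=False, na=False)

# Overwrite D only in the rows selected by the mask.
df.loc[has_upper_v, "D"] = "a"

# Hypothetical later pass: among rows already marked "a", overwrite again
# where C starts with "V". Combine masks with & and wrap each in parentheses.
second_pass = (df["D"] == "a") & df["C"].str.startswith("V")
df.loc[second_pass, "D"] = "b"

print(df)
```

Because each step only touches the rows selected by its mask, the values written by earlier passes survive wherever the later condition is False.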

(Avoid using .ix: it was deprecated in pandas 0.20 and removed entirely in pandas 1.0, so use .loc or .iloc instead.)
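
As a rough sketch of what the two .ix snippets from the question could look like without .ix: the single-value assignment can go through .loc (or .at, which only handles scalars), and the boolean Series comes straight from the column. The two-row frame below is a cut-down stand-in for the sample data, used only for illustration.

```python
import pandas as pd

# Cut-down stand-in for the question's frame (illustrative only).
df = pd.DataFrame({"C": ["v:cVccv", "v:cv"], "D": ["u", "u"]})

# Instead of df.ix[1]["D"] = "a": select row label and column in one call,
# so the assignment goes to the original frame rather than to a temporary
# copy produced by chained indexing.
df.loc[1, "D"] = "a"

# .at is the scalar-only counterpart of .loc.
df.at[0, "D"] = "x"

# Instead of df.ix[:]["C"].str.contains("V"): index the column directly.
mask = df["C"].str.contains("V")

print(df)
print(mask)
```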
