Pandas: overwrite values in a column selectively based on a condition from another column

Problem description

I have a dataframe in pandas with four columns. The data consists of strings. Sample:

          A                  B                C      D
0         2          asicdsada          v:cVccv      u
1         4     ascccaiiidncll     v:cVccv:ccvc      u
2         9                sca              V:c      u
3        11               lkss             v:cv      u
4        13              lcoao            v:ccv      u
5        14           wuduakkk         V:ccvcv:      u

I want to replace the string 'u' in Col D with the string 'a' if Col C in that row contains the substring 'V' (case sensitive). Desired outcome:

          A                  B                C      D
0         2          asicdsada          v:cVccv      a
1         4     ascccaiiidncll     v:cVccv:ccvc      a
2         9                sca              V:c      a
3        11               lkss             v:cv      u
4        13              lcoao            v:ccv      u
5        14           wuduakkk         V:ccvcv:      a

I prefer to overwrite the value already in Column D, rather than assign two different values, because I'd like to selectively overwrite some of these values again later, under different conditions.

It seems like this should have a simple solution, but I cannot figure it out, and haven't been able to find a fully applicable solution in other answered questions.

df.ix[1]["D"] = "a"

changes a single value.

df.ix[:]["C"].str.contains("V")

returns a Series of booleans, but I am not sure what to do with it. I have tried many combinations of .loc, apply, contains, re.search, and for loops, and I either get errors or end up replacing every value in column D. I'm a novice with pandas/python, so it's hard to know whether my syntax, my methods, or my conceptualization of what I even need to do is off (probably all of the above).

Recommended answer

As you've already tried, use str.contains to get a boolean Series, and then use .loc to say "change these rows and the D column". For example:

In [5]: df.loc[df["C"].str.contains("V"), "D"] = "a"

In [6]: df
Out[6]: 
    A               B             C  D
0   2       asicdsada       v:cVccv  a
1   4  ascccaiiidncll  v:cVccv:ccvc  a
2   9             sca           V:c  a
3  11            lkss          v:cv  u
4  13           lcoao         v:ccv  u
5  14        wuduakkk      V:ccvcv:  a
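
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample in the question (column A is assumed to hold integers here), and the second overwrite condition on column B is made up purely to illustrate the "overwrite again later" requirement; it is not part of the original question. The `regex=False` and `na=False` arguments are optional extras: the first treats "V" as a literal substring, the second makes missing values count as "no match".

```python
import pandas as pd

# Sample data rebuilt from the question (A assumed to be integers).
df = pd.DataFrame({
    "A": [2, 4, 9, 11, 13, 14],
    "B": ["asicdsada", "ascccaiiidncll", "sca", "lkss", "lcoao", "wuduakkk"],
    "C": ["v:cVccv", "v:cVccv:ccvc", "V:c", "v:cv", "v:ccv", "V:ccvcv:"],
    "D": ["u"] * 6,
})

# Boolean mask: True where column C contains an uppercase "V".
# str.contains is case sensitive by default.
mask = df["C"].str.contains("V", regex=False, na=False)

# Overwrite D only in the rows selected by the mask.
df.loc[mask, "D"] = "a"

# A later, different condition can overwrite D again in the same way.
# (Hypothetical condition, for illustration only.)
df.loc[df["B"].str.len() > 10, "D"] = "b"

print(df)
```

Because each assignment only touches the rows where its mask is True, values written by an earlier step survive unless a later condition also matches that row.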

(Avoid using .ix: it was deprecated in pandas 0.20 and removed in pandas 1.0.)
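
As a rough sketch of how the two .ix expressions from the question could be written with .loc instead (the three-row frame below is a cut-down stand-in for the real data, not the asker's actual DataFrame):

```python
import pandas as pd

# Cut-down stand-in for the question's data.
df = pd.DataFrame({"C": ["v:cVccv", "v:cv", "V:c"], "D": ["u", "u", "u"]})

# Single value: pass the row label and column label to .loc in one step.
# Chained indexing such as df.loc[1]["D"] = "a" may write to a temporary
# copy and leave df unchanged, so it is best avoided.
df.loc[1, "D"] = "a"

# Boolean Series: index the column directly; no .ix[:] is needed.
has_v = df["C"].str.contains("V")

print(df)
print(has_v)
```

Doing the row and column selection in a single .loc call is what lets the assignment reach the original DataFrame.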
