Change one value based on another value in pandas

Question

I'm trying to reprogram my Stata code into Python for speed improvements, and I was pointed in the direction of pandas. I am, however, having a hard time wrapping my head around how to process the data.

Let's say I want to iterate over all values in the column 'ID'. If an ID matches a specific number, I want to change the two corresponding values, FirstName and LastName.

In Stata it looks like this:

replace FirstName = "Matt" if ID==103
replace LastName =  "Jones" if ID==103

So this sets FirstName to Matt and LastName to Jones on every row where ID == 103.

In pandas, I'm trying something like this:

import pandas

df = pandas.read_csv("test.csv")
for i in df['ID']:
    if i == 103:
        ...

Not sure where to go from here. Any ideas?

Recommended answer

One option is to use pandas' boolean indexing to find the rows where your condition holds and overwrite the data there.

Assuming you can load your data directly into pandas with pandas.read_csv, the following code might be helpful for you.

import pandas
df = pandas.read_csv("test.csv")
df.loc[df.ID == 103, 'FirstName'] = "Matt"
df.loc[df.ID == 103, 'LastName'] = "Jones"
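
To try the same pattern without a CSV file, here is a minimal sketch on a small in-memory DataFrame. The sample rows and the name `mask` are assumptions made for illustration; they do not come from the original question.

```python
import pandas as pd

# Hypothetical sample data standing in for test.csv
df = pd.DataFrame({
    "ID": [101, 103, 104, 103],
    "FirstName": ["Ann", "Bob", "Cara", "Dan"],
    "LastName": ["Lee", "Ray", "Poe", "Fox"],
})

# Boolean Series that is True on the rows where ID equals 103
mask = df["ID"] == 103

# .loc[row_selector, column_label] assigns only to the selected rows
df.loc[mask, "FirstName"] = "Matt"
df.loc[mask, "LastName"] = "Jones"

print(df)
```

Storing the condition in a variable avoids evaluating it twice and makes it easy to inspect which rows will change before overwriting them.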

As mentioned in the comments, you can also do the assignment to both columns in one shot:

df.loc[df.ID == 103, ['FirstName', 'LastName']] = 'Matt', 'Jones'
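
A self-contained sketch of the one-shot form, again on hypothetical sample data. It uses a list on the right-hand side instead of a bare tuple; the intent is the same, with one value per listed column applied to every matching row.

```python
import pandas as pd

# Hypothetical sample data standing in for test.csv
df = pd.DataFrame({
    "ID": [101, 103, 104, 103],
    "FirstName": ["Ann", "Bob", "Cara", "Dan"],
    "LastName": ["Lee", "Ray", "Poe", "Fox"],
})

# One value per target column, in the same order as the column list
df.loc[df["ID"] == 103, ["FirstName", "LastName"]] = ["Matt", "Jones"]

print(df)
```

The number of values on the right must match the number of columns selected.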

Note that you'll need pandas version 0.11 or newer to use loc for overwrite assignment operations.

Another way to do it is to use what is called chained assignment. Its behavior is less stable, because the first indexing step may return a copy and the assignment may then never reach df, so it is not considered the best solution (it is explicitly discouraged in the docs). It is still useful to know about:

import pandas
df = pandas.read_csv("test.csv")
df['FirstName'][df.ID == 103] = "Matt"
df['LastName'][df.ID == 103] = "Jones"
