Replacing values greater than a number in pandas dataframe

Views: 54 Published: 2020/5/24 2:06:33 Tags: python database pandas


Problem description

I have a large dataframe that looks like this:

df1['A'].iloc[1:3]
2017-01-01 02:00:00    [33, 34, 39]
2017-01-01 03:00:00    [3, 43, 9]

I want to replace each element greater than 9 with 11.

So the desired output for the example above is:

df1['A'].iloc[1:3]
2017-01-01 02:00:00    [11, 11, 11]
2017-01-01 03:00:00    [3, 11, 9]

My actual dataframe has about 20,000 rows, and each row holds a list of 2,000 elements.

Is there a way to use the numpy.minimum function on each row? I assume it would be faster than a list comprehension.

Recommended answer

You can use apply with a list comprehension:

df1['A'] = df1['A'].apply(lambda x: [y if y <= 9 else 11 for y in x])
print(df1)
                                A
2017-01-01 02:00:00  [11, 11, 11]
2017-01-01 03:00:00    [3, 11, 9]
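
To try the list comprehension approach end to end, here is a minimal self-contained sketch. The way the sample frame is built and the helper name `replace_above` are assumptions made for illustration; only the two sample rows and the 9/11 values come from the question.

```python
import pandas as pd

# Sample frame rebuilt from the two rows shown in the question (assumed construction).
idx = pd.to_datetime(["2017-01-01 02:00:00", "2017-01-01 03:00:00"])
df1 = pd.DataFrame({"A": [[33, 34, 39], [3, 43, 9]]}, index=idx)


def replace_above(values, threshold=9, new_value=11):
    """Return a new list where every element greater than threshold is replaced."""
    # Hypothetical helper; equivalent to the lambda used in the answer.
    return [new_value if v > threshold else v for v in values]


# apply calls the helper once per row, so each cell stays a Python list.
df1["A"] = df1["A"].apply(replace_above)
result = df1["A"].tolist()
print(result)
```

This version also works when the rows hold lists of different lengths, since each list is processed on its own.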

A faster solution is to convert the column to a numpy array first and then use numpy.where:

import numpy as np

# Stack the per-row lists into one 2-D array (requires equal-length lists)
a = np.array(df1['A'].values.tolist())
print(a)
[[33 34 39]
 [ 3 43  9]]

df1['A'] = np.where(a > 9, 11, a).tolist()
print(df1)
                                A
2017-01-01 02:00:00  [11, 11, 11]
2017-01-01 03:00:00    [3, 11, 9]
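
The question asked about numpy.minimum specifically, so the sketch below puts the two calls side by side on the sample data. It assumes every row holds a list of the same length (otherwise the lists cannot be stacked into a regular 2-D array), and the names `replaced` and `capped` are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the two rows shown in the question (assumed construction).
idx = pd.to_datetime(["2017-01-01 02:00:00", "2017-01-01 03:00:00"])
df1 = pd.DataFrame({"A": [[33, 34, 39], [3, 43, 9]]}, index=idx)

# One 2-D array for the whole column; valid only for equal-length lists.
a = np.array(df1["A"].tolist())

# numpy.where picks 11 where the condition holds and the original value elsewhere.
replaced = np.where(a > 9, 11, a)

# numpy.minimum caps at the threshold itself, so values above 9 become 9, not 11.
capped = np.minimum(a, 9)

# Write the result back as one list per row, keeping the original index.
df1["A"] = pd.Series(replaced.tolist(), index=df1.index)
print(df1)
```

numpy.minimum would fit if the replacement value were the same as the threshold; because the question replaces values above 9 with 11, numpy.where is the closer match.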
