pandas: filter rows by two column values, case-insensitively

Views: 44  Posted: 2021/6/13 20:16:52  Tags: python pandas


Problem description

I have a simple dataframe as follows:

   Last Known Date  ConfigredValue  ReferenceValue
0  24-Jun-17        False           FALSE
1  25-Jun-17        FALSE           FALSE
2  26-Jun-17        TRUE            FALSE
3  27-Jun-17        FALSE           FALSE
4  28-Jun-17        false           FALSE

If I run the following command:

df=df[df['ConfigredValue']!=df['ReferenceValue']]

then I get the following:

0   24-Jun-17   False   FALSE
2   26-Jun-17   TRUE    FALSE
4   28-Jun-17   false   FALSE

But I want the filter to be case-insensitive (case=False).

I want the following output:

2   26-Jun-17   TRUE    FALSE

Please suggest how to filter the data case-insensitively (case=False).

Recommended answer

Option 1: Convert to lowercase or uppercase and compare

The simplest approach is to convert both columns to lowercase (or uppercase) before checking for equality:

df=df[df['ConfigredValue'].str.lower()!=df['ReferenceValue'].str.lower()]

or

df=df[df['ConfigredValue'].str.upper()!=df['ReferenceValue'].str.upper()]

Output:

Out: 
  Last Known Date ConfigredValue ReferenceValue
2    2  26-Jun-17           TRUE          FALSE
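To make Option 1 easy to reproduce, here is a minimal self-contained sketch. The frame is rebuilt by hand from the question's sample rows, on the assumption that both value columns hold strings; the names `mask` and `result` are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question; values are assumed to be strings.
df = pd.DataFrame({
    "Last Known Date": ["24-Jun-17", "25-Jun-17", "26-Jun-17", "27-Jun-17", "28-Jun-17"],
    "ConfigredValue": ["False", "FALSE", "TRUE", "FALSE", "false"],
    "ReferenceValue": ["FALSE", "FALSE", "FALSE", "FALSE", "FALSE"],
})

# Normalize both columns to lowercase, then keep the rows where they differ.
mask = df["ConfigredValue"].str.lower() != df["ReferenceValue"].str.lower()
result = df[mask]
print(result)
```

Only the row with index 2 (TRUE vs FALSE) should remain, since the other rows differ only in letter case.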


Option 2: Compare the lengths

In this particular case, you can simply compare the string lengths: TRUE and True have the same length whether the string is upper or lower case, while TRUE (4 characters) and FALSE (5 characters) differ. This works only because the two possible values have different lengths:

df[df['ConfigredValue'].str.len()!=df['ReferenceValue'].str.len()]

Output:

Out: 
  Last Known Date ConfigredValue ReferenceValue
2    2  26-Jun-17           TRUE          FALSE
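The length comparison depends on the possible values having different lengths. The sketch below uses made-up values that are not from the question (`a` and `b` are hypothetical series) to show a case where it would miss a real difference that the lowercase comparison catches.

```python
import pandas as pd

# Hypothetical values (not from the question): "yes" and "nop" differ,
# but both are three characters long.
a = pd.Series(["TRUE", "yes", "False"])
b = pd.Series(["FALSE", "nop", "FALSE"])

by_length = a.str.len() != b.str.len()
by_lower = a.str.lower() != b.str.lower()

print(by_length.tolist())  # [True, False, False]
print(by_lower.tolist())   # [True, True, False]
```

So the length trick is probably best kept for columns known to contain only true/false strings.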


Option 3: Vectorized title

str.title() was also suggested in @0p3n5ourcE's answer; here is a vectorized version of it:

df[df['ConfigredValue'].str.title()!=df['ReferenceValue'].str.title()]
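As a quick sanity check, the sketch below applies the title-case comparison to the question's sample values (rebuilt by hand and assumed to be strings) and confirms that it selects the same rows as the lowercase comparison. The series names `configured` and `reference` are illustrative only.

```python
import pandas as pd

# Sample values rebuilt from the question; assumed to be strings.
configured = pd.Series(["False", "FALSE", "TRUE", "FALSE", "false"])
reference = pd.Series(["FALSE"] * 5)

# Title-casing maps "False", "FALSE" and "false" all to "False".
title_mask = configured.str.title() != reference.str.title()
lower_mask = configured.str.lower() != reference.str.lower()

print(configured.str.title().tolist())
print(title_mask.equals(lower_mask))
```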


Execution time

Benchmarking shows that str.len() is slightly faster:

In [35]: timeit df[df['ConfigredValue'].str.lower()!=df['ReferenceValue'].str.lower()]
1000 loops, best of 3: 496 µs per loop

In [36]: timeit df[df['ConfigredValue'].str.upper()!=df['ReferenceValue'].str.upper()]
1000 loops, best of 3: 496 µs per loop

In [37]: timeit df[df['ConfigredValue'].str.title()!=df['ReferenceValue'].str.title()]
1000 loops, best of 3: 495 µs per loop

In [38]: timeit df[df['ConfigredValue'].str.len()!=df['ReferenceValue'].str.len()]
1000 loops, best of 3: 479 µs per loop
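The timings above appear to have been taken on the small sample frame, where fixed overhead probably dominates. A rough way to repeat the comparison on a larger frame is sketched below. The row count, the repeat counts and the helper name `bench` are arbitrary assumptions, and the numbers will vary by machine and pandas version.

```python
import timeit
import pandas as pd

n = 2_000  # arbitrary multiplier; the frame has 5 * n rows
df = pd.DataFrame({
    "ConfigredValue": ["False", "FALSE", "TRUE", "FALSE", "false"] * n,
    "ReferenceValue": ["FALSE"] * (5 * n),
})

def bench(method):
    """Return the best observed seconds per call for one .str method."""
    def run():
        left = getattr(df["ConfigredValue"].str, method)()
        right = getattr(df["ReferenceValue"].str, method)()
        return df[left != right]
    # Best of 3 repeats, each averaging over 3 calls.
    return min(timeit.repeat(run, number=3, repeat=3)) / 3

timings = {m: bench(m) for m in ("lower", "upper", "title", "len")}
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.2f} ms per call")
```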
