Replacing a few values in a column based on a list in Python

Views: 262 | Published: 2020/5/24 1:20:21 | Tags: python, pandas


Question

There is a well-explained topic on Stack Overflow: Replacing few values in a pandas dataframe column with another value

The example there is:

BrandName Specialty
A          H
B          I
ABC        J
D          K
AB         L

And the solution is:

df['BrandName'] = df['BrandName'].replace(['ABC', 'AB'], 'A')

The problem is that my dataframe is a little different: a single cell can contain two strings:

BrandName Specialty
A          H
B          I
ABC B      J
D          K
AB         L

The desired output is still:

BrandName Specialty
A          H
B          I
A B        J
D          K
A          L

How can I achieve this?

Recommended answer

Use regex=True for substring replacement:

df['BrandName'] = df['BrandName'].replace(['ABC', 'AB'], 'A', regex=True)
print (df)
  BrandName Specialty
0         A         H
1         B         I
2       A B         J
3         D         K
4         A         L
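To make the difference explicit, here is a minimal, self-contained sketch that rebuilds the question's dataframe and compares the two calls. The variable names `exact` and `substring` are only illustrative and do not come from the original answer.

```python
import pandas as pd

# Rebuild the dataframe from the question.
df = pd.DataFrame({
    "BrandName": ["A", "B", "ABC B", "D", "AB"],
    "Specialty": ["H", "I", "J", "K", "L"],
})

# Without regex=True, replace() only matches whole cell values,
# so "ABC B" is left untouched while "AB" becomes "A".
exact = df["BrandName"].replace(["ABC", "AB"], "A")

# With regex=True, each list entry is treated as a pattern and is
# matched anywhere inside the cell.
substring = df["BrandName"].replace(["ABC", "AB"], "A", regex=True)

print(exact.tolist())
print(substring.tolist())
```

The first call is the one from the linked topic; it works there only because every cell holds exactly one brand name.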

If you need to avoid replacing matches inside longer strings (for example, ABCD should not be replaced), a different solution is needed: add regex word boundaries:

print (df)
  BrandName Specialty
0    A ABCD         H
1         B         I
2     ABC B         J
3         D         K
4        AB         L


L = [r"\b{}\b".format(x) for x in ['ABC', 'AB']]

df['BrandName1'] = df['BrandName'].replace(L, 'A', regex=True)
df['BrandName2'] = df['BrandName'].replace(['ABC', 'AB'], 'A', regex=True)
print (df)
  BrandName Specialty BrandName1 BrandName2
0    A ABCD         H     A ABCD       A AD
1         B         I          B          B
2     ABC B         J        A B        A B
3         D         K          D          D
4        AB         L          A          A
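As a possible variation (not part of the original answer), the word-bounded patterns can be merged into one compiled alternation, with each word passed through `re.escape` in case it contains regex metacharacters. The sketch below uses only the standard `re` module on a plain list so the matching behaviour is easy to check; the names `words`, `values` and `cleaned` are illustrative.

```python
import re

words = ["AB", "ABC"]

# Escape each word, try longer words first, and wrap the whole
# alternation in word boundaries so "ABCD" is never touched.
alternation = "|".join(
    re.escape(w) for w in sorted(words, key=len, reverse=True)
)
pattern = re.compile(r"\b(?:" + alternation + r")\b")

values = ["A ABCD", "B", "ABC B", "D", "AB"]
cleaned = [pattern.sub("A", v) for v in values]
print(cleaned)
```

A compiled pattern like this can be handed to pandas the same way the asker's edit does, for example `df['BrandName'].replace(pattern, 'A', regex=True)`.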

Edit (from the asker):

To speed it up, you can have a look here: Speed up millions of regex replacements in Python 3

The best approach is the trie approach:

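import re

# Trie is the helper class from the linked answer (it is not part of the
# standard library); it must provide add(word) and pattern().
def trie_regex_from_words(words):
    trie = Trie()
    for word in words:
        trie.add(word)
    # One compiled, case-insensitive pattern wrapped in word boundaries
    return re.compile(r"\b" + trie.pattern() + r"\b", re.IGNORECASE)

# strings is the list of values to replace, e.g. ['ABC', 'AB']
union = trie_regex_from_words(strings)
df['BrandName'] = df['BrandName'].replace(union, 'A', regex=True)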
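The snippet above relies on a `Trie` class that is not shown. The following is a minimal sketch of what such a class could look like, written for illustration only: it is an assumption about the interface (`add` and `pattern`), not the code from the linked answer. It uses only the standard library and is demonstrated on a plain list.

```python
import re


class Trie:
    """Minimal trie that renders its words as one regex (illustrative sketch)."""

    def __init__(self):
        self.root = {}

    def add(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}  # an empty key marks "a word ends here"

    def _pattern(self, node):
        children = sorted(k for k in node if k != "")
        if not children:
            return ""
        parts = [re.escape(ch) + self._pattern(node[ch]) for ch in children]
        ends_here = "" in node
        if len(parts) == 1 and not ends_here:
            return parts[0]
        body = "(?:" + "|".join(parts) + ")"
        # If a word can end at this node, everything after it is optional.
        return body + "?" if ends_here else body

    def pattern(self):
        return self._pattern(self.root)


def trie_regex_from_words(words):
    trie = Trie()
    for word in words:
        trie.add(word)
    # re.IGNORECASE is kept from the snippet above; drop it for exact case.
    return re.compile(r"\b" + trie.pattern() + r"\b", re.IGNORECASE)


union = trie_regex_from_words(["ABC", "AB"])
values = ["A ABCD", "B", "ABC B", "D", "AB"]
cleaned = [union.sub("A", v) for v in values]
print(union.pattern)
print(cleaned)
```

Because shared prefixes are factored out, the regex engine never retries the same prefix for every word, which is where the speed-up on large word lists is expected to come from.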
