Replace all but the last occurrence of a character in a string with pandas

Views: 122 Posted: 2020/5/24 1:58:16 Tags: python regex string pandas

Problem description

I'm using pandas to remove all but the last period in a string, like so:

import pandas as pd

s = pd.Series(['1.234.5','123.5','2.345.6','678.9'])
counts = s.str.count(r'\.')
target = counts==2
target
0     True
1    False
2     True
3    False
dtype: bool

s = s[target].str.replace(r'\.', '', n=1, regex=True)
s
0    1234.5
2    2345.6
dtype: object

My desired output, however, is:

0    1234.5
1    123.5
2    2345.6
3    678.9
dtype: object

The replace call combined with the boolean mask target seems to drop the values that were not replaced, and I can't see how to remedy this.

Recommended answer

Regex-based with `str.replace`

This regex pattern with str.replace should do nicely.

# regex=True is needed on pandas 2.0+, where str.replace matches literally by default
s.str.replace(r'\.(?=.*?\.)', '', regex=True)

0    1234.5
1     123.5
2    2345.6
3     678.9
dtype: object

The idea is to replace a period only when another period appears somewhere after it. The last period has nothing of the kind following it, so it is left alone. Here's a breakdown of the regular expression used.

\.     # a literal '.'
(?=    # positive lookahead: must match ahead, but consumes nothing
.*?    # any characters (lazily)
\.     # followed by another '.'
)
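To see the lookahead in isolation, here is a minimal sketch that applies the same pattern with the standard-library `re` module on plain strings. The helper name `keep_last_period` and the two extra sample strings are illustrative assumptions, not part of the original answer.

```python
import re

# Same pattern as above: a '.' that still has another '.' somewhere after it.
PATTERN = re.compile(r'\.(?=.*?\.)')

def keep_last_period(text):  # hypothetical helper name
    # Every period except the final one satisfies the lookahead and is removed.
    return PATTERN.sub('', text)

samples = ['1.234.5', '123.5', '2.345.6', '678.9', '1.2.3.4', 'no periods']
cleaned = [keep_last_period(x) for x in samples]
print(cleaned)
```

Strings with one period or none pass through unchanged, because no period in them has a later period to satisfy the lookahead. Note that `.` does not match a newline by default, so the lookahead only sees the rest of the current line.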

Fun with `np.vectorize`

If you want to do this using count, it isn't impossible, but it is a challenge. You can make this easier with np.vectorize. First, define a function,

def foo(r, c):
    return r.replace('.', '', c)

Vectorize it,

import numpy as np

v = np.vectorize(foo)

Now, call the function v, passing s and the number of periods to replace in each string (one fewer than the total).

pd.Series(v(s, s.str.count(r'\.') - 1))

0    1234.5
1     123.5
2    2345.6
3     678.9
dtype: object
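One thing to watch with this call: np.vectorize returns a plain NumPy array, so wrapping it in pd.Series produces a fresh 0..n-1 index. The sketch below assumes a Series with non-default labels (the labels are illustrative, not from the original) and passes the original index back in.

```python
import numpy as np
import pandas as pd

# Illustrative labels; the original example used the default integer index.
s = pd.Series(['1.234.5', '123.5', '2.345.6', '678.9'],
              index=['a', 'b', 'c', 'd'])

def foo(r, c):
    # Remove the first c periods from r, leaving the last one in place.
    return r.replace('.', '', c)

v = np.vectorize(foo)

# Hand the original index back so the labels survive the round trip.
result = pd.Series(v(s, s.str.count(r'\.') - 1), index=s.index)
print(result)
```

Without `index=s.index` the values are the same, but the labels a to d would be replaced by 0 to 3.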

Keep in mind that this is basically a glorified loop: np.vectorize still calls foo once per element in Python, so it is a convenience rather than a speedup.

The pure-Python equivalent of vectorize would be,

r = []
for x, y in zip(s, s.str.count(r'\.') - 1):
    r.append(x.replace('.', '', y))

pd.Series(r)

0    1234.5
1     123.5
2    2345.6
3     678.9
dtype: object

Or, using a list comprehension:

pd.Series([x.replace('.', '', y) for x, y in zip(s, s.str.count(r'\.') - 1)])

0    1234.5
1     123.5
2    2345.6
3     678.9
dtype: object
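Returning to the mask-based attempt in the question: `s[target]` is boolean indexing, which keeps only the rows where the mask is True, so the other rows are gone before the replacement even runs. A minimal sketch of one way to keep every row, assuming the same exactly-two-periods rule as the question, is to compute the replacement for all rows and apply it only where the mask is True with `Series.mask`. The names `first_removed` and `result` are illustrative.

```python
import pandas as pd

s = pd.Series(['1.234.5', '123.5', '2.345.6', '678.9'])
target = s.str.count(r'\.') == 2  # rows with exactly two periods

# Remove the first literal '.' from every row ...
first_removed = s.str.replace('.', '', n=1, regex=False)

# ... but take the replaced value only where target is True;
# all other rows keep their original string.
result = s.mask(target, first_removed)
print(result)
```

Like the question's code, this only handles strings with exactly two periods. The lookahead pattern in the regex answer covers any number of periods.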
