Pythonic way of applying regex to all columns of a dataframe

Views: 197 Published: 2020/5/24 1:00:25 Tags: python regex pandas


Problem description

I have a dataframe containing keywords and values in all columns. See the example below.

I want to apply a regex to all the columns, so I use a for loop and apply the regex to each column in turn:

# Keep only the text after the ':' in each extdkey_<i> column, one column at a time
for i in range(1, maxExtended_Keywords):
    temp = 'extdkey_' + str(i)
    Extended_Keywords[temp] = Extended_Keywords[temp].str.extract(r":(.*)", expand=False)

This gives me the desired final result. No issues there.

However, I am just curious: is there a Pythonic way to apply a regex to the entire dataframe, instead of using a for loop and applying it column by column?

Thanks

Recommended answer

Use pandas.DataFrame.replace with regex=True:

df.replace(r'^.*:\s*(.*)', r'\1', regex=True)

Notice that my pattern uses parentheses to capture the part after the ':' (and any whitespace that follows it), and that the replacement uses a raw string r'\1' to reference that capture group. Cells that do not match the pattern, including NaN, are left unchanged.
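To see what the pattern and the backreference do on a single string, here is a minimal sketch using Python's built-in re module instead of pandas. The helper name strip_key and the sample strings are made up for illustration and are not part of the original answer.

```python
import re

# Same pattern as in the answer, written as a raw string.
PATTERN = r'^.*:\s*(.*)'

def strip_key(text):
    """Replace the whole match with capture group 1 (the text after the ':')."""
    return re.sub(PATTERN, r'\1', text)

# Hypothetical sample values, not taken from the original data.
simple = strip_key('thing1: hello')
nested = strip_key('a: b: c')          # greedy '.*' runs up to the last ':'
no_colon = strip_key('no key here')    # no ':' so nothing matches; unchanged

print(simple, nested, no_colon, sep=' | ')
```

Because the leading '.*' is greedy, a value that itself contains a ':' is cut at the last colon. If the first colon should be the separator, a non-greedy '^.*?:' would be one way to express that.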

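import numpy as np
import pandas as pd

# Small example frame: each column holds 'key: value' strings or NaN
df = pd.DataFrame([
    [np.nan, 'thing1: hello'],
    ['thing2: world', np.nan]
], columns=['extdkey1', 'extdkey2'])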

df

        extdkey1       extdkey2
0            NaN  thing1: hello
1  thing2: world            NaN

df.replace(r'^.*:\s*(.*)', r'\1', regex=True)

  extdkey1 extdkey2
0      NaN    hello
1    world      NaN
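The replace-based answer and the question's extract-based loop are not quite equivalent, which the following sketch tries to show. It assumes a small made-up frame in which one cell has no ':' at all, and it uses DataFrame.apply with Series.str.extract as one possible loop-free form of the original approach; neither the data nor the apply variant comes from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one cell ('no colon') has no key prefix at all.
df = pd.DataFrame({
    'extdkey1': [np.nan, 'thing2: world', 'no colon'],
    'extdkey2': ['thing1: hello', np.nan, 'thing3: again'],
})

# Whole-frame substitution, as in the answer: non-matching cells stay as they are.
replaced = df.replace(r'^.*:\s*(.*)', r'\1', regex=True)

# Column-wise extraction without an explicit loop: non-matching cells become NaN.
extracted = df.apply(lambda col: col.str.extract(r':\s*(.*)', expand=False))

print(replaced)
print(extracted)
```

Which behavior is preferable depends on whether cells without a key should be kept or blanked out. Note that the apply variant relies on every column holding strings, since the .str accessor is not available on numeric columns.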
