Remove double space/tab combinations using re in Python

Question

I want to replace every run of consecutive tabs and/or spaces in a Python string with a single space, using the re module. I do not want to remove newlines (which rules out \s). At the moment I have:

    formatted_string = re.sub("\t+" , " ", formatted_string)
    formatted_string = re.sub(" +" , " ", formatted_string)
    formatted_string = re.sub("\t " , " ", formatted_string)
    formatted_string = re.sub(" \t" , " ", formatted_string)

i.e. this first collapses consecutive tabs, then consecutive spaces, then tab/space pairs, then space/tab pairs. This normally seems to work, but it occasionally leaves behind a double space (which I guess means there is some unusual mix of tabs and spaces that the above does not fully remove).

Is there a simpler, more elegant way of achieving this?

[n.b. running Python 2.7]

Recommended answer

The regex below replaces each run of two or more consecutive tabs or spaces with a single space. Note that it won't convert a single tab into a space.

    formatted_string = re.sub("[\t ]{2,}", " ", formatted_string)
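To illustrate the note about single tabs, here is a minimal sketch comparing the answer's `{2,}` pattern with a `+` variant. The function names `collapse_runs` and `collapse_all` and the sample string are made up for this example, and the `+` variant is only an assumption about what might be wanted; it is not part of the original answer.

```python
import re

def collapse_runs(text):
    # Same pattern as the answer: only runs of two or more tabs/spaces are replaced.
    return re.sub(r"[\t ]{2,}", " ", text)

def collapse_all(text):
    # Hypothetical variant: "+" also turns a lone tab into a single space.
    return re.sub(r"[\t ]+", " ", text)

sample = "a \t b\tc\n\td  e"

print(repr(collapse_runs(sample)))  # 'a b\tc\n\td e'  (single tabs survive)
print(repr(collapse_all(sample)))   # 'a b c\n d e'    (single tabs become spaces)
```

In both cases the newline is left untouched, because the character class contains only the tab and the space rather than `\s`.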
