Regular Expression - old regex module vs. re module

Views: 71  Posted: 2019/6/6 15:06:56  Tag: python


Problem description


Hi All,

I'm having a tough time converting the following regex.compile patterns
into the new re.compile format. There is also a difference between
regsub.sub() and re.sub().

Could anyone lend a hand?
import regsub
import regex

import re # << need conversion to this module

.....

"""Convert perl style format symbology to printf tokens.

Take a string and substitute computed printf tokens for perl style
format symbology.

For example:

###.## yields %6.2f
######## yields %8d
<<<<< yields %-5s
"""
exponentPattern = regex.compile('\(^\|[^\\#]\)\(#+\.#+\*\*\*\*\)')
floatPattern = regex.compile('\(^\|[^\\#]\)\(#+\.#+\)')
integerPattern = regex.compile('\(^\|[^\\#]\)\(##+\)')
leftJustifiedStringPattern = regex.compile('\(^\|[^\\<]\)\(<<+\)')
rightJustifiedStringPattern = regex.compile('\(^\|[^\\>]\)\(>>+\)')

while 1: # process all integer fields
    print("Testing Integer")
    if integerPattern.search(s) < 0: break
    print("Integer Match : ", integerPattern.search(s).span() )
    # i1 , i2 = integerPattern.regs[2]
    i1 , i2 = integerPattern.search(s).span()
    width_total = i2 - i1
    f = '%'+`width_total`+'d'
    # s = regsub.sub(integerPattern, '\\1'+f, s)
    s = integerPattern.sub(f, s)

Thanks in advance!

Steve
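
For readers following along: the old `regex` module used Emacs-style syntax, where grouping and alternation are written `\(`, `\)` and `\|`, while `re` uses bare `(`, `)` and `|`. Its `search()` also returned an index (or -1), whereas `re` returns a match object (or `None`). Below is a minimal sketch of how the integer pattern above might be ported. It assumes the intent is to keep the character captured by group 1 and replace only the run of `#`. The helper name `convert_integers` is hypothetical and does not appear in the original post.

```python
import re

# Old Emacs-style pattern:  \(^\|[^\\#]\)\(##+\)
# In `re`, grouping and alternation are written without backslashes.
integer_pattern = re.compile(r'(^|[^\\#])(##+)')


def convert_integers(s):
    """Replace each run of two or more '#' with a %Nd token (hypothetical helper)."""
    def repl(match):
        # Keep the leading character (group 1); size the token from group 2.
        return '%s%%%dd' % (match.group(1), len(match.group(2)))
    # re.sub replaces every match unless count= is given, so no explicit
    # loop is needed; the old regsub.sub replaced only the first match.
    return integer_pattern.sub(repl, s)


print(convert_integers('total: ######## items'))  # prints: total: %8d items
```

Using a function as the replacement sidesteps the `'\\1' + f` string building in the original, since the callback's return value is inserted literally.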

Recommended answers

In article <11**********************@d56g2000cwd.googlegroups.com>,
Steve <st****@cruzio.com> wrote:

Hi All,

I'm having a tough time converting the following regex.compile patterns
into the new re.compile format. There is also a difference between
regsub.sub() and re.sub().

Could anyone lend a hand?
import regsub
import regex

import re # << need conversion to this module

....

"""Convert perl style format symbology to printf tokens.

Take a string and substitute computed printf tokens for perl style
format symbology.

For example:

###.## yields %6.2f
######## yields %8d
<<<<< yields %-5s
"""

Perhaps not optimal, but this processes things as requested. Note that
all floats have to be done before any integer patterns are replaced.

==========================
#!/usr/local/bin/python

import re

"""Convert perl style format symbology to printf tokens.
Take a string and substitute computed printf tokens for perl style
format symbology.

For example:

###.## yields %6.2f
######## yields %8d
<<<<< yields %-5s
"""
# handle cases where there's no integer or no fractional chars
floatPattern = re.compile(r'(?<!\\)(#+\.(#*)|\.(#+))')
integerPattern = re.compile(r'(?<![\\.])(#+)(?![.#])')
leftJustifiedStringPattern = re.compile(r'(?<!\\)(<+)')
rightJustifiedStringPattern = re.compile(r'(?<!\\)(>+)')

def float_sub(matchobj):
    # fractional part may be in either groups()[1] or groups()[2]
    if matchobj.groups()[1] is not None:
        return "%%%d.%df" % (len(matchobj.groups()[0]),
                             len(matchobj.groups()[1]))
    else:
        return "%%%d.%df" % (len(matchobj.groups()[0]),
                             len(matchobj.groups()[2]))

def unperl_format(s):
    changed_things = 1
    while changed_things:
        # lather, rinse and repeat until nothing new happens
        changed_things = 0

        mat_obj = leftJustifiedStringPattern.search(s)
        if mat_obj:
            # count=1 replaces only the match whose width was just measured
            s = re.sub(leftJustifiedStringPattern, "%%-%ds" %
                       len(mat_obj.groups()[0]), s, count=1)
            changed_things = 1

        mat_obj = rightJustifiedStringPattern.search(s)
        if mat_obj:
            s = re.sub(rightJustifiedStringPattern, "%%%ds" %
                       len(mat_obj.groups()[0]), s, count=1)
            changed_things = 1

        # must do all floats before ints
        mat_obj = floatPattern.search(s)
        if mat_obj:
            s = re.sub(floatPattern, float_sub, s, count=1)
            changed_things = 1
            # don't fall through to the int code
            continue

        mat_obj = integerPattern.search(s)
        if mat_obj:
            s = re.sub(integerPattern, "%%%dd" % len(mat_obj.groups()[0]),
                       s, count=1)
            changed_things = 1
    return s

if __name__ == '__main__':
    testarray = ["integer: ####, integer # integer at end #",
                 "float ####.## no decimals ###. no int .### at end ###.",
                 "Left string <<<<<< short left string <",
                 "right string >>>>>> short right string >",
                 "escaped chars \\#### \\####.## \\<\\<<<< \\>\\><<<"]
    for s in testarray:
        print("Testing: %s" % s)
        print("Result: %s" % unperl_format(s))
        print("")

======================

Running this gives

Testing: integer: ####, integer # integer at end #
Result: integer: %4d, integer %1d integer at end %1d

Testing: float ####.## no decimals ###. no int .### at end ###.
Result: float %7.2f no decimals %4.0f no int %4.3f at end %4.0f

Testing: Left string <<<<<< short left string <
Result: Left string %-6s short left string %-1s

Testing: right string >>>>>> short right string >
Result: right string %6s short right string %1s

Testing: escaped chars \#### \####.## \<\<<<< \>\><<<
Result: escaped chars \#%3d \#%6.2f \<\<%-3s \>\>%-3s
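
An aside on the loop above: it rescans the string after every single replacement. As a possible alternative, the four patterns can be merged into one alternation and applied in a single `re.sub` pass with a callback. Listing the float branches before the integer branch preserves the floats-before-ints rule. This is an illustrative sketch rather than part of the posted solution: the names `FORMAT_PATTERN`, `_to_printf` and `unperl_format_once` are made up here, and the behaviour has only been compared against the test strings shown above.

```python
import re

# One alternation; the float branches come before the integer branch,
# so "###.##" is consumed as a float before "###" can match as an int.
FORMAT_PATTERN = re.compile(
    r'(?<!\\)(?:'              # a token may not start right after a backslash
    r'(?P<left><+)'            # <<<<<  becomes %-5s
    r'|(?P<right>>+)'          # >>>>>  becomes %5s
    r'|(?P<float>#+\.(?P<frac>#*)|\.(?P<onlyfrac>#+))'
    r'|(?P<int>#+)'
    r')'
)


def _to_printf(match):
    """Build the printf token for whichever branch matched."""
    width = len(match.group(0))
    if match.group('left') is not None:
        return '%%-%ds' % width
    if match.group('right') is not None:
        return '%%%ds' % width
    if match.group('float') is not None:
        # The fractional digits sit in 'frac' ("##." or "##.##")
        # or in 'onlyfrac' (".##"), depending on the branch taken.
        frac = match.group('frac')
        if frac is None:
            frac = match.group('onlyfrac')
        return '%%%d.%df' % (width, len(frac))
    return '%%%dd' % width


def unperl_format_once(s):
    """Convert all perl-style fields in s in a single pass."""
    return FORMAT_PATTERN.sub(_to_printf, s)


print(unperl_format_once("float ####.## and int ####"))
# prints: float %7.2f and int %4d
```

Because the replacement text is never rescanned, the `continue` used to skip the integer step is no longer needed in this variant.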

--
Jim Segrave (je*@jes-2.demon.nl)

"Steve" <st****@cruzio.com> wrote in message
news:11**********************@d56g2000cwd.googlegroups.com...

Hi All,

I'm having a tough time converting the following regex.compile patterns
into the new re.compile format. There is also a difference between
regsub.sub() and re.sub().

Could anyone lend a hand?

Not an re solution, but pyparsing makes for an easy-to-follow program.
TransformString only needs to scan through the string once - the
"reals-before-ints" testing is factored into the definition of the
formatters variable.

Pyparsing's project wiki is at http://pyparsing.wikispaces.com.

-- Paul

-------------------
from pyparsing import *

"""
read Perl-style formatting placeholders and replace with
proper Python %x string interp formatters

###### -> %6d
##.### -> %6.3f
<<<<< -> %-5s
>>>>> -> %5s
"""

# set up patterns to be matched - Word objects match character groups
# made up of characters in the Word constructor; Combine forces
# elements to be adjacent with no intervening whitespace
# (note use of results name in realFormat, for easy access to
# decimal places substring)
intFormat = Word("#")
realFormat = Combine(Word("#")+"."+
                     Word("#").setResultsName("decPlaces"))
leftString = Word("<")
rightString = Word(">")

# define parse actions for each - the matched tokens are the third
# arg to parse actions; parse actions will replace the incoming tokens with
# value returned from the parse action
intFormat.setParseAction( lambda s,l,toks: "%%%dd" % len(toks[0]) )
realFormat.setParseAction( lambda s,l,toks: "%%%d.%df" %
                           (len(toks[0]),len(toks.decPlaces)) )
leftString.setParseAction( lambda s,l,toks: "%%-%ds" % len(toks[0]) )
rightString.setParseAction( lambda s,l,toks: "%%%ds" % len(toks[0]) )

# collect all formatters into a single "grammar"
# - note reals are checked before ints
formatters = rightString | leftString | realFormat | intFormat

# set up our test string, and use transform string to invoke parse actions
# on any matched tokens
testString = """
This is a string with
ints: #### # ###############
floats: #####.# ###.###### #.#
left-justified strings: <<<<<<<< << <
right-justified strings: >>>>>>>>>> >> >
int at end of sentence: ####.
"""
print(formatters.transformString( testString ))

-------------------
Prints:

This is a string with
ints: %4d %1d %15d
floats: %7.1f %10.6f %3.1f
left-justified strings: %-8s %-2s %-1s
right-justified strings: %10s %2s %1s
int at end of sentence: %4d.

In article <eP****************@tornado.texas.rr.com>,
Paul McGuire <pt***@austin.rr._bogus_.com> wrote:

Not an re solution, but pyparsing makes for an easy-to-follow program.
TransformString only needs to scan through the string once - the
"reals-before-ints" testing is factored into the definition of the
formatters variable.

Pyparsing's project wiki is at http://pyparsing.wikispaces.com.

It fails for floats specified as ###. or .###: it outputs an integer
format and the decimal point separately. It also ignores \#, which
should prevent the '#' from being included in a format.

--
Jim Segrave (je*@jes-2.demon.nl)
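
The three cases Jim lists can be reproduced without pyparsing by using a deliberately naive `re` stand-in. The sketch below assumes the grammar behaves as he describes: a float needs `#` on both sides of the dot, an int is any run of `#`, and a backslash has no special meaning. The names `NAIVE` and `naive_convert` are hypothetical and are not taken from either posted solution.

```python
import re

# Stand-in for the behaviour described above: floats require '#' on both
# sides of the dot, ints are any run of '#', backslashes are ignored.
NAIVE = re.compile(r'(?P<float>#+\.(?P<frac>#+))|(?P<int>#+)')


def naive_convert(s):
    """Convert '#' fields with no handling of '###.', '.###' or '\\#'."""
    def repl(m):
        if m.group('float') is not None:
            return '%%%d.%df' % (len(m.group(0)), len(m.group('frac')))
        return '%%%dd' % len(m.group(0))
    return NAIVE.sub(repl, s)


for sample in ['###.', '.###', r'\####']:
    print('%-6s -> %s' % (sample, naive_convert(sample)))
```

Here `###.` becomes `%3d.`, `.###` becomes `.%3d`, and `\####` becomes `\%4d`, whereas the lookbehind-based patterns earlier in the thread yield `%4.0f`, `%4.3f` and `\#%3d` for the same inputs.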
