re beginner


Question

Hi all,

I'm trying to understand regex for the first time, and it would be very
helpful to get an example. I have an old(er) script with the following
task: it takes a string I copy-pasted, which always has the same format:

>>> print stuff
Yellow hat 2 Blue shirt 1
White socks 4 Green pants 1
Blue bag 4 Nice perfume 3
Wrist watch 7 Mobile phone 4
Wireless cord! 2 Building tools 3
One for the money 7 Two for the show 4
>>> stuff
'Yellow hat\t2\tBlue shirt\t1\nWhite socks\t4\tGreen pants\t1\nBlue bag\t4\tNice perfume\t3\nWrist watch\t7\tMobile phone\t4\nWireless cord!\t2\tBuilding tools\t3\nOne for the money\t7\tTwo for the show\t4'

I want to put items from stuff into a dict like this:

>>> print mydict
{'Wireless cord!': 2, 'Green pants': 1, 'Blue shirt': 1, 'White socks': 4,
'Mobile phone': 4, 'Two for the show': 4, 'One for the money': 7,
'Blue bag': 4, 'Wrist watch': 7, 'Nice perfume': 3, 'Yellow hat': 2,
'Building tools': 3}

Here's how I did it:

>>> def putindict(items):
...     items = items.replace('\n', '\t')
...     items = items.split('\t')
...     d = {}
...     for x in xrange(len(items)):
...         if not items[x].isdigit(): d[items[x]] = int(items[x+1])
...     return d
...
>>> mydict = putindict(stuff)



I was wondering, is there a better way to do it using the re module?
Perhaps even avoiding this for loop?

Thanks!

Recommended answer

SuperHik wrote:
> I was wondering, is there a better way to do it using the re module?
> Perhaps even avoiding this for loop?




This is a way to do the same thing without REs:

data = ('Yellow hat\t2\tBlue shirt\t1\nWhite socks\t4\tGreen pants\t1\n'
        'Blue bag\t4\tNice perfume\t3\nWrist watch\t7\tMobile phone\t4\n'
        'Wireless cord!\t2\tBuilding tools\t3\n'
        'One for the money\t7\tTwo for the show\t4')

# names sit at even positions, counts at odd positions
data2 = data.replace("\n", "\t").split("\t")
result1 = dict(zip(data2[::2], map(int, data2[1::2])))

Or, if you want to be light (using iterators instead of building intermediate lists):

from itertools import imap, izip, islice
data2 = data.replace("\n", "\t").split("\t")
strings = islice(data2, 0, len(data), 2)
numbers = islice(data2, 1, len(data), 2)
result2 = dict(izip(strings, imap(int, numbers)))

Bye,
bearophile
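
The snippets above target Python 2. On Python 3, xrange, imap and izip no longer exist, and the built-in zip and map are already lazy. A minimal sketch of the same slicing idea under that assumption; the function name pairs_to_dict is made up for illustration and does not come from the thread:

```python
# Sample input in the same tab/newline-separated layout as the question.
data = ("Yellow hat\t2\tBlue shirt\t1\nWhite socks\t4\tGreen pants\t1\n"
        "Blue bag\t4\tNice perfume\t3\nWrist watch\t7\tMobile phone\t4\n"
        "Wireless cord!\t2\tBuilding tools\t3\n"
        "One for the money\t7\tTwo for the show\t4")


def pairs_to_dict(text):
    """Build {name: count} from alternating name/count fields (hypothetical helper)."""
    # Treat newlines as tabs so every field is separated the same way.
    fields = text.replace("\n", "\t").split("\t")
    # Even positions hold names, odd positions hold counts.
    return dict(zip(fields[::2], map(int, fields[1::2])))


result = pairs_to_dict(data)
print(result["Yellow hat"])  # 2
```

This relies on the input strictly alternating name and count; a stray empty field (for example from a trailing newline) would shift the pairing or make int() raise ValueError.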


You could write a function which takes a match object and modifies d,
pass the function to re.sub, and ignore what re.sub returns.

# untested code
import re
d = {}
def record(match):
    s = match.string[match.start():match.end()]  # the matched "name\tcount" text
    i = s.index('\t')  # position of the tab between name and count
    print s, i  # debugging
    d[s[:i]] = int(s[i+1:])
    return ''
# a name is any run of non-tab, non-newline characters, followed by a tab and a count
re.sub(r'[^\t\n]+\t\d+', record, stuff)
# end code

It may be a bit faster, but it's very roundabout and difficult to
debug.
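
For comparison, a more direct use of the re module is to collect (name, count) pairs with re.findall rather than relying on side effects inside re.sub. This is a minimal sketch in Python 3 (the thread itself uses Python 2), assuming names never contain tabs or newlines and counts are non-negative integers. The pattern PAIR and the function name putindict_re are illustrative, not from the thread:

```python
import re

# Sample input in the same tab/newline-separated layout as the question.
stuff = ("Yellow hat\t2\tBlue shirt\t1\nWhite socks\t4\tGreen pants\t1\n"
         "Blue bag\t4\tNice perfume\t3\nWrist watch\t7\tMobile phone\t4\n"
         "Wireless cord!\t2\tBuilding tools\t3\n"
         "One for the money\t7\tTwo for the show\t4")

# Assumed pattern: a name is a run of characters other than tab/newline,
# followed by one tab and a run of digits.
PAIR = re.compile(r"([^\t\n]+)\t(\d+)")


def putindict_re(text):
    """Return {name: count} for every name<TAB>count pair found in text."""
    # findall returns a list of (name, count) string tuples, one per match.
    return {name: int(count) for name, count in PAIR.findall(text)}


mydict = putindict_re(stuff)
print(mydict["Wireless cord!"])  # 2
```

The explicit for loop disappears from the calling code, although the regex engine and the dict comprehension still iterate over the matches internally.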

> strings = islice(data2, 0, len(data), 2)
> numbers = islice(data2, 1, len(data), 2)




This probably has to be:

strings = islice(data2, 0, len(data2), 2)
numbers = islice(data2, 1, len(data2), 2)

Sorry,
bearophile
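
As a side note, the stop argument of islice is only an upper bound, so passing the character count instead of the field count should still yield the same items; the corrected form is simply clearer. A small Python 3 sketch to check this on a shortened, made-up sample string, with None as an open-ended stop:

```python
from itertools import islice

# Shortened sample in the same layout as the thread's data (illustrative only).
data = "Yellow hat\t2\tBlue shirt\t1\nWhite socks\t4"
data2 = data.replace("\n", "\t").split("\t")

# len(data) counts characters and len(data2) counts fields; the first is
# larger, so islice runs out of items before it reaches the stop bound.
with_len_data = list(islice(data2, 0, len(data), 2))
with_len_data2 = list(islice(data2, 0, len(data2), 2))
# stop=None means "until the iterable is exhausted".
open_ended = list(islice(data2, 0, None, 2))

result = dict(zip(islice(data2, 0, None, 2),
                  map(int, islice(data2, 1, None, 2))))
print(result)
```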

