Please help make it faster

Problem description

I wrote this function, which does the following: after reading lines from a file, it splits them and finds the word occurrences through a hash table. For some reason this is quite slow. Can someone help me make it faster?

    f = open(filename)
    lines = f.readlines()

    def create_words(lines):
        cnt = 0
        spl_set = '[",;<>{}_&?!():-[\.=+*\t\n\r]+'
        for content in lines:
            words=content.split()
            countDict={}
            wordlist = []
            for w in words:
                w=string.lower(w)
                if w[-1] in spl_set: w = w[:-1]
                if w != '':
                    if countDict.has_key(w):
                        countDict[w]=countDict[w]+1
                    else:
                        countDict[w]=1
                wordlist = countDict.keys()
                wordlist.sort()
            cnt += 1
            if countDict != {}:
                for word in wordlist: print (word+' '+
                    str(countDict[word])+'\n')

Recommended answer

Why reload wordlist and sort it after each word is processed? It seems that this can be done after the for loop.
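
As a minimal sketch of that suggestion (written for Python 3 rather than the Python 2 of the original post; the helper name `count_words_in_line` and the constant `SPL_SET` are illustrative assumptions, not from the thread), the dictionary is filled inside the word loop and its keys are sorted only once, after the loop has finished:

```python
# Same characters as the poster's spl_set string, written with an
# escaped backslash so that Python 3 accepts it without a warning.
SPL_SET = '[",;<>{}_&?!():-[\\.=+*\t\n\r]+'

def count_words_in_line(content):
    """Count the words of one line and sort the keys once, after the loop."""
    counts = {}
    for w in content.split():
        w = w.lower()
        # Drop a single trailing punctuation character, as the original does.
        if w[-1] in SPL_SET:
            w = w[:-1]
        if w != '':
            counts[w] = counts.get(w, 0) + 1
    # The sort runs once per line instead of once per word.
    wordlist = sorted(counts)
    return wordlist, counts
```

With this arrangement the cost of sorting is paid once per line, however many words the line contains.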

Actually, I create a separate wordlist for each so-called line. By "line" I mean what would be a paragraph in the future, so I will have to recreate the wordlist on each pass through the loop.


Oh sorry, the indentation was messed up here. The lines

    wordlist = countDict.keys()
    wordlist.sort()

should be outside the word loop. Now:

    def create_words(lines):
        cnt = 0
        spl_set = '[",;<>{}_&?!():-[\.=+*\t\n\r]+'
        for content in lines:
            words=content.split()
            countDict={}
            wordlist = []
            for w in words:
                w=string.lower(w)
                if w[-1] in spl_set: w = w[:-1]
                if w != '':
                    if countDict.has_key(w):
                        countDict[w]=countDict[w]+1
                    else:
                        countDict[w]=1
            wordlist = countDict.keys()
            wordlist.sort()
            cnt += 1
            if countDict != {}:
                for word in wordlist: print (word+' '+
                    str(countDict[word])+'\n')

OK, now this is the correct question I am asking.
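
The thread as captured here contains no reply to the corrected question. The following is only a sketch of one possible direction, assuming Python 3 and the standard library's `collections.Counter`; it has not been benchmarked against the original. It lowercases each line once instead of once per word, tests the trailing character against a `frozenset`, and returns the results instead of printing them. The name `SPL_SET` and the return format are assumptions made for illustration.

```python
from collections import Counter

# Same characters as the poster's spl_set, stored as a set for fast lookup.
SPL_SET = frozenset('[",;<>{}_&?!():-[\\.=+*\t\n\r]+')

def create_words(lines):
    """Return one sorted list of (word, count) pairs per non-empty line."""
    results = []
    for content in lines:
        counts = Counter()
        # Lowercase the whole line once, then split on whitespace.
        for w in content.lower().split():
            # Drop a single trailing punctuation character, as the original does.
            if w[-1] in SPL_SET:
                w = w[:-1]
            if w:
                counts[w] += 1
        if counts:
            # Sort once per line, after all words have been counted.
            results.append(sorted(counts.items()))
    return results
```

A caller could then print each pair, for example `for word, n in pairs: print(word, n)`, or join everything into one string and write it out in a single call.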

