How can I speed this function up?


Question

This is just some dummy code to mimic what's being done in the real
code. The actual code is Python, which is used as a scripting language in
a third-party app. The data structure returned by the app is more or
less like the "data" list in the code below. The test for "ELEMENT" is
necessary; it just evaluates to true every time in this test code. In
the real app perhaps 90% of the tests will also be true.

So my question is: how can I speed up what's happening inside the
function write_data()? I am only allowed to use vanilla Python (no Psyco
or other libraries outside of a vanilla Python install).

I have a vested interest in showing a colleague that a Python app can
yield results in a time comparable to his C app, which he feels is much
faster. I'd like to know what I can do within the constraints of the
Python language to get the best speed possible. Hope someone can help.

    def write_data1(out, data):
        for i in data:
            if i[0] is 'ELEMENT':
                out.write("%s %06d " % (i[0], i[1]))
                for j in i[2]:
                    out.write("%d " % (j))
                out.write("\n")

    import timeit

    # basic data mimicking data returned from 3rd party app
    data = []
    for i in range(500000):
        data.append(("ELEMENT", i, (1,2,3,4,5,6)))

    # write data out to file
    fname = "test2.txt"
    out = open(fname, 'w')
    start = timeit.time.clock()
    write_data1(out, data)
    out.close()
    print timeit.time.clock() - start
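
The harness above is Python 2 code. On a current Python 3 interpreter `print` is a function and `time.clock()` no longer exists (it was removed in Python 3.8). Below is a minimal sketch of an equivalent harness, assuming Python 3. The helper name `time_writer`, the temporary file, and the reduced row count are illustrative choices and not part of the original post. The comparison uses `==` because the rows are built at run time.

```python
import os
import tempfile
import time


def write_data1(out, data):
    # Same logic as the function in the question, with == instead of `is`.
    for i in data:
        if i[0] == "ELEMENT":
            out.write("%s %06d " % (i[0], i[1]))
            for j in i[2]:
                out.write("%d " % j)
            out.write("\n")


def time_writer(writer, data):
    """Run writer(out, data) on a temporary file; return (seconds, text written)."""
    fd, fname = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        out = open(fname, "w")
        start = time.perf_counter()
        try:
            writer(out, data)
        finally:
            out.close()          # closing flushes, so it is part of the timing
        elapsed = time.perf_counter() - start
        with open(fname) as f:
            text = f.read()
    finally:
        os.remove(fname)
    return elapsed, text


# Far fewer rows than the 500000 in the question, so the sketch runs quickly.
data = [("ELEMENT", i, (1, 2, 3, 4, 5, 6)) for i in range(1000)]
elapsed, text = time_writer(write_data1, data)
print("%.4f s" % elapsed)
```

Returning the written text alongside the elapsed time makes it easy to check that a faster variant still produces byte-for-byte the same output.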

Recommended answer

Chris wrote:


With this function I went from 8.04 s to 6.61 s. Now I'm running up against
my limited knowledge of Python. Any chance of getting faster?

    def write_data4(out, data):
        for i in data:
            if i[0] is 'ELEMENT':
                strx = "%s %06d " % (i[0], i[1])
                # build the whole line first, then write it in one call
                strx = "".join([strx] + ["%d " % (j) for j in i[2]] + ["\n"])
                out.write(strx)




"Chris" <cf*****@bigpond.net.au> wrote in message
news:ko*******************@news-server.bigpond.net.au...

    def write_data1(out, data):
        for i in data:
            if i[0] is 'ELEMENT':



Testing for equality with 'is' is a bit of a cheat since it is
implementation dependent,
but since you have a somewhat unfair constraint ....
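
To illustrate the point about 'is': it compares object identity, while '==' compares contents. The sketch below, written for Python 3, builds an equal string at run time, much as a host application might hand one over. The variable names are illustrative only. Whether the identity test succeeds depends on the interpreter's string interning, so only the equality result can be relied on.

```python
literal = "ELEMENT"
# Build an equal string at run time instead of writing it as a literal.
built = "".join(["ELE", "MENT"])

same_value = (literal == built)    # compares the characters
same_object = (literal is built)   # compares object identity only

print(same_value, same_object)
```

The first value is always True. The second depends on whether the interpreter happens to reuse one string object for both names, which is why `is` works in the test script (both sides come from literals in the same module) but is not guaranteed for strings returned by the third-party app.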


                out.write("%s %06d " % (i[0], i[1]))



Since i[0] is tested to be "ELEMENT", this should be the same as
                out.write("ELEMENT %06d " % i[1])
which saves constructing a tuple as well as an interpolation.
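
A quick check of that equivalence, as a small Python 3 sketch. The sample row and the variable names are made up for illustration:

```python
row = ("ELEMENT", 42, (1, 2, 3, 4, 5, 6))

# Form used in the question: builds a 2-tuple and interpolates two values.
original = "%s %06d " % (row[0], row[1])

# Suggested form: the constant is part of the format string, one value interpolated.
shortcut = "ELEMENT %06d " % row[1]

print(repr(original), repr(shortcut))
```

Both expressions produce the same text, so the substitution is safe inside the branch where i[0] has already been matched against "ELEMENT".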


                for j in i[2]:
                    out.write("%d " % (j))
                out.write("\n")



tjr


Hi, Chris.
I made a trivial testing framework for this cute problem and tried a
couple of modifications. I also added the 10% of non-ELEMENT lines you
mentioned. First thing: your updated algorithm didn't really get me much
faster results than the original. I guess that my disk array sort of
hides the multiple-write penalty. But I experimented with various
algorithms. Here's the code in its entirety:
http://www.rafb.net/paste/results/ZuW4fK85.html My results (Python 2.4,
32-bit Fedora Core) were:

[ksh@lapoire tmp]# python test.py
Preparing data...
[write_data1] Preparing output file...
[write_data1] Writing...
[write_data1] Done in 10.73 seconds.
[write_data4] Preparing output file...
[write_data4] Writing...
[write_data4] Done in 10.46 seconds.
[write_data_flush] Preparing output file...
[write_data_flush] Writing...
[write_data_flush] Done in 9.09 seconds.
[write_data_per_line] Preparing output file...
[write_data_per_line] Writing...
[write_data_per_line] Done in 9.71 seconds.
[write_data_once] Preparing output file...
[write_data_once] Writing...
[write_data_once] Done in 7.82 seconds.

I'm pretty sure that your measurements will vary (judging by your results
you seem to have a faster CPU but slower disk(s)). But you can just take
what works best for you. I'm also quite confident that you won't be able
to catch up with C since, as you can see, Python's data structures are far
more flexible and thus require more processing overhead.

Regards,
Łukasz
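
The code behind the linked paste is not reproduced in this thread, so the body of write_data_once is unknown. As a rough guess at the idea its name suggests (format every matching row in memory, then issue a single write), here is a hypothetical Python 3 sketch. The function name `write_data_once_sketch`, the "OTHER" row, and the use of `io.StringIO` in place of a real file are all assumptions made for illustration.

```python
import io


def write_data_once_sketch(out, data):
    # Hypothetical reconstruction: collect all formatted lines, write once.
    lines = []
    append = lines.append            # local alias avoids repeated attribute lookups
    for tag, number, values in data:
        if tag == "ELEMENT":
            tail = "".join(["%d " % v for v in values])
            append("ELEMENT %06d %s\n" % (number, tail))
    out.write("".join(lines))


data = [("ELEMENT", i, (1, 2, 3, 4, 5, 6)) for i in range(3)]
data.append(("OTHER", 3, (7, 8, 9)))   # stands in for the non-ELEMENT rows

buf = io.StringIO()
write_data_once_sketch(buf, data)
print(buf.getvalue(), end="")
```

The trade-off is memory: the whole output is held as a list of strings before anything reaches the file, which may or may not be acceptable for the real data volume.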

