Improving a huge double-for cycle
Problem description
Hello there :)

I am a Python newbie and need to run the following code for a task in an
external simulation program called "Abaqus", which makes use of Python
to access the mesh (an ensemble of nodes with xy coordinates) of a
certain geometrical model.

[IN is the starting input containing the nodes to be checked. There are
some double nodes with the same x and y coordinates which need to be
removed. SN is the output containing such double nodes.]

    for i in range(len(IN)):  # scan all elements of the list IN
        for j in range(len(IN)):
            if i < j:
                if IN[i].coordinates[0] == IN[j].coordinates[0]:
                    if IN[i].coordinates[1] == IN[j].coordinates[1]:
                        SN.append(IN[i].label)

Unfortunately my len(IN) is about 100,000 and the running time is about
15 h! :(

Any idea how to improve it?

I have already tried to group the "if" statements into a single one:

    if (i < j and IN[i].coordinates[0] == IN[j].coordinates[0] and
            IN[i].coordinates[1] == IN[j].coordinates[1]):

but with no improvement.

Many thanks, Alex
Recommended answer
Alex> Unfortunately my len(IN) is about 100,000 and the running time
Alex> about 15h !!!! :(
Alex> Any idea to improve it?

numpy?

http://numpy.scipy.org/
http://www.scipy.org/Numpy_Example_List

More immediately, note that under Python 2 you are building a new list
of len(IN) ints each time the inner loop starts. A quick hit might be
this simple change:

    indexes = range(len(IN))  # build the index list once and reuse it
    for i in indexes:  # scan all elements of the list IN
        for j in indexes:
            if i != j:  # unlike i < j, this visits each pair twice
                if (IN[i].coordinates[0] == IN[j].coordinates[0] and
                        IN[i].coordinates[1] == IN[j].coordinates[1]):
                    SN.append(IN[i].label)

Skip
When you're looking for duplicates, an efficient solution is likely to be
based on a set or dict object.

    # untested
    from collections import defaultdict

    groups = defaultdict(list)
    for item in IN:
        c = item.coordinates
        groups[c[0], c[1]].append(item.label)  # key on the (x, y) pair

    SN = []
    for labels in groups.itervalues():  # Python 2; use .values() on Python 3
        if len(labels) > 1:
            SN.extend(labels)  # or labels[1:] if you want to keep one item

Peter
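
For readers who want to try the grouping idea outside Abaqus, here is a minimal runnable sketch. The `Node` namedtuple and the sample data are hypothetical stand-ins (the real Abaqus node objects are not reproduced here); only the `label` and `coordinates` attributes used in this thread are assumed. It relies on exact coordinate equality, as the original loop does.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for a mesh node; NOT the real Abaqus type.
Node = namedtuple("Node", ["label", "coordinates"])


def find_duplicate_labels(nodes):
    """Return the labels of all nodes that share their (x, y) with another node."""
    groups = defaultdict(list)
    for node in nodes:
        key = (node.coordinates[0], node.coordinates[1])
        groups[key].append(node.label)  # one dict lookup per node

    duplicates = []
    for labels in groups.values():
        if len(labels) > 1:
            duplicates.extend(labels)  # or labels[1:] to keep one node per spot
    return duplicates


# Made-up sample data for illustration only.
IN = [
    Node(1, (0.0, 0.0)),
    Node(2, (1.0, 0.0)),
    Node(3, (0.0, 0.0)),  # same position as node 1
    Node(4, (1.0, 2.5)),
    Node(5, (1.0, 0.0)),  # same position as node 2
]
SN = find_duplicate_labels(IN)
```

Note that this collects every node of a duplicate group, whereas the nested loop in the question records only the earlier node of each matching pair; slicing with `labels[1:]` or `labels[:-1]` adjusts which ones are kept.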
> I have already tried to group the "if statements" in a single one:
>
>     if (i < j and IN[i].coordinates[0] == IN[j].coordinates[0] and
>             IN[i].coordinates[1] == IN[j].coordinates[1]):
>
> but no improvements.

It's like rearranging deck chairs on the Titanic :) Yes, it may
give a speed-up, but what's 3 seconds when you're waiting 15 hours? :)

Not knowing len(IN[x].coordinates) or the structure of the coordinates,
if each one is a list of len == 2, you should be able to just do

    if i < j and IN[i].coordinates == IN[j].coordinates:

or

    if i < j and IN[i].coordinates[:2] == IN[j].coordinates[:2]:

However, again, this is just polish. The big problem is that you
have an O(N^2) algorithm that's killing you.
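
To put rough numbers on that O(N^2) cost, here is a back-of-the-envelope sketch using only the figures reported in the question (about 100,000 nodes, about 15 hours). The per-iteration cost is inferred from those two figures, and the estimate for visiting each unordered pair once assumes that cost stays the same, which is only an approximation.

```python
N = 100000      # number of nodes reported in the question
HOURS = 15.0    # reported running time of the double loop

full_square = N * N            # inner iterations of the original double loop
triangle = N * (N - 1) // 2    # iterations if each unordered pair is visited once

# Implied average cost of one inner iteration, in microseconds.
micros_per_iteration = HOURS * 3600.0 / full_square * 1e6

# Rough estimate for the pair-once loop, assuming the same per-iteration cost.
triangle_hours = triangle * micros_per_iteration / 1e6 / 3600.0
```

Under these assumptions the loop body runs ten billion times, so shaving a little off each iteration barely matters; only doing fewer iterations changes the picture.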

1) Use xrange instead of range to avoid eating memory with a huge
   unneeded list.

2) Unless you need to append duplicate labels, you know that when
   i and j are swapped you'll reach the same condition again, so it
   might be worth writing the loops to eliminate this scenario. In
   the process, by just starting j at i+1 rather than at i, you can
   forgo the "i <> j" check.

Such changes might look something like

    for i in xrange(len(IN)):
        for j in xrange(i + 1, len(IN)):  # only pairs with j > i
            if IN[i].coordinates == IN[j].coordinates:
                SN.append(IN[i].label)

If my college algorithms memory serves me sufficiently, this is
still O(N^2), since it makes N(N-1)/2 comparisons instead of N^2,
but halving the work will still garner you some decent time savings.
-tkc
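
An approach that does reach O(N log N) is to sort the nodes by their (x, y) pair and then scan for runs of equal neighbours. This is a different technique from the nested loop above, sketched here with a hypothetical `Node` namedtuple and made-up data standing in for the Abaqus objects; like the rest of the thread, it assumes coordinates can be compared for exact equality.

```python
from collections import namedtuple
from itertools import groupby

# Hypothetical stand-in for a mesh node; NOT the real Abaqus type.
Node = namedtuple("Node", ["label", "coordinates"])


def xy(node):
    """Sort/group key: the first two coordinates of a node."""
    return (node.coordinates[0], node.coordinates[1])


def duplicates_by_sorting(nodes):
    """Return labels of nodes whose (x, y) occurs more than once."""
    ordered = sorted(nodes, key=xy)          # O(N log N) sort
    duplicates = []
    for _, run in groupby(ordered, key=xy):  # one linear pass over the sorted list
        labels = [node.label for node in run]
        if len(labels) > 1:
            duplicates.extend(labels)
    return duplicates


# Made-up sample data for illustration only.
sample = [
    Node(1, (0.0, 0.0)),
    Node(2, (1.0, 0.0)),
    Node(3, (0.0, 0.0)),  # same position as node 1
    Node(4, (1.0, 2.5)),
    Node(5, (1.0, 0.0)),  # same position as node 2
]
result = duplicates_by_sorting(sample)
```

Because `groupby` only merges adjacent equal keys, the sort step is what makes this correct; the dict-based grouping suggested earlier in the thread avoids the sort altogether.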