Efficient ways to duplicate array/list in Python


Problem description


Note: I'm a Ruby developer trying to find my way in Python.

When I wanted to figure out why some scripts use mylist[:] instead of list(mylist) to duplicate lists, I made a quick benchmark of the various methods to duplicate range(10) (see code below).

EDIT: I updated the tests to make use of Python's timeit as suggested below. This makes it impossible to directly compare it to Ruby, because timeit doesn't account for the looping while Ruby's Benchmark does, so Ruby code is for reference only.

Python 2.7.2

Array duplicating. Tests run 50000000 times
list(a)     18.7599430084
copy(a)     59.1787488461
a[:]         9.58828091621
a[0:len(a)] 14.9832749367

For reference, I wrote the same script in Ruby too:

Ruby 1.9.2p0

Array duplicating. Tests 50000000 times
                      user     system      total        real
Array.new(a)     14.590000   0.030000  14.620000 ( 14.693033)
Array[*a]        18.840000   0.060000  18.900000 ( 19.156352)
a.take(a.size)    8.780000   0.020000   8.800000 (  8.805700)
a.clone          16.310000   0.040000  16.350000 ( 16.384711)
a[0,a.size]       8.950000   0.020000   8.970000 (  8.990514)

Question 1: What is mylist[:] doing differently that makes it 25% faster than even mylist[0:len(mylist)]? Does it copy in memory directly, or what?

Question 2 (edit: the updated benchmarks no longer show a huge difference between Python and Ruby; the original question was): Did I implement the tests in some obviously inefficient way, so that the Ruby code is so much faster than Python?

Now the code listings:

Python:

import timeit

COUNT = 50000000

print "Array duplicating. Tests run", COUNT, "times"

setup = 'a = range(10); import copy'

print "list(a)\t\t", timeit.timeit(stmt='list(a)', setup=setup, number=COUNT)
print "copy(a)\t\t", timeit.timeit(stmt='copy.copy(a)', setup=setup, number=COUNT)
print "a[:]\t\t", timeit.timeit(stmt='a[:]', setup=setup, number=COUNT)
print "a[0:len(a)]\t", timeit.timeit(stmt='a[0:len(a)]', setup=setup, number=COUNT)
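
The listing above targets Python 2.7, where print is a statement and range(10) already returns a list. As a rough sketch only, the same benchmark could be written for Python 3 as follows. The helper name run_benchmark and the much smaller COUNT are assumptions made for illustration and are not part of the original script, so the timings it prints are not comparable to the numbers quoted above.

```python
import timeit

# Far smaller than the original 50,000,000 so the sketch finishes quickly.
COUNT = 100_000

# In Python 3, range() is lazy, so the setup builds a real list explicitly.
SETUP = "import copy; a = list(range(10))"

# The four copying idioms from the original benchmark.
STATEMENTS = ["list(a)", "copy.copy(a)", "a[:]", "a[0:len(a)]"]


def run_benchmark(count=COUNT):
    """Return a dict mapping each statement to its total time in seconds."""
    return {
        stmt: timeit.timeit(stmt=stmt, setup=SETUP, number=count)
        for stmt in STATEMENTS
    }


if __name__ == "__main__":
    print("Array duplicating. Tests run", COUNT, "times")
    for stmt, seconds in run_benchmark().items():
        print(f"{stmt:<14}{seconds:.6f}")
```

Passing the statements as strings keeps the measurement close to the original: timeit compiles each one into its own loop, so no extra function call is included in the timing.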

Ruby:

require 'benchmark'

a = (0...10).to_a

COUNT = 50_000_000

puts "Array duplicating. Tests #{COUNT} times"

Benchmark.bm(16) do |x|
  x.report("Array.new(a)")   {COUNT.times{ Array.new(a) }}
  x.report("Array[*a]")   {COUNT.times{ Array[*a] }}
  x.report("a.take(a.size)")   {COUNT.times{ a.take(a.size) }}
  x.report("a.clone")    {COUNT.times{ a.clone }}
  x.report("a[0,a.size]"){COUNT.times{ a[0,a.size] }}
end

Solution

Use Python's timeit module to measure the timings.

from copy import *

a=range(1000)

def cop():
    b=copy(a)

def func1():
    b=list(a)

def slice():
    b=a[:]

def slice_len():
    b=a[0:len(a)]



if __name__=="__main__":
    import timeit
    print "copy(a)",timeit.timeit("cop()", setup="from __main__ import cop")
    print "list(a)",timeit.timeit("func1()", setup="from __main__ import func1")
    print "a[:]",timeit.timeit("slice()", setup="from __main__ import slice")
    print "a[0:len(a)]",timeit.timeit("slice_len()", setup="from __main__ import slice_len")

Results:

copy(a) 3.98940896988
list(a) 2.54542589188
a[:] 1.96630120277                   #winner
a[0:len(a)] 10.5431251526
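
Whichever idiom is fastest, all four produce the same kind of result: a new list object that shares its elements with the original (a shallow copy). The following is a minimal sketch, assuming Python 3; the helper name duplicate_all and the sample data are made up for illustration.

```python
import copy


def duplicate_all(a):
    """Copy list `a` with each of the four idioms timed above."""
    return {
        "list(a)": list(a),
        "copy.copy(a)": copy.copy(a),
        "a[:]": a[:],
        "a[0:len(a)]": a[0:len(a)],
    }


original = [[1, 2], [3, 4]]
copies = duplicate_all(original)

for label, dup in copies.items():
    # Same contents, but a distinct outer list object...
    assert dup == original and dup is not original
    # ...whose elements are still the very same inner objects.
    assert dup[0] is original[0]
    print(label, "-> shallow copy")
```

Because the elements are shared, mutating a nested list through one copy is visible through all the others; copy.deepcopy would be needed to avoid that.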

The extra steps involved in a[0:len(a)] are surely the reason for its slowness.

Here's the byte code comparison of the two (in this listing, func1 does b = a[:] and func2 does b = a[0:len(a)], each after a = range(100000)):

In [19]: dis.dis(func1)
  2           0 LOAD_GLOBAL              0 (range)
              3 LOAD_CONST               1 (100000)
              6 CALL_FUNCTION            1
              9 STORE_FAST               0 (a)

  3          12 LOAD_FAST                0 (a)
             15 SLICE+0             
             16 STORE_FAST               1 (b)
             19 LOAD_CONST               0 (None)
             22 RETURN_VALUE        

In [20]: dis.dis(func2)
  2           0 LOAD_GLOBAL              0 (range)
              3 LOAD_CONST               1 (100000)
              6 CALL_FUNCTION            1
              9 STORE_FAST               0 (a)

  3          12 LOAD_FAST                0 (a)    #same up to here
             15 LOAD_CONST               2 (0)    #loads 0
             18 LOAD_GLOBAL              1 (len) # loads the builtin len(),
                                                 # so it might take some lookup time
             21 LOAD_FAST                0 (a)
             24 CALL_FUNCTION            1         
             27 SLICE+3             
             28 STORE_FAST               1 (b)
             31 LOAD_CONST               0 (None)
             34 RETURN_VALUE        
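
The disassembly above comes from Python 2; the SLICE+0 and SLICE+3 opcodes no longer exist in Python 3, so the exact instructions differ on a modern interpreter and vary between versions. As a hedged sketch, the comparison can be reproduced with the dis module like this. The function names full_slice and slice_len and the helpers opnames and global_names are chosen here for illustration and are not from the original answer.

```python
import dis


def full_slice(a):
    b = a[:]
    return b


def slice_len(a):
    b = a[0:len(a)]
    return b


def opnames(func):
    """List the opcode names of a function, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


def global_names(func):
    """Names the function looks up at run time (globals and builtins)."""
    return func.__code__.co_names


if __name__ == "__main__":
    for func in (full_slice, slice_len):
        print(func.__name__, "looks up:", global_names(func))
        print(func.__name__, "opcodes:", opnames(func))
```

The point that carries over from the Python 2 listing is that slice_len has to look up and call the builtin len on every execution, while full_slice performs no name lookup at all.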
