Python:创建一个函数来通过引用而不是值来修改列表 [英] Python: create a function to modify a list by reference not value

查看:179
本文介绍了Python:创建一个函数来通过引用而不是值来修改列表的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在做一些对性能至关重要的Python工作,并希望创建一个函数,如果满足某些条件,则该函数将从列表中删除一些元素.我宁愿不创建列表的任何副本,因为它充满了很多非常大的对象.

I'm doing some performance-critical Python work and want to create a function that removes a few elements from a list if they meet certain criteria. I'd rather not create any copies of the list because it's filled with a lot of really large objects.

我要实现的功能:

def listCleanup(listOfElements):
    i = 0
    for element in listOfElements:
        if(element.meetsCriteria()):
            del(listOfElements[i])
        i += 1
    return listOfElements

myList = range(10000)
myList = listCleanup(listOfElements)

我不熟悉Python的底层工作.是通过值还是通过引用传递myList?

I'm not familiar with the low-level workings of Python. Is myList being passed by value or by reference?

如何使它更快?

是否可以通过某种方式扩展列表类并在其中实现listCleanup()?

Is it possible to somehow extend the list class and implement listCleanup() within that?

myList = range(10000)
myList.listCleanup()

谢谢-

乔纳森

推荐答案

Python以相同的方式传递所有内容,但是按值"或按引用"称呼它不会清除所有内容,因为Python的语义不同于语言这些条款通常适用于此.如果要描述它,我会说所有传递都是按值进行的,并且该值是对象引用. (这就是为什么我不想说!)

Python passes everything the same way, but calling it "by value" or "by reference" will not clear everything up, since Python's semantics are different than the languages for which those terms usually apply. If I was to describe it, I would say that all passing was by value, and that the value was an object reference. (This is why I didn't want to say it!)

如果您想从列表中过滤掉某些内容,请建立一个新列表

If you want to filter out some stuff from a list, you build a new list

foo = range(100000)
new_foo = []
for item in foo:
    if item % 3 != 0: # Things divisble by 3 don't get through
        new_foo.append(item)

或者,使用列表理解语法

or, using the list comprehension syntax

 new_foo = [item for item in foo if item % 3 != 0]

Python不会复制列表中的对象,但是foonew_foo都将引用相同的对象. (Python绝不会隐式复制任何对象.)

Python will not copy the objects in the list, but rather both foo and new_foo will reference the same objects. (Python never implicitly copies any objects.)

您建议您对此操作有性能方面的顾虑.从旧列表中使用重复的del语句将不会导致代码不那么惯用且更难以处理,但是由于每次都必须重新整理整个列表,因此会带来二次性能.

You have suggested you have performance concerns about this operation. Using repeated del statements from the old list will result in not code that is less idiomatic and more confusing to deal with, but it will introduce quadratic performance because the whole list must be reshuffled each time.

要提高性能,请执行以下操作:

To address performance:

  • 启动并运行.您无法弄清楚性能如何,除非您有代码在工作.这还将告诉您必须优化的是速度还是空间.您在代码中提到了对两者的担忧,但是通常,优化需要以牺牲另一个为代价.

  • Get it up and running. You can't figure out what your performance is like unless you have code working. This will also tell you whether it is speed or space that you must optimize for; you mention concerns about both in your code, but oftentimes optimization involves getting one at the cost of the other.

配置文件..您可以使用 stdlib工具及时表现.有各种各样的第三方内存分析器可能会有些用,但使用起来却不太好.

Profile. You can use the stdlib tools for performance in time. There are various third-party memory profilers that can be somewhat useful but aren't quite as nice to work with.

测量. 时间或重新配置内存当您进行更改时,看看更改是否可以带来改进,如果可以,那么改进是什么.

Measure. Time or reprofile memory when you make a change to see if a change makes an improvement and if so what that improvement is.

为使代码对内存更敏感,通常需要在存储数据的方式上进行范式转换,而不是进行微优化,而不是不构建第二个列表来进行过滤. (时间上也是如此,实际上:更改为更好的算法几乎总是可以实现最佳的加速.但是,很难对速度优化进行概括.)

To make your code more memory-sensitive, you will often want a paradigm shift in how you store your data, not microoptimizastions like not building a second list to do filtering. (The same is true for time, really: changing to a better algorithm will almost always give the best speedup. However, it's harder to generalize about speed optimizations).

一些常见的范式转换可优化Python include中的内存消耗

Some common paradigm shifts to optimize memory consumption in Python include

  1. 使用生成器.生成器是懒惰的可迭代对象:它们不会立即将整个列表加载到内存中,而是会弄清楚下一步将要运行的项.要使用生成器,上面的摘录应类似于

  1. Using Generators. Generators are lazy iterables: they don't load a whole list into memory at once, they figure out what their next items are on the fly. To use generators, the snippets above would look like

foo = xrange(100000) # Like generators, xrange is lazy
def filter_divisible_by_three(iterable):
    for item in foo:
        if item % 3 != 0:
            yield item

new_foo = filter_divisible_by_three(foo)

或使用生成器表达式语法

or, using the generator expression syntax,

new_foo = (item for item in foo if item % 3 != 0)

  • numpy用于同质序列,尤其是数字数学的序列.这也可以加快执行大量矢量操作的代码.

  • Using numpy for homogenous sequences, especially ones that are numerical-mathy. This can also speed up code that does lots of vector operations.

    将数据存储到磁盘上,例如在数据库中.

    Storing data to disk, such as in a database.

  • 这篇关于Python:创建一个函数来通过引用而不是值来修改列表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

    查看全文
    登录 关闭
    扫码关注1秒登录
    发送“验证码”获取 | 15天全站免登陆