Overhead of creating classes in Python: Exact same code using class twice as slow as native DS?


Question

I created a Stack class as an exercise in Python, using all list functions. For example, Stack.push() is just list.append(), Stack.pop() is list.pop() and Stack.isEmpty() is just list == [ ].

I was using my Stack class to implement a decimal to binary converter, and what I noticed is that even though the two functions are completely equivalent beyond the wrapping of my Stack class for push(), pop() and isEmpty(), the implementation using the Stack class is twice as slow as the implementation using Python's list.

Is that because there's always an inherent overhead to using classes in Python? And if so, where does the overhead come from technically speaking ("under the hood")? Finally, if the overhead is so significant, isn't it better not to use classes unless you absolutely have to?

def dectobin1(num):
    s = Stack()
    while num > 0:
        s.push(num % 2)
        num = num // 2
    binnum = ''
    while not s.isEmpty():
        binnum = binnum + str(s.pop())
    return binnum

def dectobin2(num):
    l = []
    while num > 0:
        l.append(num % 2)
        num = num // 2
    binnum = ''
    while not l == []:
        binnum = binnum + str(l.pop())
    return binnum


from timeit import Timer

t1 = Timer('dectobin1(255)', 'from __main__ import dectobin1')
print(t1.timeit(number = 1000))

0.0211110115051

t2 = Timer('dectobin2(255)', 'from __main__ import dectobin2')
print(t2.timeit(number = 1000))

0.0094211101532
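The question does not show the Stack class itself. A minimal sketch consistent with its description (push wraps list.append, pop wraps list.pop, isEmpty compares against an empty list) might look like the following. The attribute name `_items` is an assumption made for illustration, and the syntax assumes Python 3.

```python
class Stack:
    """Thin wrapper around a list, as described in the question."""

    def __init__(self):
        self._items = []  # hypothetical attribute name, not from the question

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def isEmpty(self):
        return self._items == []


def dectobin1(num):
    # Same logic as the question's dectobin1, repeated so the block runs alone.
    s = Stack()
    while num > 0:
        s.push(num % 2)
        num = num // 2
    binnum = ''
    while not s.isEmpty():
        binnum = binnum + str(s.pop())
    return binnum


print(dectobin1(255))
```

Every push, pop and isEmpty call here is an extra Python-level function call on top of the underlying list operation, which is the overhead the question is measuring.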


Recommended answer

First off, a warning: function calls are rarely what limits your speed, so this is often an unnecessary micro-optimisation. Only do it if function-call overhead is what actually limits your performance. Profile first, and check whether there is a better way to optimise.

Make sure you do not sacrifice readability for this kind of minor performance tweak!

Classes in Python are a little bit of a hack.

The way it works is that each object has a __dict__ field (a dict) which contains all attributes the object contains. Also each object has a __class__ object which again contains a __dict__ field (again a dict) which contains all class attributes.

For example, look at this:

>>> class X(): # I know this is an old-style class declaration, but this causes far less clutter for this demonstration
...     def y(self):
...             pass
...
>>> x = X()
>>> x.__class__.__dict__
{'y': <function y at 0x6ffffe29938>, '__module__': '__main__', '__doc__': None}

If you define a function dynamically (so not in the class declaration but after the object creation) the function does not go to the x.__class__.__dict__ but instead to x.__dict__.
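As a small sketch of that point (assuming Python 3 syntax, with the names `X`, `y` and `z` chosen to match the discussion), the two dictionaries can be inspected directly with `vars()`:

```python
class X:
    def y(self):
        return "y from class"


def z():
    return "z from instance"


x = X()
x.z = z  # attached after object creation, so it is stored on the instance

# vars(obj) returns obj.__dict__
print("z" in vars(x))   # stored in the instance dict
print("z" in vars(X))   # not stored in the class dict
print("y" in vars(X))   # methods from the class body live in the class dict
print("y" in vars(x))   # and are absent from the instance dict
```

Note that a function attached this way is a plain attribute, not a bound method, so it is called without `self`.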

Also there are two dicts that hold all variables accessible from the current function. There is globals() and locals() which include all global and local variables.

So now let's say, you have an object x of class X with functions y and z that were declared in the class declaration and a second function z, which was defined dynamically. Let's say object x is defined in global space. Also, for comparison, there are two functions flocal(), which was defined in local space and fglobal(), which was defined in global space.

Now I will show what happens if you call each of these functions:

flocal():
    locals()["flocal"]()

fglobal():
    locals()["fglobal"] -> not found
    globals()["fglobal"]()

x.y():
    locals()["x"] -> not found
    globals()["x"].__dict__["y"] -> not found, because y is in class space
                  .__class__.__dict__["y"]()

x.z():
    locals()["x"] -> not found
    globals()["x"].__dict__["z"]() -> found in object dict, ignoring z() in class space
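The shadowing in the x.z() case can be checked with a short sketch. It assumes Python 3 and ordinary methods (plain functions defined in the class body); the return strings are made up for illustration.

```python
class X:
    def y(self):
        return "class y"

    def z(self):
        return "class z"


x = X()
x.z = lambda: "instance z"  # dynamically defined, lands in x.__dict__

result_y = x.y()        # not in x.__dict__, so the class version is used
result_z = x.z()        # the instance entry shadows the class method

del x.z                 # remove the instance entry
result_z_after = x.z()  # lookup falls back to the class method again

print(result_y, result_z, result_z_after)
```

This only illustrates which definition wins; it is not a full description of Python's attribute lookup rules.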

So as you can see, class-space methods take a lot more time to look up, and object-space methods are slow as well. The fastest option is a local function.

But you can get around that without sacrificing classes. Let's say x.y() is called quite a lot and needs to be optimised.

class X():
    def y(self):
        pass

x = X()
for i in range(100000):
    x.y() # slow

y = x.y # move the function lookup outside of loop
for i in range(100000):
    y() # faster
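To see how large the effect is on a given interpreter, the two loops can be timed with the standard `timeit` module. This is a rough sketch assuming Python 3; the helper names `call_with_lookup` and `call_hoisted` are invented here, and the size of the gap depends heavily on the Python version, so no expected numbers are shown.

```python
from timeit import timeit


class X:
    def y(self):
        pass


def call_with_lookup(x, n):
    for _ in range(n):
        x.y()  # attribute lookup on every iteration


def call_hoisted(x, n):
    y = x.y  # look the bound method up once
    for _ in range(n):
        y()  # only a local variable access per iteration


x = X()
t_lookup = timeit(lambda: call_with_lookup(x, 100_000), number=10)
t_hoisted = timeit(lambda: call_hoisted(x, 100_000), number=10)
print("lookup in loop: %.4f s" % t_lookup)
print("hoisted lookup: %.4f s" % t_hoisted)
```

Measuring like this before rewriting anything is in line with the profiling advice at the top of the answer.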

Similar things happen with member variables of objects. They are also slower than local variables. The effect also adds up, if you call a function or use a member variable that is in an object that is a member variable of a different object. So for example

a.b.c.d.e.f()

would be a fair bit slower as each dot needs another dictionary lookup.
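One way to make the per-dot cost visible is to count attribute lookups instead of timing them. The sketch below assumes Python 3; the `Counting` class, the `lookups` counter and the helper functions are all hypothetical names created for this illustration.

```python
lookups = {"count": 0}


class Counting:
    """Counts every attribute read made on its instances."""

    def __getattribute__(self, name):
        lookups["count"] += 1
        return object.__getattribute__(self, name)


def make_chain():
    # Build a.b.c.d.e.f bottom-up so that construction triggers no reads.
    e = Counting()
    e.f = lambda: 42
    d = Counting()
    d.e = e
    c = Counting()
    c.d = d
    b = Counting()
    b.c = c
    a = Counting()
    a.b = b
    return a


def chained(a, n):
    for _ in range(n):
        a.b.c.d.e.f()  # five attribute reads per call


def hoisted(a, n):
    f = a.b.c.d.e.f  # five attribute reads, once
    for _ in range(n):
        f()


def count_lookups(fn, a, n):
    lookups["count"] = 0
    fn(a, n)
    return lookups["count"]


a = make_chain()
print(count_lookups(chained, a, 3))
print(count_lookups(hoisted, a, 3))
```

With three iterations the chained version performs five reads per call, while the hoisted version performs five reads in total, regardless of the loop length.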

The Python wiki's performance tips page recommends avoiding dots in performance-critical parts of the code: https://wiki.python.org/moin/PythonSpeed/PerformanceTips
