'{0}'.format() is faster than str() and '{}'.format() using IPython %timeit and otherwise using pure Python


Question

This is a CPython observation; I'm not sure whether other implementations behave the same way.

'{0}'.format() appears to be faster than both str() and '{}'.format(). I'm posting results from Python 3.5.2, but I also tried Python 2.7.12 and the trend is the same.

%timeit q=['{0}'.format(i) for i in range(100, 100000, 100)]
%timeit q=[str(i) for i in range(100, 100000, 100)]
%timeit q=['{}'.format(i) for i in range(100, 100000, 100)]

1000 loops, best of 3: 231 µs per loop
1000 loops, best of 3: 298 µs per loop
1000 loops, best of 3: 434 µs per loop

From the docs on object.__str__(self):


Called by str(object) and the built-in functions format() and print() to compute the "informal" or nicely printable string representation of an object.

So str() and format() call the same object.__str__(self) method. Where does the difference in speed come from?

UPDATE: As @StefanPochmann and @Leon noted in the comments, they get different results. I reran the tests with python -m timeit "..." and they are right; the results are:

$ python3 -m timeit "['{0}'.format(i) for i in range(100, 100000, 100)]"
1000 loops, best of 3: 441 usec per loop

$ python3 -m timeit "[str(i) for i in range(100, 100000, 100)]"
1000 loops, best of 3: 297 usec per loop

$ python3 -m timeit "['{}'.format(i) for i in range(100, 100000, 100)]"
1000 loops, best of 3: 420 usec per loop

So it seems that IPython is doing something strange...

NEW QUESTION: What is the preferred way to convert an object to str, in terms of speed?

Recommended answer

The IPython timing is just off for some reason (though, when tested with a longer format string in separate cells, it behaved slightly better). Maybe running all the statements in the same cell isn't right; I don't really know.

Either way, "{}" is a bit faster than "{pos}", which is faster than "{name}", and all of them are slower than str.
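To check this ordering outside IPython, here is a minimal sketch using the standard timeit module. The labels, the repeat counts, and the field name n are arbitrary choices made for illustration, and the absolute numbers will vary with machine and Python version.

```python
import timeit

# Same inputs as in the question: 999 integers.
values = list(range(100, 100000, 100))

# Each candidate converts every value to a string in a different way.
candidates = {
    "str(i)": lambda: [str(i) for i in values],
    "'{}'.format(i)": lambda: ["{}".format(i) for i in values],
    "'{0}'.format(i)": lambda: ["{0}".format(i) for i in values],
    "'{n}'.format(n=i)": lambda: ["{n}".format(n=i) for i in values],
}

LOOPS = 50  # calls per timing run (arbitrary)

results = {}
for label, func in candidates.items():
    # Keep the fastest of several runs; slower runs mostly reflect system noise.
    best = min(timeit.repeat(func, number=LOOPS, repeat=3))
    results[label] = best / LOOPS  # seconds per loop
    print("{:<20} {:8.1f} usec per loop".format(label, results[label] * 1e6))
```

Running each statement in a fresh interpreter with python -m timeit, as in the question's update, is another way to rule out interference from the interactive session.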

str(val) is the fastest way to convert an object to str: it directly calls the object's __str__, if one exists, and returns the resulting string. Others, like format (or str.format), carry additional overhead: an extra function call (to format itself), handling any arguments, parsing the format string, and only then invoking the __str__ of their args.
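That all of these routes end in the same __str__ can be seen with a small sketch. The Traced class is hypothetical and exists only to count calls; it relies on the default object.__format__, which falls back to str() when the format spec is empty.

```python
class Traced:
    """Hypothetical class that counts how often its __str__ runs."""

    def __init__(self):
        self.calls = 0

    def __str__(self):
        self.calls += 1
        return "traced"


obj = Traced()

# Four different routes to a string; each one ends up in Traced.__str__.
outputs = [str(obj), format(obj), "{}".format(obj), "{0}".format(obj)]

print(outputs)
print("__str__ calls:", obj.calls)
```

The results are identical, so the timing differences come from the work done before __str__ is reached, not from the conversion itself.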

For the str.format method, "{}" uses automatic numbering. From a small section in the docs on the format syntax:


Changed in version 3.1: The positional argument specifiers can be omitted, so '{} {}' is equivalent to '{0} {1}'.

That is, if you supply a string of the form:

"{}{}{}".format(1, 2, 3)

CPython immediately knows that this is equivalent to:

"{0}{1}{2}".format(1, 2, 3)

With a format string that contains numbers indicating positions, CPython can't assume a strictly increasing sequence starting from 0 and must parse every single bracket to get it right, which slows things down a bit:

"{1}{2}{0}".format(1, 2, 3)

That's also why mixing the two styles is not allowed:

"{1}{}{2}".format(1, 2, 3)

You'll get a nice ValueError back when you attempt to do so:

ValueError: cannot switch from manual field specification to automatic field numbering
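The numbering rules above can be checked with a short sketch. The helpers field_names and try_format are hypothetical names introduced here; string.Formatter().parse is used only to show which field names the parser sees, on the assumption that it tokenizes the string the same way str.format does.

```python
from string import Formatter


def field_names(fmt):
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion).
    return [name for _, name, _, _ in Formatter().parse(fmt)]


def try_format(fmt, *args):
    # Return the formatted string, or the error text if formatting fails.
    try:
        return fmt.format(*args)
    except ValueError as exc:
        return "ValueError: {}".format(exc)


# Automatic numbering leaves the field names empty; manual numbering fills them in.
auto_fields = field_names("{}{}{}")
manual_fields = field_names("{0}{1}{2}")
print(auto_fields, manual_fields)

# Both spellings produce the same result; explicit indices may also reorder.
same = try_format("{}{}{}", 1, 2, 3) == try_format("{0}{1}{2}", 1, 2, 3)
reordered = try_format("{1}{2}{0}", 1, 2, 3)
print(same, reordered)

# Mixing the two styles fails in either order.
manual_then_auto = try_format("{1}{}{2}", 1, 2, 3)
auto_then_manual = try_format("{}{1}{2}", 1, 2, 3)
print(manual_then_auto)
print(auto_then_manual)
```

The wording of the error depends on which style appears first in the format string.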

CPython also fetches these positional arguments with PySequence_GetItem, which I'm fairly sure is fast, at least compared to PyObject_GetItem (see the next paragraph).

For "{name}" fields, CPython always has extra work to do because we're dealing with keyword arguments rather than positional ones. This includes building the dictionary for the call and generating more LOAD bytecode instructions to load the keys and values. The keyword form of function calling always introduces some overhead. In addition, it seems that fetching the values actually uses PyObject_GetItem, which incurs some extra overhead due to its generic nature.
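The difference in generated bytecode can be inspected with the dis module. This is only a sketch: the helper opnames is a hypothetical name, and the exact opcode names differ between CPython versions, so the output is meant for side-by-side comparison on your own interpreter rather than as a fixed reference.

```python
import dis


def opnames(source):
    # Compile a single expression and list the opcode names it produces.
    code = compile(source, "<sketch>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


# The name i is never evaluated here; the expressions are only compiled.
positional = opnames("'{0}'.format(i)")
keyword = opnames("'{name}'.format(name=i)")

print("positional:", positional)
print("keyword:   ", keyword)
```

On the versions I am aware of, the keyword call shows an additional instruction for the keyword names, but treat that as something to verify locally.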
