'{0}'.format() is faster than str() and '{}'.format() using IPython %timeit and otherwise using pure Python
Question
This is about CPython; I'm not sure whether other implementations behave the same way.
But '{0}'.format() is faster than str() and '{}'.format(). I'm posting results from Python 3.5.2, but I tried it with Python 2.7.12 and the trend is the same.
%timeit q=['{0}'.format(i) for i in range(100, 100000, 100)]
%timeit q=[str(i) for i in range(100, 100000, 100)]
%timeit q=['{}'.format(i) for i in range(100, 100000, 100)]
1000 loops, best of 3: 231 µs per loop
1000 loops, best of 3: 298 µs per loop
1000 loops, best of 3: 434 µs per loop
From the docs on object.__str__(self):
Called by str(object) and the built-in functions format() and print() to compute the "informal" or nicely printable string representation of an object.
So str() and format() call the same object.__str__(self) method, but where does the difference in speed come from?
UPDATE
As @StefanPochmann and @Leon noted in the comments, they get different results. I tried running it with python -m timeit "..." and they are right; the results are:
$ python3 -m timeit "['{0}'.format(i) for i in range(100, 100000, 100)]"
1000 loops, best of 3: 441 usec per loop
$ python3 -m timeit "[str(i) for i in range(100, 100000, 100)]"
1000 loops, best of 3: 297 usec per loop
$ python3 -m timeit "['{}'.format(i) for i in range(100, 100000, 100)]"
1000 loops, best of 3: 420 usec per loop
So it seems that IPython is doing something strange...
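The same comparison can also be run from a plain script, without IPython or the command-line wrapper. The following is a minimal sketch using the standard timeit module; the names STATEMENTS and measure are made up for this illustration, and the absolute numbers will differ by machine and Python version.

```python
import timeit

# The three conversions from the question, written as statements for timeit.
STATEMENTS = {
    "'{0}'.format(i)": "['{0}'.format(i) for i in range(100, 100000, 100)]",
    "str(i)": "[str(i) for i in range(100, 100000, 100)]",
    "'{}'.format(i)": "['{}'.format(i) for i in range(100, 100000, 100)]",
}


def measure(number=200, repeat=3):
    """Return the best per-loop time in microseconds for each statement."""
    results = {}
    for label, stmt in STATEMENTS.items():
        # timeit.repeat returns one total time per repetition; keep the best.
        best = min(timeit.repeat(stmt, number=number, repeat=repeat))
        results[label] = best / number * 1e6
    return results


if __name__ == "__main__":
    for label, usec in measure().items():
        print("{:<18} {:8.1f} usec per loop".format(label, usec))
```

Taking the minimum over several repetitions mirrors the "best of 3" figure that both %timeit and python -m timeit report above.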
NEW QUESTION: What is the preferred way, in terms of speed, to convert an object to str?
Recommended answer
The IPython timing is just off for some reason (though when tested with a longer format string in different cells, it behaved slightly better). Maybe executing everything in the same cell isn't right; I don't really know.
Either way, "{}" is a bit faster than "{pos}", which is faster than "{name}", while they're all slower than str.
str(val) is the fastest way to transform an object to str; it directly calls the object's __str__, if one exists, and returns the resulting string. Others, like format (or str.format), include additional overhead due to an extra function call (to format itself), handling any arguments, parsing the format string, and then invoking the __str__ of their args.
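To illustrate that every route ends up in the same __str__, here is a small sketch with a hypothetical Probe class (not part of the original question) that counts how often its __str__ runs. It relies on the default object.__format__, which falls back to str(self) when the format spec is empty.

```python
class Probe:
    """Hypothetical class that counts how often __str__ is invoked."""

    def __init__(self):
        self.calls = 0

    def __str__(self):
        self.calls += 1
        return "probe"


p = Probe()

# Five different spellings; each one should trigger exactly one __str__ call.
results = [
    str(p),
    format(p),
    "{}".format(p),
    "{0}".format(p),
    "{name}".format(name=p),
]

print(results)
print("__str__ calls:", p.calls)
```

The resulting strings are identical, so any speed difference comes from the work done before __str__ is reached, not from __str__ itself.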
For the str.format method, "{}" uses automatic numbering; from a small section in the docs on the format syntax:
Changed in version 3.1: The positional argument specifiers can be omitted, so '{} {}' is equivalent to '{0} {1}'.
That is, if you supply a string of the form:
"{}{}{}".format(1, 2, 3)
CPython immediately knows that this is equivalent to:
"{0}{1}{2}".format(1, 2, 3)
With a format string that contains numbers indicating positions, CPython can't assume a strictly increasing number (starting from 0) and must parse every single bracket to get it right, slowing things down a bit in the process:
"{1}{2}{0}".format(1, 2, 3)
That's also why mixing the two is not allowed:
"{1}{}{2}".format(1, 2, 3)
You'll get a nice ValueError back when you attempt to do so:
ValueError: cannot switch from manual field specification to automatic field numbering
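The wording of the error depends on which numbering style appears first in the format string. A short sketch that triggers both directions; try_format is a helper invented for this example:

```python
def try_format(template, *args):
    """Format the template, returning the error text instead of raising."""
    try:
        return template.format(*args)
    except ValueError as exc:
        return "ValueError: {}".format(exc)


# Explicit index first, then an empty field.
manual_then_auto = try_format("{1}{}{2}", 1, 2, 3)

# Empty field first, then an explicit index.
auto_then_manual = try_format("{}{1}", 1, 2)

print(manual_then_auto)
print(auto_then_manual)
```

Either way the call fails as soon as the parser meets the first field that does not match the style it has already committed to.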
It also grabs these positionals with PySequence_GetItem, which I'm pretty sure is fast, at least in comparison to PyObject_GetItem [see next].
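The difference between automatic and explicit numbering can be made visible with string.Formatter from the standard library. Note the assumption: Formatter is the Python-level counterpart of the parser, not the C code path that str.format itself takes, so this sketch only shows what the parser sees, not how fast it is. The helper field_names is made up for the example.

```python
from string import Formatter


def field_names(template):
    """Return the field name of each replacement field in the template."""
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion).
    return [field for _, field, _, _ in Formatter().parse(template)]


auto = field_names("{}{}{}")         # empty names: numbering is implicit
manual = field_names("{1}{2}{0}")    # explicit names: each must be read

print(auto)
print(manual)

# Both styles are valid; explicit indices may reorder the arguments.
same = "{}{}{}".format(1, 2, 3) == "{0}{1}{2}".format(1, 2, 3)
reordered = "{1}{2}{0}".format(1, 2, 3)
print(same, reordered)
```

With empty field names the formatter can simply count upward, whereas explicit names have to be converted and looked up one by one.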
For "{name}" values, CPython always has extra work to do because we're dealing with keyword arguments rather than positional ones; this includes things like building the dictionary for the call and generating more LOAD byte-code instructions for loading keys and values. The keyword form of function calling always introduces some overhead. In addition, it seems that the lookup actually uses PyObject_GetItem, which incurs some extra overhead due to its generic nature.
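The extra call overhead for keyword arguments can be inspected with the dis module. This is only a sketch: the exact instructions differ between CPython versions, so no particular output is claimed here, and the function names positional and keyword are chosen just for the illustration.

```python
import dis


def positional(i):
    # Argument is passed by position.
    return "{0}".format(i)


def keyword(i):
    # Argument is passed by keyword, so the name must be supplied as well.
    return "{name}".format(name=i)


if __name__ == "__main__":
    print("positional:")
    dis.dis(positional)
    print("keyword:")
    dis.dis(keyword)
```

Comparing the two listings on your own interpreter shows how the keyword call is set up differently from the positional one, even though both functions return the same string.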