How does str.startswith really work?
Question
I've been playing with startswith() for a bit and I've discovered something interesting:
>>> tup = ('1', '2', '3')
>>> lis = ['1', '2', '3', '4']
>>> '1'.startswith(tup)
True
>>> '1'.startswith(lis)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: startswith first arg must be str or a tuple of str, not list
Now, the error is obvious, and casting the list to a tuple works just fine, as the tuple did in the first place:
>>> '1'.startswith(tuple(lis))
True
Now, my question is: why must the first argument be a str or a tuple of str prefixes, but not a list of str prefixes?
AFAIK, the Python code for startswith() might look like this:
def startswith(src, prefix):
    return src[:len(prefix)] == prefix
But that just confuses me more, because even with that in mind, it still shouldn't make any difference whether it is a list or a tuple. What am I missing?
Recommended answer
Technically, there is no reason other sequence types couldn't be accepted. The source code roughly does this:
if isinstance(prefix, tuple):
    for substring in prefix:
        if not isinstance(substring, str):
            raise TypeError(...)
        if tailmatch(...):
            return True
    return False
elif not isinstance(prefix, str):
    raise TypeError(...)
return tailmatch(...)
(where tailmatch(...) does the actual matching work).
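To make the control flow above concrete, here is a minimal pure-Python sketch of the same logic. The name startswith_sketch is hypothetical, the slice comparison stands in for tailmatch(...), the optional start and end arguments are left out, and the error messages are only approximations of the real ones. It is an illustration under those assumptions, not the actual implementation.

```python
def startswith_sketch(src, prefix):
    """Approximate str.startswith in pure Python (no start/end support)."""
    if isinstance(prefix, tuple):
        # Try each candidate in order; the first match wins.
        for candidate in prefix:
            if not isinstance(candidate, str):
                raise TypeError(
                    "tuple for startswith must only contain str, "
                    f"not {type(candidate).__name__}"
                )
            # Stand-in for tailmatch(...): compare the leading slice.
            if src[:len(candidate)] == candidate:
                return True
        return False
    if not isinstance(prefix, str):
        # Lists, sets, generators and so on all end up here.
        raise TypeError(
            "startswith first arg must be str or a tuple of str, "
            f"not {type(prefix).__name__}"
        )
    return src[:len(prefix)] == prefix
```

With this sketch, startswith_sketch('1', ('1', '2', '3')) returns True, while startswith_sketch('1', ['1', '2', '3']) raises TypeError, matching the behaviour shown in the question.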
So yes, any iterable would do for that for loop. But all the other string test APIs that take multiple values (as well as isinstance() and issubclass()) also accept only tuples, and this tells you as a user of the API that it is safe to assume the value won't be mutated. You can't mutate a tuple, but the method could in theory mutate a list.
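The consistency across these APIs can be checked directly. The short sketch below uses a hypothetical helper, raises_type_error, to record that each of them rejects a list but accepts the equivalent tuple.

```python
def raises_type_error(call):
    """Return True if calling `call` raises TypeError, else False."""
    try:
        call()
    except TypeError:
        return True
    return False


prefixes = ["1", "2", "3"]
types = [str, int]

# Every one of these APIs rejects a list of alternatives...
list_rejected = {
    "startswith": raises_type_error(lambda: "1".startswith(prefixes)),
    "endswith": raises_type_error(lambda: "1".endswith(prefixes)),
    "isinstance": raises_type_error(lambda: isinstance("1", types)),
    "issubclass": raises_type_error(lambda: issubclass(str, types)),
}

# ...and accepts the same alternatives once they are packed into a tuple.
tuple_accepted = {
    "startswith": "1".startswith(tuple(prefixes)),
    "endswith": "1".endswith(tuple(prefixes)),
    "isinstance": isinstance("1", tuple(types)),
    "issubclass": issubclass(str, tuple(types)),
}

print(list_rejected)
print(tuple_accepted)
```

Every value in both dictionaries is True.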
Also note that you usually test for a fixed number of prefixes, suffixes, or classes (in the case of isinstance() and issubclass()); the implementation is not suited to a large number of elements. A tuple implies that you have a limited number of elements, while lists can be arbitrarily large.
Next, if any iterable or sequence type were acceptable, that would include strings; a single string is also a sequence. Should a single string argument then be treated as separate characters, or as a single prefix?
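A small example shows how far the two readings can diverge. Here the same three characters are passed once as a single prefix and once as a tuple of one-character prefixes; the variable names are chosen only for illustration.

```python
text = "abc"
prefix = "cab"

# Read as one prefix: "abc" does not start with "cab".
as_single_prefix = text.startswith(prefix)

# Read as separate characters ('c', 'a', 'b'): "abc" starts with 'a'.
as_separate_chars = text.startswith(tuple(prefix))

print(as_single_prefix)   # False
print(as_separate_chars)  # True
```

Because only a tuple is treated as a collection of alternatives, a bare string always means a single prefix, and the caller has to opt in to the other reading explicitly.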
So in other words, the limitation self-documents that the sequence won't be mutated, is consistent with other APIs, implies a limited number of items to test against, and removes ambiguity about how a single string argument should be treated.
Note that this was brought up before on the Python Ideas list; see this thread. Guido van Rossum's main argument there is that you either special-case single strings or accept only a tuple. He picked the latter and doesn't see a need to change this.