How to retrieve partial matches from a list of strings
Question
For approaches to retrieving partial matches in a numeric list, go to:
But if you're looking for how to retrieve partial matches for a list of strings, you'll find the best approaches concisely explained in the answer below.
SO: Python list lookup with partial match shows how to return a bool if a list contains an element that partially matches (e.g. begins, ends, or contains) a certain string. But how can you return the element itself, instead of True or False?
l = ['ones', 'twos', 'threes']
wanted = 'three'
Here, the approach in the linked question will return True using:
any(s.startswith(wanted) for s in l)
So how do you return the element 'threes' instead?
Recommended answer
- startswith and in each return a Boolean.
- The in operator is a test of membership.
- The matching elements can be collected with a list-comprehension or filter.
- Using a list-comprehension with the in operator is the fastest implementation tested.
- If case is not an issue, consider mapping all the words to lowercase.
    - l = list(map(str.lower, l))
- Using filter creates a filter object, so list() is used to show all the matching values in a list.
l = ['ones', 'twos', 'threes']
wanted = 'three'

# using startswith
result = list(filter(lambda x: x.startswith(wanted), l))

# using in
result = list(filter(lambda x: wanted in x, l))

print(result)
[out]:
['threes']
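Because filter returns a lazy iterator, a single match can be pulled from it with next() instead of building the full list. The following is a minimal sketch reusing l and wanted from the question; the names first_match and no_match, and the choice of None as the fallback value, are illustrative assumptions rather than part of the original answer.

```python
l = ['ones', 'twos', 'threes']
wanted = 'three'

# filter() is lazy: next() stops at the first element that passes the test.
# The second argument to next() is returned when nothing matches.
first_match = next(filter(lambda x: x.startswith(wanted), l), None)
print(first_match)  # threes

no_match = next(filter(lambda x: x.startswith('four'), l), None)
print(no_match)  # None
```

This may be preferable when only one element is expected, since iteration stops as soon as a match is found.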
list-comprehension
l = ['ones', 'twos', 'threes']
wanted = 'three'

# using startswith
result = [v for v in l if v.startswith(wanted)]

# using in
result = [v for v in l if wanted in v]

print(result)
[out]:
['threes']
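The question also mentions matching on the end of a string, and the answer suggests lowercasing when case does not matter. One possible way to combine these is sketched below. The helper name partial_matches and its mode and ignore_case parameters are hypothetical, and str.casefold is used here in place of the str.lower mapping from the answer so that the original strings are returned unchanged.

```python
def partial_matches(items, wanted, mode="contains", ignore_case=False):
    """Return the elements of items that partially match wanted.

    Hypothetical helper for illustration; mode is one of
    'startswith', 'endswith' or 'contains'.
    """
    tests = {
        "startswith": lambda s, w: s.startswith(w),
        "endswith": lambda s, w: s.endswith(w),
        "contains": lambda s, w: w in s,
    }
    test = tests[mode]  # raises KeyError for an unknown mode
    if ignore_case:
        # Compare case-folded copies, but return the original strings.
        w = wanted.casefold()
        return [s for s in items if test(s.casefold(), w)]
    return [s for s in items if test(s, wanted)]


l = ['ones', 'twos', 'Threes']
print(partial_matches(l, 'three'))                    # []
print(partial_matches(l, 'three', ignore_case=True))  # ['Threes']
print(partial_matches(l, 'os', mode='endswith'))      # ['twos']
```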
Which implementation is faster?
- Using the words corpus from nltk
- Words with 'three'
    - ['three', 'threefold', 'threefolded', 'threefoldedness', 'threefoldly', 'threefoldness', 'threeling', 'threeness', 'threepence', 'threepenny', 'threepennyworth', 'threescore', 'threesome']
from nltk.corpus import words

%timeit list(filter(lambda x: x.startswith(wanted), words.words()))
[out]:
47.4 ms ± 1.9 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit list(filter(lambda x: wanted in x, words.words()))
[out]:
27 ms ± 1.78 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit [v for v in words.words() if v.startswith(wanted)]
[out]:
34.1 ms ± 768 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit [v for v in words.words() if wanted in v]
[out]:
14.5 ms ± 63.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
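%timeit is an IPython magic, and the timings above will vary by machine. For a rough comparison in plain Python, the standard-library timeit module can be used. The sketch below assumes a synthetic word_list as a stand-in for the nltk corpus (so it runs without any download) and builds the list once outside the timed calls. The names word_list, candidates and timings are illustrative, and no particular timings are implied.

```python
import timeit

# Synthetic stand-in for the nltk words corpus (an assumption, so the
# sketch runs without downloading anything).
word_list = ['word%d' % i for i in range(20_000)] + ['three', 'threefold', 'threesome']
wanted = 'three'

candidates = {
    'filter + startswith': lambda: list(filter(lambda x: x.startswith(wanted), word_list)),
    'filter + in': lambda: list(filter(lambda x: wanted in x, word_list)),
    'comprehension + startswith': lambda: [v for v in word_list if v.startswith(wanted)],
    'comprehension + in': lambda: [v for v in word_list if wanted in v],
}

timings = {}
for name, func in candidates.items():
    # Best of 3 repeats, 10 calls each, reported as seconds per call.
    timings[name] = min(timeit.repeat(func, number=10, repeat=3)) / 10
    print(f'{name}: {timings[name] * 1e3:.2f} ms per call')
```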