Check if string begins with one of several substrings in Python


Question

I couldn't figure out how to perform line.startswith("substring") for a set of substrings, so I tried a few variations on the code below, relying on the fact that all of my header prefixes are exactly 4 characters long. I'm pretty sure I've got the syntax wrong, though, because this doesn't reject any lines.

(Context: my aim is to throw out header lines when reading in a file. Header lines start with one of a limited set of strings, but I can't just check for the substring anywhere in the line, because a valid content line may include one of the keywords later in the string.)

cleanLines = []
line = "sample input here"
if not line[0:3] in ["node", "path", "Path"]:  #skip standard headers
    cleanLines.append(line)

Answer

Your problem stems from the fact that string slicing is exclusive of the stop index, so line[0:3] yields only three characters:

In [7]: line = '0123456789'

In [8]: line[0:3]
Out[8]: '012'

In [9]: line[0:4]
Out[9]: '0123'

In [10]: line[:3]
Out[10]: '012'

In [11]: line[:4]
Out[11]: '0123'

Slicing a string between i and j returns the substring starting at i and ending at (but not including) j.
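
Applied to the code in the question, this is why no line was ever rejected: a three-character slice can never equal a four-character keyword. A minimal sketch, using a made-up header line for illustration:

```python
# Hypothetical header line, used for illustration only.
line = "node 1 2 3"
keywords = ["node", "path", "Path"]

three = line[0:3]  # characters at indices 0, 1, 2
four = line[0:4]   # characters at indices 0, 1, 2, 3

print(three, three in keywords)  # nod False
print(four, four in keywords)    # node True
```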

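To make the lookup faster, you might want to test membership in a set instead of a list. A set lookup takes constant time on average rather than scanning every element, although with only three keywords the difference is negligible. Note that this slice-based check only works because every keyword is exactly 4 characters long: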

cleanLines = []
line = "sample input here"
blacklist = {"node", "path", "Path"}
if line[:4] not in blacklist:  # skip standard headers
    cleanLines.append(line)

Now, what you're actually doing with that code is a startswith check, and startswith is not restricted to prefixes of any particular length:

In [12]: line = '0123456789'

In [13]: line.startswith('0')
Out[13]: True

In [14]: line.startswith('0123')
Out[14]: True

In [15]: line.startswith('03')
Out[15]: False

So you could do this to exclude headers:

cleanLines = []
line = "sample input here"
headers = ["node", "path", "Path"]
if not any(line.startswith(header) for header in headers):  # skip standard headers
    cleanLines.append(line)
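
As a possible simplification, str.startswith also accepts a tuple of prefixes and returns True if the string starts with any of them, so the any(...) generator is not strictly needed. The sketch below assumes the lines are already in a list; the helper name drop_headers and the sample lines are hypothetical and only serve as an illustration:

```python
# Prefixes that mark a header line. str.startswith needs a tuple here;
# passing a list raises TypeError.
HEADER_PREFIXES = ("node", "path", "Path")


def drop_headers(lines, prefixes=HEADER_PREFIXES):
    """Return the lines that do not begin with any of the given prefixes."""
    return [line for line in lines if not line.startswith(prefixes)]


# Made-up sample input for illustration.
sample = [
    "node id x y",
    "Path summary",
    "12 0.5 a path appears later in this line",
    "sample input here",
]

clean_lines = drop_headers(sample)
print(clean_lines)
```

Because the check is anchored at the start of the line, the third sample line is kept even though it contains "path" further along, and the prefixes no longer have to share the same length.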
