Python finding the common parts of a string throughout a list and removing it from every item
Question
I have a list of file directories that looks similar to this:
path/new/stuff/files/morefiles/A/file2.txt
path/new/stuff/files/morefiles/B/file7.txt
path/new/stuff/files/morefiles/A/file1.txt
path/new/stuff/files/morefiles/C/file5.txt
I am trying to find the leading part of the path that is the same for every item in the list, and then delete that part from each item.
The list can be any length. In the example, I would be trying to change the list into:
A/file2.txt
B/file7.txt
A/file1.txt
C/file5.txt
Methods like re.sub(r'.*I', 'I', filepath) and filepath.split('_', 1)[-1] can be used for the replacing, but I'm not sure how to find the common parts in the list of file paths.
Note: I am using Windows and Python 3.
Recommended answer
The first part of the answer is here: Python: Determine prefix from a set of (similar) strings
Use os.path.commonprefix() to find the longest common leading part of the strings.
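As a minimal sketch of that suggestion, the standard-library function can be combined with slicing directly. The helper name strip_common_prefix is made up for illustration; the sample paths are the ones from the question.

```python
import os.path

# Sample data copied from the question.
paths = [
    "path/new/stuff/files/morefiles/A/file2.txt",
    "path/new/stuff/files/morefiles/B/file7.txt",
    "path/new/stuff/files/morefiles/A/file1.txt",
    "path/new/stuff/files/morefiles/C/file5.txt",
]


def strip_common_prefix(items):
    """Return a new list with the longest common leading string removed.

    Hypothetical helper, not part of the original answer.
    """
    # os.path.commonprefix compares character by character, not per directory.
    prefix = os.path.commonprefix(items)
    # Slice the prefix off each item instead of mutating the input list.
    return [item[len(prefix):] for item in items]


print(strip_common_prefix(paths))
# ['A/file2.txt', 'B/file7.txt', 'A/file1.txt', 'C/file5.txt']
```

Returning a new list leaves the original paths available in case they are still needed to open the files.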
The code from that answer for finding the part that is the same across the list is:
# Return the longest prefix of all list elements.
def commonprefix(m):
    "Given a list of pathnames, returns the longest common leading component"
    if not m: return ''
    s1 = min(m)
    s2 = max(m)
    for i, c in enumerate(s1):
        if c != s2[i]:
            return s1[:i]
    return s1
Now all you have to do is use slicing to remove the resulting string from each item in the list.
This results in:
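# Remove the longest common prefix from every list element (in place).
def commonprefix(m):
    "Given a list of pathnames, strips the longest common leading component from each"
    if not m: return m
    s1 = min(m)
    s2 = max(m)
    # Default for when the loop never breaks: s1 is itself a prefix of every item.
    ans = s1
    for i, c in enumerate(s1):
        if c != s2[i]:
            ans = s1[:i]
            break
    for each in range(len(m)):
        # Slice by length; str.split raises ValueError when ans is empty.
        m[each] = m[each][len(ans):]
    return m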
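One caveat: the prefix is compared character by character, so it can end in the middle of a directory name (for example, sibling folders data1 and data2 share the letters "data"). The following is a rough sketch of one way to cut only at a separator. The name strip_common_dirs and the sep parameter are assumptions made for illustration, and it assumes forward slashes as in the question's sample paths.

```python
import os.path


def strip_common_dirs(items, sep="/"):
    """Remove only whole leading directories shared by every item.

    Hypothetical helper; `sep` is assumed to be the separator used in `items`.
    """
    prefix = os.path.commonprefix(items)
    # Back up to just after the last separator so no directory name is split.
    # rfind returns -1 when sep is absent, which makes cut 0 (nothing removed).
    cut = prefix.rfind(sep) + 1
    return [item[cut:] for item in items]


print(strip_common_dirs(["root/data1/a.txt", "root/data2/b.txt"]))
# ['data1/a.txt', 'data2/b.txt']
```

For paths written with backslashes, sep would need to be passed explicitly.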