Find strings and substrings from the wordlist
Problem description
I have a test.txt file and want to find strings and substrings from this word list:
<aardwolf>
<Aargau>
<Aaronic>
<aac>
<akac>
<abaca>
<abactinal>
<abacus>
The test.py file:
import sys # the sys module
import os
import re
def hasattr(str,list):
    expr = re.compile(str)
    # yield the elements
    return [elem for elem in list if expr.match(elem)]
isword = {}
FH = open(sys.argv[1],'r',encoding="ISO-8859-1")
for strLine in FH.readlines(): isword.setdefault(''.join(sorted(strLine[1:strLine.find('>')].upper())),[]).append(strLine[:-1])
print (isword)
basestring=str()
for ARGV in sys.argv[2:]:
    print ("\n*** %s\n" %ARGV )#print Argv
    diffpatletters = re.compile(u'[a-zA-Z]').findall(ARGV.upper())
    #print (diffpatletters)
    diffpat = '.*' + '(.*)'.join(sorted(diffpatletters)) + '.*'
    #print (diffpat)
    for KEY in hasattr(diffpat,isword.keys()):
        # print (KEY)
        SUBKEY = KEY
        for X in diffpatletters:
            #print (X)
            SUBKEY1 = SUBKEY.replace(X,'')
            #print (SUBKEY)
            if SUBKEY1 in isword:
                #print (SUBKEY)
                basestring+= "%s -> %s" %(isword[KEY], isword[SUBKEY1])
    print (basestring + "\n")
Below is how I run the file from the command line:
python test.py test.txt aack aadfl
The expected output is the matched strings and substrings for each command-line argument after the file name. However, my basestring is not printing.
Recommended answer
Do you have to use regular expressions? If not, do you want results like this?
with open('test.txt', 'r') as f:
    s = f.read()
s = s.split('\n')
s
Out[1]:
['<aardwolf>',
'<Aargau>',
'<Aaronic>',
'<aac>',
'<akac>',
'<abaca>',
'<abactinal>',
'<abacus> ']
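Note that splitting on '\n' keeps any trailing whitespace (such as the space in '<abacus> ') and can leave an empty string at the end if the text ends with a newline. A minimal sketch of a cleaner load, assuming each entry is wrapped in angle brackets as in test.txt; the helper name load_words is hypothetical, and the sample string stands in for the file contents:

```python
def load_words(text):
    """Return the bare words from lines of the form '<word>'."""
    words = []
    for line in text.splitlines():
        line = line.strip()          # drop trailing spaces and stray '\r'
        if not line:
            continue                 # skip blank lines
        # Remove the surrounding angle brackets if both are present.
        if line.startswith('<') and line.endswith('>'):
            line = line[1:-1]
        words.append(line)
    return words

# Stand-in for the contents of test.txt (assumption for illustration).
sample = "<aardwolf>\n<Aargau>\n<abacus> \n"
print(load_words(sample))
```

Stripping the brackets up front is a design choice: it keeps later substring checks from accidentally matching '<' or '>'.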
For a list-type result:
ARGVs = ['aard', 'onic', 'abacu']
matches = [x for x in s for arg in ARGVs if arg.lower() in x.lower()]
print(matches)
Out[2]:
['<aardwolf>', '<Aaronic>', '<abacus> ']
For a dictionary-type result:
ARGVs = ['aard', 'onic', 'abacu', 'aaro', 'ac']
{key:[x for x in s if key in x] for key in ARGVs if len([x for x in s if key in x]) != 0}
Out[3]:
{'aard': ['<aardwolf>'],
'onic': ['<Aaronic>'],
'abacu': ['<abacus> '],
'ac': ['<aac>', '<akac>', '<abaca>', '<abactinal>', '<abacus> ']}
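The dictionary comprehension above builds the inner list twice per key and matches case-sensitively, which is why 'aaro' finds nothing in '<Aaronic>'. A possible variant, assuming case-insensitive matching is what is wanted; the function name group_matches is hypothetical:

```python
def group_matches(words, patterns):
    """Map each pattern to the words containing it, ignoring case."""
    result = {}
    for patt in patterns:
        # Build the match list once per pattern.
        hits = [w for w in words if patt.lower() in w.lower()]
        if hits:                      # keep only patterns with matches
            result[patt] = hits
    return result

words = ['<aardwolf>', '<Aargau>', '<Aaronic>', '<aac>', '<akac>']
print(group_matches(words, ['aard', 'aaro', 'ac', 'zzz']))
```

With this sample list, 'aaro' now maps to '<Aaronic>' and 'zzz' is dropped because it matches nothing.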
Using RegExp:
import re
with open('test.txt', 'r') as f:
    s = f.read()
ARGVs = ['wol','ac']
cond = '|'.join([rf'\w*{patt}\w*' for patt in ARGVs])
re.findall(cond,s)
Out[4]:
['aardwolf', 'aac', 'akac', 'abaca', 'abactinal', 'abacus']
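If the search terms may contain regex metacharacters, or matching should ignore case, the pattern can be built more defensively. A sketch under those assumptions, using re.escape and re.IGNORECASE; the name find_words is hypothetical and the sample string stands in for the file contents:

```python
import re

def find_words(text, patterns):
    """Return the words in text that contain any pattern, ignoring case."""
    # Escape each term so characters like '.' or '+' match literally.
    alternatives = '|'.join(re.escape(p) for p in patterns)
    # Non-capturing group, so findall returns the whole matched word.
    cond = re.compile(rf'\w*(?:{alternatives})\w*', re.IGNORECASE)
    return cond.findall(text)

# Stand-in for the contents of test.txt (assumption for illustration).
sample = "<aardwolf>\n<Aargau>\n<aac>\n<akac>\n"
print(find_words(sample, ['WOL', 'ac']))
```

Because \w does not match '<' or '>', the brackets are excluded from the results without any extra stripping.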