Remove quotes holding 2 words and remove comma between them
Problem description
Extended input and expected output:
I am trying to replace the comma between the two words Durango and PC in the second line with & and then remove the quotes " as well. The same goes for the third line with Orbis and PC, and the 4th line has two quoted word combos that I would like to process: "AAA - Character Tech, SOF - UPIs","Durango, Orbis, PC"
I would like to retain the rest of the lines as they are, using Python.
Input
2,SIN-Rendering,Core Tech - Rendering,PC,147,Reopened
2,Kenny Chong,Core Tech - Rendering,"Durango, PC",55,Reopened
3,SIN-Audio,AAA - Audio,"Orbis, PC",13,Open
LTY-168499,[PC][PS4][XB1] Missing textures from Fort Capture NPC face,3,CTU-CharacterTechBacklog,"AAA - Character Tech, SOF - UPIs","Durango, Orbis, PC",29,Waiting For
...
...
...
There can be around 100 lines like these in my sample. The expected output is:
2,SIN-Rendering,Core Tech - Rendering,PC,147,Reopened
2,Kenny Chong,Core Tech - Rendering, Durango & PC,55,Reopened
3,SIN-Audio,AAA - Audio, Orbis & PC,13,Open
LTY-168499,[PC][PS4][XB1] Missing textures from Fort Capture NPC face,3,CTU-CharacterTechBacklog,AAA - Character Tech & SOF - UPIs,Durango, Orbis & PC,29,Waiting For
...
...
...
So far, I could think of reading line by line and, if the line contains a quote, replacing it with nothing. Replacing the comma inside the quotes is where I am stuck.
This is what I have now:
for line in lines:
    expr2 = re.findall('"(.*?)"', line)
    if len(expr2)!=0:
        expr3 = re.split('"',line)
        expr4 = expr3[0]+expr3[1].replace(","," &")+expr3[2]
        print >>k, expr4
    else:
        print >>k, line
but it does not handle the case in the 4th line. There can also be more than 3 combos. For example:
3,SIN-Audio,"AAA - Audio, xxxx, yyyy","Orbis, PC","13, 22",Open
and I would like to turn this into
3,SIN-Audio,AAA - Audio & xxxx & yyyy,Orbis & PC,13 & 22,Open
How can I achieve this? Any suggestions? I am learning Python.
Recommended answer
By treating the input file as a .csv, we can easily turn each line into something easy to work with.
For example,
2,Kenny Chong,Core Tech - Rendering,"Durango, PC",55,Reopened
is read as:
['2', 'Kenny Chong', 'Core Tech - Rendering', 'Durango, PC', '55', 'Reopened']
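To see the parsing step in isolation, here is a minimal sketch that feeds a single line to csv.reader. The variable names sample and row are only illustrative; csv.reader accepts any iterable of strings, so a one-element list works without a file.

```python
import csv

# One line from the question's input, with a quoted field containing a comma.
sample = '2,Kenny Chong,Core Tech - Rendering,"Durango, PC",55,Reopened'

# csv.reader takes any iterable of strings; next() pulls the single parsed row.
row = next(csv.reader([sample]))
print(row)
# ['2', 'Kenny Chong', 'Core Tech - Rendering', 'Durango, PC', '55', 'Reopened']
```

The quoted field arrives as one list element with the surrounding double quotes already stripped, which is why no separate quote-removal step is needed afterwards.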
Then, by replacing every , inside a field with ' &' (a space followed by &), we would have the line:
['2', 'Kenny Chong', 'Core Tech - Rendering', 'Durango & PC', '55', 'Reopened']
This replaces multiple instances of , within a line, and when the line is finally written out the original double quotes are gone.
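As a rough illustration of the multi-field case from the question, the sketch below applies the same idea to one line with three quoted fields. The helper name replace_inner_commas is hypothetical and not part of the original answer.

```python
import csv

def replace_inner_commas(line):
    """Hypothetical helper: parse one CSV line, then swap commas inside each field for ' &'."""
    fields = next(csv.reader([line]))
    # After parsing, any comma left in a field came from inside a quoted value.
    return ','.join(field.replace(',', ' &') for field in fields)

sample = '3,SIN-Audio,"AAA - Audio, xxxx, yyyy","Orbis, PC","13, 22",Open'
print(replace_inner_commas(sample))
# 3,SIN-Audio,AAA - Audio & xxxx & yyyy,Orbis & PC,13 & 22,Open
```

Because the replacement runs on every field, it does not matter how many quoted fields a line has or how many commas each one contains.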
Here is the code, assuming in.txt is your input file; it writes the result to out.txt.
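import csv

with open('in.txt') as infile:
    # csv.reader splits each line into fields and strips the double quotes
    reader = csv.reader(infile)
    with open('out.txt', 'w') as outfile:
        for line in reader:
            # any comma still inside a field came from a quoted value
            line = list(map(lambda s: s.replace(',', ' &'), line))
            outfile.write(','.join(line) + '\n')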
The 4th line is output as:
LTY-168499,[PC][PS4][XB1] Missing textures from Fort Capture NPC face,3,CTU-CharacterTechBacklog,AAA - Character Tech & SOF - UPIs,Durango & Orbis & PC,29,Waiting For
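For comparison with the regex attempt in the question, here is a sketch of an alternative that is not the csv approach above: re.sub with a callback rewrites every quoted group in one pass. The names QUOTED and unquote_with_ampersands are made up for this example, and the pattern assumes fields never contain escaped double quotes (""), which the csv module would handle and this regex would not.

```python
import re

# Matches a double-quoted run with no embedded quotes; group 1 is the inner text.
QUOTED = re.compile(r'"([^"]*)"')

def unquote_with_ampersands(line):
    """Hypothetical helper: drop the quotes and turn inner commas into ' &'."""
    return QUOTED.sub(lambda m: m.group(1).replace(',', ' &'), line)

line4 = ('LTY-168499,[PC][PS4][XB1] Missing textures from Fort Capture NPC face,3,'
         'CTU-CharacterTechBacklog,"AAA - Character Tech, SOF - UPIs",'
         '"Durango, Orbis, PC",29,Waiting For')
print(unquote_with_ampersands(line4))
```

The callback is what the original attempt was missing: instead of indexing into a fixed number of split pieces, each match is handled on its own, so any number of quoted fields per line works.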