Comparing varied CSV files in Python
Problem description
Suppose I have 2 CSV files:
File 1:
Epitope Name,Epitope,Protein,position,position
3606,NSRSTSLSV,FOO,10,21
File 2:
A,B,C,D,E,F,G,H,I,J,K
0,1,2,3,4,5,6,7,8,9,NSRSTSLSV
Essentially, I want to see if the contents of row 1 in file 1 are found in row 10 of file 2. If the contents match, I'll print a 3rd csv that is a new version of file 1 with a column saying found or not found.
Right now, I'm getting not found for everything, which I know not to be the case. In some cases, the text from file 1 may be found inside a larger block of text from file 2.
Here's what I have so far (adapted from an answer found earlier):
#usr/bin/python2.4
import csv
f1 = file('all_epitopes.csv', 'rb')
f2 = file('positiveBcell.csv', 'rb')
f3 = file('results.csv', 'w')
c1 = csv.reader(f1, delimiter=",", quotechar='"')
c2 = csv.reader(f2, delimiter=",", quotechar='"')
c3 = csv.writer(f3, delimiter=",", quotechar='"')
positiveBcell = [row for row in c2]
for all_epitopes_row in c1:
    row = 1
    found = False
    for master_row in positiveBcell:
        results_row = all_epitopes_row
        if all_epitopes_row[2] == positiveBcell[10]:
            results_row.append('FOUND in Bcell List (row ' + str(row) + ')')
            found = True
            break
        row = row + 1
    if not found:
        results_row.append('NOT FOUND in Bcell list')
    c3.writerow(results_row)
f1.close()
f2.close()
f3.close()
Suppose your two files are:
File 1:
Epitope Name,Epitope,Protein,position,position
#Row 1#
3606,NSRSTSLSV,FOO,10,21
File 2:
A,B,C,D,E,F,G,H,I,J,K
#Row 10#
0,1,2,3,4,5,6,7,8,9,NSRSTSLSV
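To make the column positions concrete, here is a small sketch that parses the two sample files above from in-memory strings. It assumes Python 3 and assumes the `#Row ...#` lines are annotations rather than file content, so they are left out. The helper name `data_rows` is made up for this illustration.

```python
import csv
import io

# Sample data copied from the two files above (annotation lines omitted).
FILE1 = "Epitope Name,Epitope,Protein,position,position\n3606,NSRSTSLSV,FOO,10,21\n"
FILE2 = "A,B,C,D,E,F,G,H,I,J,K\n0,1,2,3,4,5,6,7,8,9,NSRSTSLSV\n"

def data_rows(text):
    """Parse CSV text and return the rows that follow the header line."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header
    return list(reader)

epitope_row = data_rows(FILE1)[0]
bcell_row = data_rows(FILE2)[0]

print(epitope_row[1])  # epitope sequence in file 1
print(epitope_row[2])  # protein name in file 1
print(bcell_row[10])   # last column (K) in file 2
```

With this sample, index 1 of the file 1 row holds the sequence and index 2 holds the protein name. If the real files follow the same layout, comparing index 2 against column 10 of file 2 would not match for this row.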
After OP's comment:
for all_epitopes_row in c1:
    row = 1
    found = False
    for master_row in positiveBcell:
        results_row = all_epitopes_row
        # changed line: index the current row (master_row), not the whole list
        if all_epitopes_row[2] == master_row[10]:
            results_row.append('FOUND in Bcell List (row ' + str(row) + ')')
            found = True
            break
        row = row + 1
    if not found:
        results_row.append('NOT FOUND in Bcell list')
    c3.writerow(results_row)
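For reference, the same loop could be written for Python 3 roughly as follows. This is only a sketch under a few assumptions that go beyond the fix above: it uses `in` instead of `==`, because the question says the text may sit inside a larger block of text, and it defaults to column 1 of file 1 because that is where the sequence appears in the sample row. The names `annotate`, `compare_files`, `epitope_col` and `text_col` are made up for this illustration.

```python
import csv

def annotate(epitope_rows, bcell_rows, epitope_col=1, text_col=10):
    """Return copies of epitope_rows with a FOUND / NOT FOUND column appended.

    epitope_col and text_col are assumed column positions taken from the
    sample rows; adjust them for the real files.
    """
    results = []
    for epitope_row in epitope_rows:
        status = 'NOT FOUND in Bcell list'
        epitope = epitope_row[epitope_col] if len(epitope_row) > epitope_col else ''
        if epitope:  # an empty string would be "in" every text, so skip it
            # count from 1 so the reported number matches the original script
            for row_number, master_row in enumerate(bcell_rows, start=1):
                # "in" also matches when the epitope sits inside a longer text
                if len(master_row) > text_col and epitope in master_row[text_col]:
                    status = 'FOUND in Bcell List (row %d)' % row_number
                    break
        # build a new list instead of appending to the input row
        results.append(epitope_row + [status])
    return results

def compare_files(epitopes_path, bcell_path, results_path):
    """Read both CSV files, annotate the first, and write the result."""
    with open(epitopes_path, newline='') as f1, open(bcell_path, newline='') as f2:
        epitope_rows = list(csv.reader(f1))
        bcell_rows = list(csv.reader(f2))
    with open(results_path, 'w', newline='') as f3:
        csv.writer(f3).writerows(annotate(epitope_rows, bcell_rows))
```

Calling `compare_files('all_epitopes.csv', 'positiveBcell.csv', 'results.csv')` would mirror the original script. Header rows are treated like any other row here, as in the original loop, so the header of file 1 will normally be written with a NOT FOUND marker.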