Comparing varied CSV files in Python

Views: 135 · Posted: 2017/2/26 17:18:44 · Tags: python, csv, bioinformatics


Question

Suppose I have 2 CSV files:

File 1:

Epitope Name,Epitope,Protein,position,position

3606,NSRSTSLSV,FOO,10,21

File 2:

A,B,C,D,E,F,G,H,I,J,K

0,1,2,3,4,5,6,7,8,9,NSRSTSLSV

Essentially, I want to see if the contents of row 1 in file 1 are found in row 10 of file 2. If the contents match, I'll print a 3rd csv that is a new version of file 1 with a column saying found or not found.

Right now, I'm getting not found for everything, which I know not to be the case. In some cases, the text from file 1 may be found inside a larger block of text from file 2.

Here's what I have so far (adapted from an answer found earlier):

#usr/bin/python2.4

import csv

f1 = file ('all_epitopes.csv', 'rb')
f2 = file ('positiveBcell.csv', 'rb')
f3 = file ('results.csv', 'w')

c1 = csv.reader((f1), delimiter=",", quotechar='"')
c2 = csv.reader((f2), delimiter=",", quotechar='"')
c3 = csv.writer((f3), delimiter=",", quotechar='"')


positiveBcell = [row for row in c2]

for all_epitopes_row in c1:
    row = 1
    found = False
    for master_row in positiveBcell:
        results_row = all_epitopes_row
        if all_epitopes_row[2] == positiveBcell[10]:
            results_row.append('FOUND in Bcell List (row ' + str(row) + ')')
            found = True
            break
        row = row +1
    if not found:
        results_row.append('NOT FOUND in Bcell list')
    c3.writerow(results_row)

f1.close()
f2.close()
f3.close()

Solution

Suppose your two files are:

File 1:

Epitope Name,Epitope,Protein,position,position

#Row 1#
3606,NSRSTSLSV,FOO,10,21

File 2:

A,B,C,D,E,F,G,H,I,J,K

#Row 10#
0,1,2,3,4,5,6,7,8,9,NSRSTSLSV

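After the OP's comment, the fix is to compare against master_row (the current row of file 2) instead of positiveBcell (the whole list of rows), which can never equal a single field: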

for all_epitopes_row in c1:
    row = 1
    found = False
    for master_row in positiveBcell:
        results_row = all_epitopes_row
        if all_epitopes_row[2] == master_row[10]:  # compare with the current row, not the whole list
            results_row.append('FOUND in Bcell List (row ' + str(row) + ')')
            found = True
            break
        row = row +1
    if not found:
        results_row.append('NOT FOUND in Bcell list')
    c3.writerow(results_row)
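
The question also mentions that the text from file 1 may sit inside a larger block of text in file 2, which an == test would miss. Below is a minimal Python 3 sketch of that case, not the original answer's code. The function name mark_epitopes and its parameters are hypothetical. It assumes the epitope sequence is in column index 1 of file 1 (as the sample header suggests, whereas the script above reads index 2) and that the text to search is in column index 10 of file 2; adjust both to the real layout.

```python
import csv
import io


def mark_epitopes(epitope_rows, bcell_rows, epitope_col=1, text_col=10):
    """Return copies of epitope_rows with a FOUND / NOT FOUND column appended.

    epitope_col and text_col are assumed 0-based column indices.
    """
    results = []
    for epitope_row in epitope_rows:
        status = 'NOT FOUND in Bcell list'
        # start=1 keeps the 1-based row numbering used in the original script
        for row_number, master_row in enumerate(bcell_rows, start=1):
            # Skip short rows; "in" also matches an epitope embedded in longer text
            if len(master_row) > text_col and epitope_row[epitope_col] in master_row[text_col]:
                status = 'FOUND in Bcell List (row %d)' % row_number
                break
        # Build a new list so the input rows are not modified
        results.append(epitope_row + [status])
    return results


# In-memory stand-ins for all_epitopes.csv and positiveBcell.csv (header rows omitted)
file1 = io.StringIO('3606,NSRSTSLSV,FOO,10,21\n3607,AAAAAAAAA,BAR,1,9\n')
file2 = io.StringIO('0,1,2,3,4,5,6,7,8,9,xxNSRSTSLSVxx\n')

marked = mark_epitopes(list(csv.reader(file1)), list(csv.reader(file2)))
for row in marked:
    print(row)
```

With real files, the rows would come from csv.reader over files opened with open(name, newline=''), and the result could be written out with csv.writer(...).writerows(marked). If the files have header rows, skip them first, otherwise the header is compared like any other row.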
