Python: writing a program to iterate over a csv file, match a field and save the result in a different data file

Problem description

I am trying to write a program to do the following:

- Specify a field from a record in a csv file called data.
- Specify a field from a record in a csv file called log.
- Compare the position of the two in data and in log. If they are on the same line, write the record from the log file into a new file called result.
- If the field does not match the record in that position of the log file, move to the next record in the log file and compare, until a matching record is found; then save that record in the file called result.
- Reset the index of the log file.
- Go to the next line in the data file and repeat the verification until the data file reaches the end.
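
In other words, the logic I am after is roughly the following sketch using only the csv module (the column index FIELD is only a placeholder, since the field to match is not fixed yet, and reading both files into lists is a simplification):

import csv

FIELD = 1  # placeholder: index of the column to match on

with open('data.txt', newline='') as data_f, \
     open('log.txt', newline='') as log_f, \
     open('resultfile.csv', 'w', newline='') as out_f:
    writer = csv.writer(out_f, delimiter=',', quotechar='"')
    data_rows = list(csv.reader(data_f, delimiter=',', quotechar='"'))
    log_rows = list(csv.reader(log_f, delimiter=',', quotechar='"'))
    for i, data_row in enumerate(data_rows):
        # first compare against the log record in the same position
        if i < len(log_rows) and log_rows[i][FIELD] == data_row[FIELD]:
            writer.writerow(log_rows[i])
            continue
        # otherwise rescan the log from the start until a match is found
        for log_row in log_rows:
            if log_row[FIELD] == data_row[FIELD]:
                writer.writerow(log_row)
                break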

This is what I was able to do, but I am stuck:

import csv
def main():

    datafile_csv = open('data.txt')
    logfile_csv = open('log.txt')
    row_data = []
    row_log = []
    row_log_temp = []
    index_data = 1
    index_log = 1
    index_log_temp = index_log
    counter = 0
    data = ''
    datareader = ''
    logreader = ''
    log = ''
#   row = 0
    logfile_len = sum (1 for lines in open('log.txt'))
    with open('resultfile.csv','w') as csvfile:
        out_write = csv.writer(csvfile,  delimiter=',',quotechar='"')
        with open('data.txt','r') as (data):
            row_data = csv.reader(csvfile, delimiter=',', quotechar='"')
            row_data = next(data)
            print(row_data)
            with open ('log.txt','r') as (log):
                row_log = next(log)
                print(row_log)
                while counter != logfile_len:
                    comp_data = row_data[index_data:]
                    comp_log = row_log[index_log:]
                    comp_data = comp_data.strip('"')
                    comp_log = comp_log.strip('"')
                    print(row_data[1])
                    print(comp_data)
                    print(comp_log)
                    if comp_data != comp_log:
                        while comp_data != comp_log:
                            row_log = next(log)
                            comp_log = row_log[index_log]
                        out_write.writerow(row_log)
                        row_data = next(data)
                    else : 
                        out_write.writerow(row_log)
                        row_data = next(data)
                    log.seek(0)
                    counter +=1

The problems I am running into are the following:

I cannot convert the data line into a string properly and I cannot compare it correctly.

Also, I need to be able to reset the pointer in the log file, but seek does not seem to be working...
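
For reference, this standalone snippet shows what I expect seek to do, namely rewind the underlying file object so that a fresh csv reader starts again from the first record (it is separate from the program above):

import csv

with open('log.txt', newline='') as log_f:
    reader = csv.reader(log_f, delimiter=',', quotechar='"')
    first_pass = list(reader)          # the reader is now exhausted
    log_f.seek(0)                      # rewind the underlying file object
    reader = csv.reader(log_f, delimiter=',', quotechar='"')
    second_pass = list(reader)         # reads from the first record again
    print(first_pass == second_pass)   # expected to print True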

This is the content of the data file:

"test1","test2","test3" "1","2","3" "4","5","6"

"test1","test2","test3" "1","2","3" "4","5","6"

This is the content of the log file:

"test1","test2","test3" "4","5","6" "1","2","3"

"test1","test2","test3" "4","5","6" "1","2","3"

This is what the compiler gives me back:

t "test1","test2","test3"

t "test1","test2","test3"

t test1," test2," test3"

t test1","test2","test3"

test1","test2","test3"

test1","test2","test3"

1 1," 2," 3"

1 1","2","3"

test1","test2","test3"

test1","test2","test3"

Traceback (most recent call last):
  File "H:/test.py", line 100, in <module>
    main()
  File "H:/test.py", line 40, in main
    comp_log = row_log[index_log]
IndexError: string index out of range

Thank you very much for your help.

Regards,

Danilo

Recommended answer

Joining two files by columns (row count and a specific column [not defined]), and returning the results limited to the columns of the left/first file.

import petl

log = petl.fromcsv('log.txt').addrownumbers()  # Load csv/txt file into PETL table, and add row numbers 
log_columns = len(petl.header(log))  # Get the amount of columns in the log file
data = petl.fromcsv('data.txt').addrownumbers()  # Load csv/txt file into PETL table, and add row numbers 
joined_files = petl.join(log, data, key=['row', 'SpecificField'])  # Join the tables using row and a specific field
joined_files = petl.cut(joined_files, *range(1, log_columns))  # Remove the extra columns obtained from right table
petl.tocsv(joined_files, 'resultfile.csv')  # Output results to csv file
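
Note that addrownumbers() adds a field called row to each table, which is what allows row to be used as part of the join key; SpecificField is a placeholder for whichever column you actually want to match on.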


Also, do not forget to pip install it (version used for this example):

pip install petl==1.0.11
