Need to do a math operation on every line in several CSV files in Python

Views: 583  Posted: 2017/2/24 21:22:31  Tags: python csv datestamp


Problem description

I have about 100 CSV files that I have to operate on once a month. I've been trying to wrap my head around this, but I'm running into a wall. I'm starting to understand some things about Python, but combining several of them still gives me trouble, so I can't figure this out.

Here's my problem:

I have many CSV files, and here's what I need done:

Add a "column" to the front of each row (or the back, it doesn't really matter, but the front is ideal). In addition, each line has 5 columns (not counting the filename that will be added), in this format:

6-digit ID number,YYYY-MM-DD(1),YYYY-MM-DD(2),YYYY-MM-DD(3),1-2-digit number

I need to subtract YYYY-MM-DD(3) from YYYY-MM-DD(2) for every line in the file (there is no header row), for every CSV in a given directory.

I need the filename inside the row because I will combine the files (it would be awesome if that were included in the script, but I think I can figure that part out), and I need to know which file each record came from. The filename format is always '4-5-digit-number.csv'.

I hope this makes sense; if it doesn't, please let me know. I'm kind of stumped as to where to even begin, so I don't have any sample code that even really began to work for me. I'm really frustrated, so I appreciate any help you guys may provide. This site rocks!

Mylan

Solution

There's a tool in the standard library for each of these tasks:

To iterate over all CSV files in a directory, use the glob module:

import glob
for csvfilename in glob.glob(r"C:\mydirectory\*.csv"):
    pass  # do something with each file path here

To parse a CSV file, use the csv module:

import csv
# Python 3: open in text mode with newline="" (Python 2 used "rb")
with open(csvfilename, newline="") as csvfile:
    reader = csv.reader(csvfile, delimiter=",")
    for row in reader:
        # row is a list of all the entries in the current row
        print(row)
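To see what the reader yields without touching any files, here is a minimal sketch. The two sample rows are made up to match the layout described in the question, and `io.StringIO` stands in for an opened file.

```python
import csv
import io

# Two invented sample rows in the question's layout:
# ID, date(1), date(2), date(3), number
sample = (
    "123456,2012-01-05,2012-03-01,2012-02-10,7\n"
    "654321,2011-11-30,2012-01-15,2012-01-02,12\n"
)

# io.StringIO behaves like a text file, so no file on disk is needed.
rows = list(csv.reader(io.StringIO(sample), delimiter=","))

for row in rows:
    print(row)  # each row is a list of strings, even the numeric fields
```

Note that every field comes back as a string, which is why the dates have to be parsed before they can be subtracted.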

To parse a date and calculate a difference, use the datetime module:

from datetime import datetime
startdate = datetime.strptime("1999-10-20", "%Y-%m-%d")
enddate = datetime.strptime("2003-02-28", "%Y-%m-%d")
delta = enddate - startdate  # a timedelta; delta.days is the difference in whole days
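The date arithmetic can be wrapped in a small helper. This is only a sketch: the name `days_between` and the `DATE_FORMAT` constant are illustrative and not part of the original answer.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d"  # matches the YYYY-MM-DD fields in the question


def days_between(later, earlier):
    """Return (later - earlier) in whole days; negative if the order is reversed."""
    delta = datetime.strptime(later, DATE_FORMAT) - datetime.strptime(earlier, DATE_FORMAT)
    return delta.days


print(days_between("2003-02-28", "1999-10-20"))  # 1227
```

Returning `delta.days` gives a plain integer, whereas `str(delta)` would produce text such as `1227 days, 0:00:00`, which contains a comma and is awkward to store in a CSV column.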

To add a value to the beginning of a row:

row[0:0] = [str(delta.days)]  # str(delta) alone would give "1227 days, 0:00:00"

To append the filename to the end of a row:

row.append(csvfilename)
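`glob.glob` returns each path with its directory prefix, so appending `csvfilename` stores that whole path. If only the file name (or just the number in front of `.csv`) is wanted, something like the following may help. The path and row values here are hypothetical.

```python
import os

# Hypothetical path of the kind glob.glob returns for a directory pattern
csvfilename = os.path.join("mydirectory", "12345.csv")

name = os.path.basename(csvfilename)  # "12345.csv"
stem = os.path.splitext(name)[0]      # "12345"

row = ["123456", "2012-01-05", "2012-03-01", "2012-02-10", "7"]
row.append(name)  # store just the file name rather than the full path
print(row)
```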

And to write a row to a new CSV file:

# Python 3: open in text mode with newline="" (Python 2 used "wb")
with open(csvfilename, "w", newline="") as csvfile:
    writer = csv.writer(csvfile, delimiter=",")
    writer.writerow(row)

Taken all together, you get:

import glob
import csv
from datetime import datetime

# Python 3: open CSV files in text mode with newline="" (Python 2 used "wb"/"rb")
with open("combined_files_csv", "w", newline="") as outfile:
    writer = csv.writer(outfile, delimiter=",")
    for csvfilename in glob.glob(r"C:\mydirectory\*.csv"):
        with open(csvfilename, newline="") as infile:
            reader = csv.reader(infile, delimiter=",")
            for row in reader:
                startdate = datetime.strptime(row[3], "%Y-%m-%d")  # date (3)
                enddate = datetime.strptime(row[2], "%Y-%m-%d")    # date (2)
                delta = enddate - startdate  # timedelta: date (2) minus date (3)
                row[0:0] = [str(delta.days)]  # difference in whole days, at the front
                row.append(csvfilename)       # source file, at the end
                writer.writerow(row)
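The same steps can also be packaged as a reusable function. The sketch below is one possible arrangement under a few assumptions that are not in the original answer: the function name `combine_csv_files` is made up, blank lines are skipped, only the base file name is stored, and the output file is ignored if it happens to sit in the input directory.

```python
import csv
import glob
import os
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d"


def combine_csv_files(input_dir, output_path):
    """Copy every row of every *.csv in input_dir to output_path.

    Each output row is prefixed with date(2) - date(3) in days and
    suffixed with the name of the file it came from.
    Returns the number of rows written.
    """
    rows_written = 0
    out_abs = os.path.abspath(output_path)
    with open(output_path, "w", newline="") as outfile:
        writer = csv.writer(outfile)
        for path in sorted(glob.glob(os.path.join(input_dir, "*.csv"))):
            if os.path.abspath(path) == out_abs:
                continue  # do not read the output file back in
            with open(path, newline="") as infile:
                for row in csv.reader(infile):
                    if not row:
                        continue  # skip blank lines
                    date2 = datetime.strptime(row[2], DATE_FORMAT)
                    date3 = datetime.strptime(row[3], DATE_FORMAT)
                    days = (date2 - date3).days
                    writer.writerow([days] + row + [os.path.basename(path)])
                    rows_written += 1
    return rows_written
```

A call might look like `combine_csv_files(r"C:\mydirectory", "combined.csv")`, where the output name is only an example. A row with a malformed date will raise `ValueError` from `strptime`, which is probably what you want for a monthly batch job, since it points at bad data instead of silently skipping it.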
