Replacing empty csv column values with a zero


Problem description

So I'm dealing with a csv file that has missing values. What I want my script to do is:

#!/usr/bin/python

import csv
import sys

#1. Place each record of a file in a list.
#2. Iterate thru each element of the list and get its length.
#3. If the length is less than one replace with value x.


reader = csv.reader(open(sys.argv[1], "rb"))
for row in reader:
    for x in row[:]:
                if len(x)< 1:
                         x = 0
                print x
print row

Here is an example of the data I'm trying it on. Ideally it should work for any number of columns.

Before:
actnum,col2,col4
xxxxx ,    ,
xxxxx , 845   ,
xxxxx ,    ,545

After
actnum,col2,col4
xxxxx , 0  , 0
xxxxx , 845, 0
xxxxx , 0  ,545

Update: Here is what I have now (thanks):

reader = csv.reader(open(sys.argv[1], "rb"))
for row in reader:
    for i, x in enumerate(row):
                if len(x)< 1:
                         x = row[i] = 0
print row

However, it only seems to output one record. I will be piping the output to a new file on the command line.
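One likely reason for the single record: in the snippet above, print row is not indented, so it sits outside the "for row in reader" loop and runs only once, after the loop has finished, when row still refers to the last record. The following is a minimal sketch of that effect in Python 3 syntax (the scripts in this question are Python 2). The sample text, the fill_rows helper and its print_inside_loop flag are made up for illustration.

```python
import csv
import io

# Made-up sample standing in for the input file.
sample = "actnum,col2,col4\nxxxxx,,\nxxxxx,845,\n"

def fill_rows(text, print_inside_loop):
    """Hypothetical helper: returns the lines that would be printed."""
    printed = []
    row = None
    for row in csv.reader(io.StringIO(text)):
        for i, x in enumerate(row):
            if len(x) < 1:
                row[i] = 0
        if print_inside_loop:
            # Runs once per record.
            printed.append(",".join(str(x) for x in row))
    if not print_inside_loop and row is not None:
        # Runs once, after the loop: only the last record is still in `row`.
        printed.append(",".join(str(x) for x in row))
    return printed

print(fill_rows(sample, print_inside_loop=False))  # ['xxxxx,845,0']
print(fill_rows(sample, print_inside_loop=True))
```

Moving the print to the same indentation level as the inner for loop gives one output line per record.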

Update 3: Ok, now I have the opposite problem: I'm outputting duplicates of each record. Why is that happening?

After
actnum,col2,col4
actnum,col2,col4
xxxxx , 0  , 0
xxxxx , 0  , 0
xxxxx , 845, 0
xxxxx , 845, 0
xxxxx , 0  ,545
xxxxx , 0  ,545

Ok, I fixed it (below). Thank you guys for your help.

#!/usr/bin/python

import csv
import sys

#1. Place each record of a file in a list.
#2. Iterate thru each element of the list and get its length.
#3. If the length is less than one replace with value x.


reader = csv.reader(open(sys.argv[1], "rb"))
for row in reader:
    for i, x in enumerate(row):
                if len(x)< 1:
                         x = row[i] = 0
    print ','.join(str(x) for x in row)
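
The script above is Python 2 (print statement, file opened with "rb"). For readers on Python 3, a rough equivalent might look like the sketch below. It is not the asker's code: the fill_empty function name and its fill parameter are assumptions introduced here, and csv.writer is swapped in for the manual ','.join so that quoting is handled by the csv module.

```python
import csv
import io

def fill_empty(infile, outfile, fill="0"):
    """Copy CSV rows from infile to outfile, replacing empty cells with fill."""
    writer = csv.writer(outfile, lineterminator="\n")
    for row in csv.reader(infile):
        # Keep non-empty cells as they are; substitute fill for empty ones.
        writer.writerow([cell if len(cell) > 0 else fill for cell in row])

# Small in-memory demonstration with made-up data.
source = io.StringIO("actnum,col2,col4\nxxxxx,,\nxxxxx,845,\nxxxxx,,545\n")
result = io.StringIO()
fill_empty(source, result)
print(result.getvalue())
```

To run it on a real file and pipe the result elsewhere, one would open the path with open(path, newline="") and pass sys.stdout as outfile. Note that a cell holding only spaces is not empty by this test; stripping whitespace first would be a separate choice.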


Recommended answer

Change your code:

for row in reader:
    for x in row[:]:
                if len(x)< 1:
                         x = 0
                print x

into:

for row in reader:
    for i, x in enumerate(row):
                if len(x)< 1:
                         x = row[i] = 0
                print x

Not sure what you think you're accomplishing by the print, but the key issue is that you need to modify row, and for that purpose you need an index into it, which enumerate gives you.
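
To illustrate the point about needing an index, here is a small sketch in Python 3 syntax with a made-up row. It contrasts rebinding the loop variable, which leaves the list alone, with assigning through the index that enumerate supplies.

```python
# Made-up sample row for illustration.
row = ["xxxxx", "", "545"]

# Rebinding the loop variable only changes the local name x, not the list.
for x in row:
    if len(x) < 1:
        x = 0
unchanged = list(row)

# Assigning through the index from enumerate() modifies the list in place.
for i, x in enumerate(row):
    if len(x) < 1:
        row[i] = 0

print(unchanged)  # ['xxxxx', '', '545']
print(row)        # ['xxxxx', 0, '545']
```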

Note also that all other values, except the empty ones which you're changing into the number 0, will remain strings. If you want to turn them into ints you have to do that explicitly.
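
One way the explicit conversion could look is sketched below in Python 3 syntax. The to_int_or_zero helper is hypothetical, and its handling of whitespace-only cells and of non-numeric text (returned unchanged) is an assumption rather than something the question specifies.

```python
def to_int_or_zero(cell):
    """Hypothetical helper: empty or whitespace-only cells become 0,
    numeric text becomes an int, anything else is returned unchanged."""
    text = cell.strip()
    if not text:
        return 0
    try:
        return int(text)
    except ValueError:
        # Not a number (for example an account id); keep the original string.
        return cell

# Made-up sample row.
row = ["xxxxx ", " 845   ", ""]
converted = [to_int_or_zero(cell) for cell in row]
print(converted)  # ['xxxxx ', 845, 0]
```

A header row would pass through untouched here because its cells are not numeric, but skipping it explicitly may be cleaner for real data.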
