CSV to Python Dictionary with all column names?

Question

I'm still pretty new to using Python to program from scratch, so as an exercise I thought I'd take a file that I process using SQL and try to duplicate the functionality using Python. It seems that I want to take my (compressed, zip) CSV file and create a dict of it (or maybe a dict of dicts?). When I use DictReader, I get the entire first row as a single key rather than each column as its own key. E.g.:

import csv, sys, zipfile
sys.argv[0] = "/home/tom/Documents/REdata/AllListing1RES.zip"
zip_file    = zipfile.ZipFile(sys.argv[0])
items_file  = zip_file.open('AllListing1RES.txt', 'rU')

for row in csv.DictReader(items_file,dialect='excel'):
    pass

Output:

>>> for key in row:
        print 'key=%s, value=%s' % (key, row[key])

key=MLS_ACCT    PARCEL_ID   AREA    COUNTY  STREET_NUM  STREET_NAME CITY        ZIP STATUS  PROP_TYPE   LIST_PRICE  LIST_DATE   DOM DATE_MODIFIED   BATHS_HALF  BATHS_FULL  BEDROOMS    ACREAGE YEAR_BUILT  YEAR_BUILT_DESC OWNER_NAME  SOLD_DATE   WITHDRAWN_DATE  STATUS_DATE SUBDIVISION PENDING_DATE    SOLD_PRICE,  
value=492859    28-15-3-009-001.0000    200 JEFF    3828    ORLEANS RD  MOUNTAIN BROOK  35243   A   SFR 324900  3/3/2011    2   3/4/2011 12:04:11 AM    0   2   3   0   1968    EXIST   SPARKS          3/3/2011 11:54:56 PM    KNOLLWOOD

So what I'm looking for is a column for MLS_ACCT and a separate one for PARCEL_ID, etc., so I can then do things like average prices over all items that contain KNOLLWOOD in the SUBDIVISION field, with further subsections by date range, date sold, etc.

I know well how to do it with SQL, but as I said, I'm trying to gain some Python skills here. I have been reading for the last few days but have yet to find any very simple illustrations of this sort of use case. Pointers to such docs would be appreciated. I realize I could use memory-resident SQLite, but again my desire is to learn the Python approach. I've read some on NumPy and SciPy and have Sage loaded, but still can't find useful illustrations, since those tools seem focused on arrays with only numbers as elements, and I have a lot of string matching to do as well as date range calculations and comparisons.

Eventually I'll need to substitute values in the table (since I have dirty data). I do this now by having a "translate table" which contains all dirty variants and provides a "clean" answer for final use.

Recommended answer

Are you sure that this is a file with comma-separated values? It looks like the fields on each line are delimited by tabs.
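One rough way to check is to count the candidate delimiters in the header line. The sketch below is only an illustration: `guess_delimiter` is a hypothetical helper, and the header is a shortened, made-up version of the one in the question.

```python
def guess_delimiter(header_line, candidates=("\t", ",")):
    """Return the candidate that occurs most often in the header line.

    On a tie, the first candidate listed wins.
    """
    return max(candidates, key=header_line.count)


# Shortened, made-up header in the style of the question's file.
header = "MLS_ACCT\tPARCEL_ID\tAREA\tCOUNTY"
print(repr(guess_delimiter(header)))
```

The standard library's `csv.Sniffer` class offers a more thorough version of the same idea.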

If this is correct, specify a tab delimiter in the DictReader constructor.

for row in csv.DictReader(items_file, dialect='excel', delimiter='\t'):
    for key in row:
        print 'key=%s, value=%s' % (key, row[key])
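The snippet above uses Python 2 syntax. As a hedged sketch of the same idea in Python 3, the file object returned by `ZipFile.open()` yields bytes, so it is wrapped in `io.TextIOWrapper` before being handed to `csv.DictReader`. The helper name `read_rows`, the member name `listings.txt`, the UTF-8 encoding, and the sample data are all assumptions made for illustration; the archive is built in memory so the example runs on its own.

```python
import csv
import io
import zipfile


def read_rows(zip_source, member):
    """Return the rows of a tab-delimited text file stored inside a zip archive."""
    with zipfile.ZipFile(zip_source) as archive:
        with archive.open(member) as raw:
            # ZipFile.open() yields bytes; the csv module expects text.
            # The encoding is an assumption; adjust it to match the real file.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text, delimiter="\t"))


# Build a tiny in-memory archive (made-up sample data).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "listings.txt",
        "MLS_ACCT\tSUBDIVISION\tLIST_PRICE\n492859\tKNOLLWOOD\t324900\n",
    )
buffer.seek(0)

rows = read_rows(buffer, "listings.txt")
for key, value in rows[0].items():
    print("key=%s, value=%s" % (key, value))
```

With the tab delimiter in place, each column name becomes its own key in every row.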

Source: http://docs.python.org/library/csv.html
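Once each column is its own key, the aggregation described in the question can be done with plain dictionaries. The following is a minimal sketch, not a definitive approach: the `CLEAN_SUBDIVISION` translate table, the `average_price` helper, and the sample rows are all hypothetical, and it assumes prices arrive as strings that may be empty.

```python
# Hypothetical translate table: dirty spelling -> clean value.
CLEAN_SUBDIVISION = {"KNOLWOOD": "KNOLLWOOD", "KNOLLWOOD EST": "KNOLLWOOD"}


def average_price(rows, subdivision, price_field="LIST_PRICE"):
    """Average a numeric field over rows whose cleaned SUBDIVISION contains the text."""
    prices = []
    for row in rows:
        raw = row["SUBDIVISION"].strip().upper()
        # Fall back to the raw value when there is no translate entry.
        cleaned = CLEAN_SUBDIVISION.get(raw, raw)
        # Skip rows with an empty price rather than treating them as zero.
        if subdivision in cleaned and row[price_field]:
            prices.append(float(row[price_field]))
    return sum(prices) / len(prices) if prices else None


# Made-up rows shaped like DictReader output.
rows = [
    {"SUBDIVISION": "KNOLLWOOD", "LIST_PRICE": "324900"},
    {"SUBDIVISION": "Knolwood", "LIST_PRICE": "300000"},
    {"SUBDIVISION": "OTHER", "LIST_PRICE": "100000"},
    {"SUBDIVISION": "KNOLLWOOD", "LIST_PRICE": ""},
]
print(average_price(rows, "KNOLLWOOD"))
```

Returning `None` when nothing matches avoids a division by zero; date-range filters could be added in the same loop after parsing the date strings.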
