Read data from CSV file and transform from string to correct data-type, including a list-of-integer column

Question

When I read data back in from a CSV file, every cell is interpreted as a string.

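  • How can I automatically convert the data I read into the correct types?
  • Or better: how can I tell the csv reader the correct data type of each column?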

(I wrote a 2-dimensional list, in which each column has a different type (bool, str, int, list of integers), out to a CSV file.)

Sample data (in the CSV file):

IsActive,Type,Price,States
True,Cellphone,34,"[1, 2]"
,FlatTv,3.5,[2]
False,Screen,100.23,"[5, 1]"
True,Notebook, 50,[1]

Recommended answer

As the docs explain, the CSV reader doesn't perform automatic data conversion. There is the QUOTE_NONNUMERIC format option, but it only converts all non-quoted fields into floats. This behaviour is very similar to that of other csv readers.
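To make the QUOTE_NONNUMERIC behaviour concrete, here is a minimal sketch. The inline data is made up for illustration, and it assumes every text field is quoted, since in this mode the reader tries to turn each unquoted field into a float.

```python
import csv
import io

# Hypothetical inline data: text fields are quoted, numbers are not.
sample = '"Cellphone",34,"[1, 2]"\n"FlatTv",3.5,"[2]"\n'

reader = csv.reader(io.StringIO(sample), quoting=csv.QUOTE_NONNUMERIC)
rows = list(reader)

# Unquoted fields come back as floats (34 becomes 34.0); quoted fields
# stay strings, so the list column is still the text "[1, 2]".
print(rows)
```

Note that the integer 34 is read as a float and the list column is left as text, so this option alone does not recover the original types.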

I don't believe Python's csv module would be of any help for this case at all. As others have already pointed out, ast.literal_eval() is a far better choice.

The following works and converts:

  • strings (when quoted as Python literals)
  • ints
  • floats
  • lists
  • dictionaries
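As a small illustration of the types listed above, the sketch below feeds a few hand-written cell strings to ast.literal_eval(). The sample strings and the helper name is_literal are assumptions made for this example.

```python
import ast

# Hand-written cell contents, one per supported type.
samples = ["34", "3.5", "[1, 2]", "{'a': 1}", "'Cellphone'"]
converted = [ast.literal_eval(s) for s in samples]
print(converted)


def is_literal(text):
    """Return True if ast.literal_eval() can parse the text."""
    try:
        ast.literal_eval(text)
        return True
    except (ValueError, SyntaxError):
        return False


# An unquoted word is not a Python literal, so it is rejected.
print(is_literal("Cellphone"))
```

The last call shows why plain text cells need either quoting in the file or some exception handling around the conversion.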

You may also use it for booleans and NoneType, although these have to be formatted accordingly for literal_eval() to accept them. LibreOffice Calc displays booleans in all capital letters (TRUE, FALSE), whereas Python booleans are capitalized (True, False). Also, you would have to replace empty strings with None (without quotes).
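One way to do that formatting is to normalize each cell before evaluating it. This is only a sketch: the helper name normalize_cell and the exact spellings it rewrites (TRUE, FALSE, empty string) are assumptions, so adjust them to whatever your spreadsheet actually exports.

```python
import ast

# Assumed spellings: spreadsheet-style TRUE/FALSE and empty cells.
_REPLACEMENTS = {"TRUE": "True", "FALSE": "False", "": "None"}


def normalize_cell(text):
    """Rewrite a raw cell so that ast.literal_eval() can parse it."""
    stripped = text.strip()
    return _REPLACEMENTS.get(stripped, stripped)


cells = ["TRUE", "FALSE", "", "True", "34"]
values = [ast.literal_eval(normalize_cell(c)) for c in cells]
print(values)
```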

I'm writing an importer for MongoDB that does all this. The following is part of the code I've written so far.

[NOTE: My csv uses tab as the field delimiter. You may want to add some exception handling too.]

import ast
from itertools import islice

def getFieldnames(csvFile):
    """
    Read the first row and store values in a tuple
    """
    with open(csvFile) as f:
        firstRow = f.readline()
    return tuple(firstRow.rstrip('\n').split("\t"))

def writeCursor(csvFile, fieldnames):
    """
    Convert csv rows into a list of dictionaries.
    Each cell is passed through ast.literal_eval(); cells that are not
    valid Python literals (e.g. unquoted text) are kept as strings.
    """
    cursor = []  # Placeholder for the dictionaries/documents
    with open(csvFile) as f:
        # Skip the header row, which getFieldnames() reads separately
        for row in islice(f, 1, None):
            values = row.rstrip('\n').split("\t")
            for i, value in enumerate(values):
                try:
                    values[i] = ast.literal_eval(value)
                except (ValueError, SyntaxError):
                    pass  # Leave the raw string in place
            cursor.append(dict(zip(fieldnames, values)))
    return cursor
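The same idea can be applied to the comma-separated sample data from the question. The sketch below is not part of the importer: the names convert and read_typed are hypothetical, the data is inlined for the example, and csv.DictReader is used only to split the fields correctly (the quoted "[1, 2]" contains a comma), while ast.literal_eval() does the type conversion.

```python
import ast
import csv
import io

# The sample data from the question, inlined for the example.
SAMPLE = '''IsActive,Type,Price,States
True,Cellphone,34,"[1, 2]"
,FlatTv,3.5,[2]
False,Screen,100.23,"[5, 1]"
True,Notebook, 50,[1]
'''


def convert(text):
    """Turn one cell into a Python value, falling back to the raw text."""
    text = text.strip()
    if text == "":
        return None  # Assumption: an empty cell means "no value"
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text  # Plain words such as Cellphone stay strings


def read_typed(stream):
    """Read CSV rows into dictionaries with converted values."""
    reader = csv.DictReader(stream)
    return [{key: convert(value) for key, value in row.items()}
            for row in reader]


records = read_typed(io.StringIO(SAMPLE))
for record in records:
    print(record)
```

One caveat of this design: any text cell that happens to look like a literal (for example a product named 123 or None) is converted as well, so columns that must stay strings need to be excluded explicitly.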
