Importing CSV into Python


Problem description

I have a CSV dataset that looks like this:

FirstAge,SecondAge,FirstCountry,SecondCountry,Income,NAME
41,41,USA,UK,113764,John
53,43,USA,USA,145963,Fred
47,37,USA,UK,42857,Dan
47,44,UK,USA,95352,Mark  

I'm trying to load it into Python 3.6 with this code:

>>> from numpy import genfromtxt
>>> my_data = genfromtxt('first.csv', delimiter=',')
>>> print(my_data)

Output:

 [[             nan              nan              nan              nan
               nan              nan]
 [  4.10000000e+01   4.10000000e+01              nan              nan
    1.13764000e+05              nan]
 [  5.30000000e+01   4.30000000e+01              nan              nan
    1.45963000e+05              nan]
 ..., 
 [  2.10000000e+01   3.00000000e+01              nan              nan
    1.19929000e+05              nan]
 [  6.90000000e+01   6.40000000e+01              nan              nan
    1.52667000e+05              nan]
 [  2.00000000e+01   1.90000000e+01              nan              nan
    1.05077000e+05              nan]]

I've looked at the NumPy docs and I don't see anything that explains this.

Solution

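I think the issue you are running into is that the data you are trying to parse is not all numeric. By default, genfromtxt uses dtype=float, so every field that cannot be converted to a float (the country codes, the names, and the entire header row) comes back as nan.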
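To make this concrete, here is a minimal sketch, assuming NumPy 1.14 or later (for the `encoding` argument). It uses an in-memory copy of the sample rows in place of `first.csv`; the variable names `csv_text`, `as_float`, and `data` are illustrative and not from the original question. It first reproduces the nan output, then asks genfromtxt to infer a type per column with `dtype=None` and to read the header row with `names=True`.

```python
import io

import numpy as np

# In-memory stand-in for first.csv (an assumption for illustration).
csv_text = """FirstAge,SecondAge,FirstCountry,SecondCountry,Income,NAME
41,41,USA,UK,113764,John
53,43,USA,USA,145963,Fred
47,37,USA,UK,42857,Dan
47,44,UK,USA,95352,Mark
"""

# Default dtype is float: text fields and the header row become nan.
as_float = np.genfromtxt(io.StringIO(csv_text), delimiter=",")

# dtype=None infers a type per column; names=True uses the header row
# as field names, so the result is a structured array.
data = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    dtype=None,
    names=True,
    encoding="utf-8",
    autostrip=True,
)

print(as_float[0])     # header row parsed as floats
print(data["NAME"])    # text column kept as strings
print(data["Income"])  # numeric column kept as integers
```

With a structured array, columns are selected by field name (for example `data["FirstAge"]`) rather than by integer index. If a table-like object is more convenient, `pandas.read_csv` is another common option for mixed-type CSV files.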

One way to handle this would be to identify the type of each value before it is added to your array. For example:

for obj in my_data:
    if isinstance(obj, int):
        pass  # process or add your data to numpy
    else:
        pass  # cast or discard the data
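The idea above can be sketched with the standard-library `csv` module. Values read from a file arrive as strings, so this sketch attempts a conversion instead of checking `type(obj) == int`. The helper `convert` and the names `csv_text`, `rows`, `numeric_columns`, and `text_columns` are hypothetical, introduced only for illustration, and the sample rows are held in memory in place of `first.csv`.

```python
import csv
import io

# In-memory stand-in for first.csv (an assumption for illustration).
csv_text = """FirstAge,SecondAge,FirstCountry,SecondCountry,Income,NAME
41,41,USA,UK,113764,John
53,43,USA,USA,145963,Fred
47,37,USA,UK,42857,Dan
47,44,UK,USA,95352,Mark
"""


def convert(value):
    """Return value as an int if it parses as one, else the stripped string."""
    try:
        return int(value)
    except ValueError:
        return value.strip()


# Read each row as a dict keyed by the header names, converting as we go.
reader = csv.DictReader(io.StringIO(csv_text))
rows = [{key: convert(val) for key, val in row.items()} for row in reader]

# Split the columns by the detected type of the first row's values.
numeric_columns = [key for key, val in rows[0].items() if isinstance(val, int)]
text_columns = [key for key, val in rows[0].items() if isinstance(val, str)]

print(numeric_columns)
print(text_columns)
```

From here, the numeric columns could be passed to `numpy.array` on their own, while the text columns are kept separately or discarded, depending on what the analysis needs.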
