Dask read_csv: Mismatched dtypes found in `pd.read_csv`/`pd.read_table`


Problem description

I'm trying to read a CSV file with dask, and it raises the error below. However, I want my ARTICLE_ID column to be object (string). Can anyone help me read the data successfully?

The traceback is as follows:

ValueError: Mismatched dtypes found in `pd.read_csv`/`pd.read_table`.

+------------+--------+----------+
| Column     | Found  | Expected |
+------------+--------+----------+
| ARTICLE_ID | object | int64    |
+------------+--------+----------+

The following columns also raised exceptions on conversion:

ARTICLE_ID:
  ValueError("invalid literal for int() with base 10: ' July 2007 and 31 March 2008. Diagnostic practices of the medical practitioners for establishing the diagnosis of different types of EPTB were studied. Results: For the diagnosi\\\\'",)

Usually this is due to dask's dtype inference failing, and
*may* be fixed by specifying dtypes manually by adding:

dtype={'ARTICLE_ID': 'object'}

to the call to `read_csv`/`read_table`.


Recommended answer

The message suggests that you change your call from

df = dd.read_csv('mylocation.csv', ...)

to

df = dd.read_csv('mylocation.csv', ..., dtype={'ARTICLE_ID': 'object'})

where you should change the file location and any other arguments to match what you were using before. If this still doesn't work, please update your question.
