Python Pandas DtypeWarning: Specify dtype option on import - How?


Question

I have these columns:

['Campaign', 'Ad group', 'Keyword', 'Status', 'Match type', 'Max. CPC', 'Quality score', 'Impressions', 'Clicks', 'CTR', 'Avg. CPC', 'Cost', 'Avg. position', 'Converted clicks', 'Click conversion rate', 'Cost / converted click', 'Bounce rate', 'Pages / session', 'Avg. session duration (seconds)', '% new sessions']

I get this warning:

Warning (from warnings module):
  File "C:\Python34\lib\site-packages\pandas\io\parsers.py", line 1164
    data = self._reader.read(nrows)
DtypeWarning: Columns (5) have mixed types. Specify dtype option on import or set low_memory=False.

What does the Columns (5) part mean? Is that the column position? Does the Campaign column start at position 0 or 1?

Also, I suspect this warning appears because my Max. CPC column has ' --' in a few places instead of zeros. I want this column's datatype to be float. How do I translate these ' --' values to 0.00 and also set this column to a float datatype when reading the CSV?

I tried:

import pandas as pd
import numpy as np

# np.float64 is used directly; the pd.np alias is deprecated
df = pd.read_csv('file.csv', dtype={'Max. CPC': np.float64})

print(df.head())

But I get a ValueError:

ValueError: could not convert string to float: ' --'

Recommended answer
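
On the first part of the question: the number in Columns (5) is a zero-based column position, so Campaign is position 0 and position 5 is Max. CPC, which matches the suspicion in the question. A minimal sketch using the column list from the question (the variable names are only for illustration):

```python
# Column names copied from the question, in file order.
columns = ['Campaign', 'Ad group', 'Keyword', 'Status', 'Match type',
           'Max. CPC', 'Quality score', 'Impressions', 'Clicks', 'CTR',
           'Avg. CPC', 'Cost', 'Avg. position', 'Converted clicks',
           'Click conversion rate', 'Cost / converted click', 'Bounce rate',
           'Pages / session', 'Avg. session duration (seconds)',
           '% new sessions']

# The warning reports zero-based positions, so index the list directly.
flagged_position = 5
flagged_column = columns[flagged_position]
print(flagged_column)  # Max. CPC
```

On a loaded DataFrame the same lookup would be df.columns[5].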

There are 2 approaches I can think of. One is to pass read_csv a list of values that it should treat as NaN. The listed values are converted to NaN while parsing, so the column keeps a float dtype instead of falling back to object:

df = pd.read_csv('file.csv', dtype={'Max. CPC': np.float64}, na_values=[' --'])

You can then convert these NaN values to 0.00 by calling fillna:

df['Max. CPC'] = df['Max. CPC'].fillna(0.00)
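
The two steps above can be tried end to end on a small in-memory CSV. This is only a sketch: the sample rows are made up to stand in for file.csv, and '--' without the leading space is added to na_values as an assumption in case the real file is inconsistent about whitespace.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for file.csv.
csv_text = (
    "Keyword,Max. CPC\n"
    "shoes,1.50\n"
    "boots, --\n"
    "hats,0.75\n"
)

# Treat the placeholder as missing so the column can be parsed as float.
df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={'Max. CPC': 'float64'},
    na_values=[' --', '--'],
)

# Replace the resulting NaN values with 0.00.
df['Max. CPC'] = df['Max. CPC'].fillna(0.00)

print(df.dtypes)
print(df)
```

Note that a list passed to na_values applies to every column; passing a dict such as {'Max. CPC': [' --']} limits it to one column.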

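The other is to load the file without the dtype option, replace these values with 0.00, and then convert the column to float (replace on its own leaves the column with an object dtype):

df = pd.read_csv('file.csv')
df['Max. CPC'] = df['Max. CPC'].replace(' --', '0.00').astype(float)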
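
A variation on this second approach, not part of the original answer, is to let pd.to_numeric do the conversion after loading. The sketch below assumes the same made-up sample data as a stand-in for file.csv. Be aware that errors='coerce' turns every unparseable value into NaN, not only ' --', so it can hide other bad entries.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for file.csv.
csv_text = (
    "Keyword,Max. CPC\n"
    "shoes,1.50\n"
    "boots, --\n"
    "hats,0.75\n"
)

# Load without a dtype; the placeholder keeps the column from parsing as float.
raw = pd.read_csv(io.StringIO(csv_text))

# Coerce unparseable entries to NaN, then fill them with 0.00.
raw['Max. CPC'] = pd.to_numeric(raw['Max. CPC'], errors='coerce').fillna(0.00)

print(raw.dtypes)
print(raw)
```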
