pandas to_sql gives unicode decode error


Question

I have a pandas DataFrame that I loaded via read_csv and am trying to push to a database via to_sql. When I attempt

df.to_sql("assessmentinfo_pivot", util.ENGINE)

I get back a UnicodeEncodeError:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 83-84: ordinal not in range(128)

to_sql has no encoding option for specifying utf-8, and the Engine was already created with encoding set to utf-8:

ENGINE = create_engine("mssql+pymssql://" +
                       config.get_local('CEDS_USERNAME') + ':' +
                       config.get_local('CEDS_PASSWORD') + '@' +
                       config.get_local('CEDS_SERVER') + '/' +
                       config.get_local('CEDS_DATABASE'),
                       encoding="utf-8")

Any pandas insight into getting this working properly? Most of my searches led me to people hitting a similar error with to_csv, which is resolved simply by adding encoding="utf-8", but that is unfortunately not an option here.

I tried paring the file down, but it still gives errors even when stripped down to just the headers: http://pastebin.com/F362xGyP

Recommended answer

I experienced the exact same issue with the combination of pymysql and pandas.to_sql.

Update: this worked for me.

Instead of passing the charset as an argument, try attaching it directly to the connection string:

connect_string = 'mysql+pymysql://{}:{}@{}:{}/{}?charset=utf8'.format(DB_USER, DB_PASS, DB_HOST, DB_PORT, DATABASE)
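As a minimal sketch of the same idea, the connection string can be built by a small helper that percent-encodes the credentials and appends the charset as a query parameter. The helper name build_db_url, its driver and charset parameters, and the sample values are hypothetical and only for illustration. Whether the mssql+pymssql driver from the question honors the same query parameter is an assumption to verify against your driver's documentation.

```python
from urllib.parse import quote


def build_db_url(user, password, host, port, database,
                 driver="mysql+pymysql", charset="utf8"):
    """Return a SQLAlchemy-style URL with the charset in the query string.

    Hypothetical helper for illustration; not part of pandas or SQLAlchemy.
    """
    # Percent-encode the credentials so characters such as '@' or ':'
    # inside them do not break URL parsing.
    return "{}://{}:{}@{}:{}/{}?charset={}".format(
        driver, quote(user, safe=""), quote(password, safe=""),
        host, port, database, charset)


# Sample values only; substitute your own settings.
url = build_db_url("user", "p@ss", "localhost", 3306, "mydb")
print(url)

# With SQLAlchemy and pymysql installed, the URL would then be used as:
#   from sqlalchemy import create_engine
#   engine = create_engine(url)
#   df.to_sql("assessmentinfo_pivot", engine)
```

Keeping the charset in the URL means the setting travels with the connection string itself rather than depending on a separate keyword argument being forwarded to the driver.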

The problem seems to happen in pymysql, and the cause of the error seems to be that the encoding you define is not properly forwarded and set when the pymysql connection is created.
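To illustrate the class of failure (this is not pymysql's internal code, just a sketch with made-up sample text), encoding a string that contains non-ASCII characters with the ascii codec raises the same kind of UnicodeEncodeError, while utf-8 handles it:

```python
# Sample header text with non-ASCII characters (invented for illustration).
text = "Évaluation côté élève"

try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError as exc:
    # The ascii codec cannot represent characters above code point 127.
    ascii_ok = False
    print(exc)

# utf-8 can represent every Unicode character, so this round-trips cleanly.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes.decode("utf-8") == text)
```

If the driver falls back to a narrower codec because the configured encoding never reached it, any non-ASCII value in the DataFrame (including column headers) would trigger this error.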

For debugging, I set

encoding = 'utf-8'

which explained it to me.
