Python & MySql: Unicode and Encoding

Question

I am parsing JSON data and trying to store some of it in a MySQL database. I am currently getting the Unicode error shown below. How should I handle this?

  • Should I handle it on the database side, and if so, how should I modify my table?
  • Should I handle it on the Python side?

Here is my table structure:

CREATE TABLE yahoo_questions (
   question_id varchar(40) NOT NULL, 
   question_subj varbinary(255), 
   question_content varbinary(255),
   question_userId varchar(40) NOT NULL,
   question_timestamp varchar(40),
   category_id varbinary(20) NOT NULL,
   category_name varchar(40) NOT NULL,
   choosen_answer varbinary(255),
   choosen_userId varchar(40),
   choosen_usernick varchar(40),
   choosen_ans_timestamp varchar(40),
   UNIQUE (question_id)
);

Error while inserting via Python code:

Traceback (most recent call last):
  File "YahooQueryData.py", line 78, in <module>
    +"VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)", (row[2], row[5], row[6], quserId, questionTime, categoryId, categoryName, qChosenAnswer, choosenUserId, choosenNickName, choosenTimeStamp))
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/MySQLdb/cursors.py", line 159, in execute
    query = query % db.literal(args)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/MySQLdb/connections.py", line 264, in literal
    return self.escape(o, self.encoders)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/MySQLdb/connections.py", line 202, in unicode_literal
    return db.literal(u.encode(unicode_literal.charset))
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 204-230: ordinal not in range(256)

Python code segment:

# pushing user id to the url to get full json stack
urlobject = urllib.urlopen(base_url.format(row[2]))
qnadatajson = urlobject.read()
data = json.loads(qnadatajson)
cur.execute("INSERT INTO yahoo_questions (question_id, question_subj, question_content, question_userId, question_timestamp,"
            +"category_id, category_name, choosen_answer, choosen_userId, choosen_usernick, choosen_ans_timestamp)"
            +"VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)", (row[2], row[5], row[6], quserId, questionTime, categoryId, categoryName, qChosenAnswer, choosenUserId, choosenNickName, choosenTimeStamp))

JSON structure:

questions: [
{
Id: "20111201185322AA5HTDc",
Subject: "what are the new pokemon call?",
Content: "I used to know them I stop at dialga and palkia version and I heard there's new ones what's it call
",
Date: "2011-12-01 18:53:22",
Timestamp: "1322794402",

Prior to running the query, I also executed the following in MySQL: SET character_set_client = utf8

This is what the MySQL variables look like:

mysql> SHOW variables LIKE '%character_set%';
+--------------------------+--------------------------------------------------------+
| Variable_name            | Value                                                  |
+--------------------------+--------------------------------------------------------+
| character_set_client     | utf8                                                   |
| character_set_connection | utf8                                                   |
| character_set_database   | latin1                                                 |
| character_set_filesystem | binary                                                 |
| character_set_results    | utf8                                                   |
| character_set_server     | latin1                                                 |
| character_set_system     | utf8                                                   |
| character_sets_dir       | /usr/local/mysql-5.5.10-osx10.6-x86_64/share/charsets/ |
+--------------------------+--------------------------------------------------------+
8 rows in set (0.00 sec)

Recommended answer

I think that your MySQLdb Python library doesn't know it's supposed to encode to utf8, and is encoding to latin1 instead, which is the connection's default character set when none is specified (the 'latin-1' codec named in the traceback).
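The failure in the traceback can be reproduced without a database. The following is a minimal sketch, assuming a made-up sample string (not taken from the question's data) that contains characters outside Latin-1; `can_encode` is a hypothetical helper introduced only for illustration.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample text: "Pokémon" followed by three katakana characters.
# Any code point above U+00FF cannot be represented in Latin-1.
sample = u"Pok\u00e9mon \u30dd\u30b1\u30e2\u30f3"


def can_encode(text, charset):
    """Return True if text can be encoded with the given charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


print(can_encode(sample, "latin-1"))  # False: katakana is outside Latin-1
print(can_encode(sample, "utf-8"))    # True: UTF-8 covers all of Unicode
```

This is the same `u.encode(...)` step that fails inside `unicode_literal` in the traceback, only with the charset passed explicitly.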

When you connect() to your database, pass the charset='utf8' parameter. This should also make a manual SET NAMES or SET character_set_client unnecessary.
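A rough sketch of what that connection setup might look like follows. The helper names `build_connect_kwargs` and `open_connection`, and any host, user, password, or database values passed to them, are assumptions for illustration and do not come from the question; only `charset` and `use_unicode` are the MySQLdb connection parameters of interest.

```python
def build_connect_kwargs(host, user, passwd, db):
    """Collect connection settings, requesting UTF-8 for the client connection."""
    return {
        "host": host,
        "user": user,
        "passwd": passwd,
        "db": db,
        "charset": "utf8",    # the parameter the answer recommends
        "use_unicode": True,  # decode character columns in results to unicode
    }


def open_connection(**kwargs):
    """Open a MySQLdb connection (hypothetical helper).

    MySQLdb is imported here so that build_connect_kwargs can be used
    and inspected even where the driver is not installed.
    """
    import MySQLdb
    return MySQLdb.connect(**kwargs)
```

With a reachable server, something like `conn = open_connection(**build_connect_kwargs("localhost", "someuser", "somepass", "somedb"))` followed by `cur = conn.cursor()` would then let the existing `cur.execute(...)` call pass unicode values as parameters, with the driver encoding them as UTF-8 rather than latin1.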
