Unicode error trying to call Google search API


Question

I need to perform a Google search to retrieve the number of results for a query. I found the answer here: Google Search from a Python App

However, for a few queries I get the error below. I think those queries contain Unicode characters.

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 28: ordinal not in range(128)

I searched Google, read that I need to convert Unicode to ASCII, and found the code below.

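import unicodedata

def convertToAscii(text, action):
    # Python 2: decode the UTF-8 byte string, then re-encode it as ASCII.
    # 'action' is the encode error handler, e.g. 'ignore' or 'xmlcharrefreplace'.
    try:
        temp = unicode(text, "utf-8")
        fixed = unicodedata.normalize('NFKD', temp).encode('ASCII', action)
        return fixed
    except Exception, errorInfo:
        print errorInfo
        print "Unable to convert the Unicode characters to xml character entities"
        raise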

If I use the action 'ignore', it removes those characters, but if I use other actions, I get exceptions.

Any ideas on how to handle this?

Thanks

== Edit == I am using the code below to encode the query before performing the search, and this is the line that throws the error.

query = urllib.urlencode({'q': searchfor})

Recommended answer

You cannot urlencode raw Unicode strings. You need to encode them to UTF-8 first and then pass the resulting byte string to urlencode:

query = urllib.urlencode({'q': u"München".encode('UTF-8')})

This returns q=M%C3%BCnchen, which Google happily accepts.
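For anyone who wants a single helper that runs on both Python 2 and Python 3, the same idea can be sketched as below. The name build_query is hypothetical, not part of urllib, and the sketch assumes that any byte-string input is already UTF-8 encoded. On Python 3, urlencode also accepts text strings directly and encodes them as UTF-8 by default, so the explicit encode step mainly matters on Python 2.

```python
try:
    from urllib import urlencode          # Python 2
except ImportError:
    from urllib.parse import urlencode    # Python 3


def build_query(searchfor):
    """Return a URL-encoded 'q=...' string (hypothetical helper).

    Text input is encoded to UTF-8 first; byte strings are passed
    through unchanged and are assumed to be UTF-8 already.
    """
    if not isinstance(searchfor, bytes):
        searchfor = searchfor.encode('utf-8')
    return urlencode({'q': searchfor})


if __name__ == "__main__":
    print(build_query(u"M\u00fcnchen"))
```

Calling build_query with either the text u"München" or its UTF-8 bytes should produce the same query string, so the caller does not need to track which form it holds.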
