urllib.quote() throws KeyError

Question

To encode a URI I use urllib.quote("schönefeld"), but when the string contains non-ASCII characters it throws:

KeyError: u'\xe9'
Code: return ''.join(map(quoter, s))

My input strings are köln, brønshøj, schönefeld, etc.

On Windows (Python 2.7, PyScripter IDE), simply printing the quoted value works. On Linux the same call raises the exception, although I would not expect the platform to matter.

This is what I am trying:

from commands import getstatusoutput
from urllib import quote  # quote() was used below without being imported

queryParams = "schönefeld"
cmdString = "http://baseurl" + quote(queryParams)
print getstatusoutput(cmdString)

Digging into the cause: inside urllib.quote(), the exception is actually thrown at return ''.join(map(quoter, s)).

The code in urllib is:

def quote(s, safe='/'):
    if not s:
        if s is None:
            raise TypeError('None object cannot be quoted')
        return s
    cachekey = (safe, always_safe)
    try:
        (quoter, safe) = _safe_quoters[cachekey]
    except KeyError:
        safe_map = _safe_map.copy()
        safe_map.update([(c, c) for c in safe])
        quoter = safe_map.__getitem__
        safe = always_safe + safe
        _safe_quoters[cachekey] = (quoter, safe)
    if not s.rstrip(safe):
        return s
    return ''.join(map(quoter, s))

The exception comes from ''.join(map(quoter, s)): the quoter function is called for every element of s, and the resulting list is then joined with '' and returned.

For the non-ASCII character è, the equivalent entry is %E8, which is present in the _safe_map variable. But when I call quote('è'), it looks up the key \xe8. That key does not exist, so the exception is thrown.
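To make the lookup concrete, here is a minimal sketch of a table shaped like _safe_map. It is a simplified model, not the library's code: build_safe_map and ALWAYS_SAFE are names made up for this illustration. The point it shows is that the table has one entry per byte value, keyed by that single byte, and that %E8 is the value stored under the byte key \xe8, so a unicode character is not one of the keys.

```python
ALWAYS_SAFE = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
               "abcdefghijklmnopqrstuvwxyz"
               "0123456789_.-")


def build_safe_map(safe="/"):
    """Simplified model: one entry per byte value, keyed by that single byte."""
    keep = ALWAYS_SAFE + safe
    table = {}
    for i in range(256):
        key = bytes(bytearray([i]))  # one-byte string on Python 2 and 3
        char = chr(i)
        # Safe ASCII characters map to themselves, everything else to %XX.
        table[key] = char if (i < 128 and char in keep) else "%{:02X}".format(i)
    return table


safe_map = build_safe_map()
print(safe_map[b"a"])      # a safe byte maps to itself
print(safe_map[b"\xe8"])   # the byte 0xE8 maps to its escape

try:
    safe_map[u"\xe8"]      # a unicode character is not one of the byte keys
except KeyError as exc:
    print("KeyError: %r" % (exc.args[0],))
```

One difference from real Python 2 behavior: there, an ASCII unicode character such as u'a' compares and hashes equal to the byte string 'a', so only non-ASCII unicode characters miss the table. On Python 3 this model rejects every text character, so treat it as an illustration of the key types only.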

So I just added s = [el.upper().replace("\\X","%") for el in s] before the call to ''.join(map(quoter, s)), within a try-except block. Now it works fine.

But I am worried: is what I have done the correct approach, or will it create other issues? I also have 200+ Linux instances, and deploying this fix to all of them would be very tough.

Recommended answer

You are trying to quote Unicode data, so you need to decide how to turn that into URL-safe bytes.

Encode the string to bytes first. UTF-8 is often used:

>>> import urllib
>>> urllib.quote(u'sch\xe9nefeld')
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py:1268: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  return ''.join(map(quoter, s))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 1268, in quote
    return ''.join(map(quoter, s))
KeyError: u'\xe9'
>>> urllib.quote(u'sch\xe9nefeld'.encode('utf8'))
'sch%C3%A9nefeld'

However, the encoding depends on what the server will accept. It's best to stick to the encoding the original form was sent with.
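One way to apply this without touching urllib.py itself is a small wrapper in your own code. The sketch below is only an illustration: quote_text is a hypothetical helper name, and UTF-8 as the default encoding is an assumption you should replace with whatever your server expects. The import fallback is there so the same snippet also runs on Python 3, where the function lives in urllib.parse.

```python
try:
    from urllib import quote        # Python 2
except ImportError:
    from urllib.parse import quote  # Python 3


def quote_text(value, encoding="utf-8", safe="/"):
    """Percent-encode value, encoding text to bytes first (hypothetical helper)."""
    if not isinstance(value, bytes):
        # Unicode text: turn it into bytes before quote() sees it.
        value = value.encode(encoding)
    return quote(value, safe=safe)


for name in (u"k\xf6ln", u"br\xf8nsh\xf8j", u"sch\xf6nefeld"):
    print(quote_text(name))

# The same text under a different encoding gives different escapes.
print(quote_text(u"sch\xf6nefeld", encoding="latin-1"))
```

Because the wrapper lives in application code, the standard library on each machine stays unmodified.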
