Deleting non-Unicode characters in Python
Problem description
I am trying to return a request, but it gives me an error saying that there are non-Unicode characters in the string. I am filtering them out, but the filtered string then comes out in unicode style, which crashes the app with a badly formatted response.
This is what I am trying to do:
unfiltered_string = str({'location_id': location.pk, 'name': location.location_name,'address': location.address+', '+location.locality+', '+location.region+' '+location.postcode, 'distance': location.distance.mi, })
filtered_string = str(filter(lambda x: x in string.printable, unfiltered_string)).encode("utf-8")
locations.append(filtered_string)
The trouble is that it appends a string that looks like this:
{'distance': 4.075068111513138, 'location_id': 1368, 'name': u'Stanford University', 'address': u'450 Serra Mall, Stanford, CA 94305'}
when I need each u'string' to be just 'string', like this:
{'distance': 4.075068111513138, 'location_id': 1368, 'name': 'Stanford University', 'address': '450 Serra Mall, Stanford, CA 94305'}
If I try using string.encode('ascii','ignore') then I still get:
"{'location_id': 1368, 'address': u'450 Serra Mall, Stanford, CA 94305', 'distance': 4.075068111513138, 'name': u'Stanford University'}"
and now I get extra quotation marks around the JSON.
Recommended answer
So, I'm going to go out on a limb here and say that your goal is to ignore the Unicode-specific characters that you've got. It's really difficult to say anything definitive without a better explanation in your question, but if you're looking to get a "plain" string instead of a unicode one, I would suggest using the ascii codec for encoding instead of utf-8:
<str>.encode('ascii')
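As a rough illustration of what that call does, here is a minimal sketch. It assumes Python 3 (where `encode` returns `bytes`), whereas the u'...' prefixes in the question suggest Python 2, and the sample strings are made up for the example. With the default error handling, text containing a non-ASCII character raises an error rather than being cleaned:

```python
# Assumption: Python 3, where str is Unicode text and encode() returns bytes.
plain = "Stanford University"
accented = "Caf\u00e9 Stanford"  # illustrative value containing a non-ASCII character

# Pure-ASCII text encodes without complaint.
plain_bytes = plain.encode("ascii")

# With the default 'strict' error handler, non-ASCII text raises an exception.
try:
    accented.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

print(plain_bytes)
print(strict_failed)
```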
If you want to remove the other characters, the encode function takes an optional second argument that lets you ignore all characters the specified codec can't handle:
<str>.encode('ascii', 'ignore')
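One way this might be applied to the data in the question is sketched below. It assumes Python 3, the helper name `to_ascii` and the sample record are hypothetical, and the use of `json.dumps` is a suggestion that goes beyond the original answer, on the assumption that the app expects JSON. The idea is to clean each string value individually and serialize the dictionary itself, rather than calling `str()` on the whole dictionary and filtering its printed form:

```python
import json


def to_ascii(text):
    """Hypothetical helper: drop every character the ascii codec cannot encode."""
    return text.encode("ascii", "ignore").decode("ascii")


# Sample record modeled on the question's output; the values are illustrative.
location = {
    "location_id": 1368,
    "name": "Caf\u00e9 Stanford",
    "address": "450 Serra Mall, Stanford, CA 94305",
    "distance": 4.075068111513138,
}

# Clean only the string values; numbers are left untouched.
cleaned = {
    key: to_ascii(value) if isinstance(value, str) else value
    for key, value in location.items()
}

# Serialize the dict as JSON instead of relying on str(dict).
payload = json.dumps(cleaned, sort_keys=True)
print(payload)
```

Because the dictionary is serialized directly, the output uses JSON's double-quoted strings instead of Python's repr of each value, which is where the u'...' prefixes in the question come from.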