Encapsulating Unicode from redis

Question

In this first example we save two Unicode strings in a file while delegating to codecs the task of encoding them.

# -*- coding: utf-8 -*-
import codecs
cities = [u'Düsseldorf', u'天津市']
with codecs.open("cities", "w", "utf-8") as f:
    for c in cities:
        f.write(c)

We now do the same thing, first saving the two names to redis, then reading them back and saving what we've read to a file. Because what we've read is already in utf-8 we skip decoding/encoding for that part.

# -*- coding: utf-8 -*-
import redis
r_server = redis.Redis('localhost') #, decode_responses = True)
cities_tag = u'Städte'
cities = [u'Düsseldorf', u'天津市']
for city in cities:
    r_server.sadd(cities_tag.encode('utf8'),
                  city.encode('utf8'))

with open(u'someCities.txt', 'w') as f:
    while r_server.scard(cities_tag.encode('utf8')) != 0:
        city_utf8 = r_server.srandmember(cities_tag.encode('utf8'))
        f.write(city_utf8)
        r_server.srem(cities_tag.encode('utf8'), city_utf8)

How can I replace the line

r_server = redis.Redis('localhost')

with

r_server = redis.Redis('localhost', decode_responses = True)

to avoid the wholesale introduction of .encode/.decode when using redis?

Answer

I'm not sure that there is a problem.

If you remove all of the .encode('utf8') calls in your code it produces a correct file, i.e. the file is the same as the one produced by your current code.

>>> r_server = redis.Redis('localhost')
>>> r_server.keys()
[]
>>> r_server.sadd(u'Hauptstädte', u'東京', u'Godthåb',u'Москва')
3
>>> r_server.keys()
['Hauptst\xc3\xa4dte']
>>> r_server.smembers(u'Hauptstädte')
set(['Godth\xc3\xa5b', '\xd0\x9c\xd0\xbe\xd1\x81\xd0\xba\xd0\xb2\xd0\xb0', '\xe6\x9d\xb1\xe4\xba\xac'])

This shows that keys and values arrive in redis UTF-8 encoded even though unicode strings were passed in: the client encodes them for you before sending, so the explicit .encode('utf8') calls are not required. The default encoding for the redis module is UTF-8. This can be changed by passing an encoding when creating the client, e.g. redis.Redis('localhost', encoding='iso-8859-1'), but there's no reason to.
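
The byte strings in the session above are exactly what UTF-8 encoding produces. A minimal sketch using only the standard library (it does not talk to redis), assuming Python 3 where string literals are already Unicode text; `key` is just an illustrative name:

```python
# Illustration only: reproduces the bytes seen in the redis session above
# without contacting a redis server. Assumes Python 3 (str is Unicode text).
key = 'Hauptst\u00e4dte'  # 'Hauptstädte'

utf8_bytes = key.encode('utf-8')         # bytes under the default UTF-8 encoding
latin1_bytes = key.encode('iso-8859-1')  # bytes if ISO-8859-1 were used instead

print(repr(utf8_bytes))    # b'Hauptst\xc3\xa4dte'
print(repr(latin1_bytes))  # b'Hauptst\xe4dte'

# Decoding with the matching codec recovers the original text.
assert utf8_bytes.decode('utf-8') == key
assert latin1_bytes.decode('iso-8859-1') == key
```

The same key yields different bytes under a different encoding, which is why changing the client encoding without a good reason can make existing keys unreachable.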

If you enable response decoding with decode_responses=True, the responses are converted to unicode using the client connection's encoding. You then don't need to decode the returned data explicitly; the redis module does it for you and gives you back a unicode string:

>>> r_server = redis.Redis('localhost', decode_responses=True)
>>> r_server.keys()
[u'Hauptst\xe4dte']
>>> r_server.smembers(u'Hauptstädte')
set([u'Godth\xe5b', u'\u041c\u043e\u0441\u043a\u0432\u0430', u'\u6771\u4eac'])
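
Conceptually, response decoding amounts to calling `.decode()` on each returned byte string with the client's encoding. The sketch below mimics that on the raw values from the earlier session. `decode_reply` is a hypothetical helper written for illustration, not part of the redis module, and the code assumes Python 3:

```python
# Hypothetical helper, not the redis module's implementation: it only
# illustrates the effect of decode_responses=True. Assumes Python 3.
def decode_reply(reply, encoding='utf-8'):
    """Decode bytes, or every bytes item in a list/set, to text."""
    if isinstance(reply, bytes):
        return reply.decode(encoding)
    if isinstance(reply, (list, set)):
        return type(reply)(decode_reply(item, encoding) for item in reply)
    return reply  # integers and other non-bytes replies pass through


# Raw set members as returned without response decoding.
raw_members = {
    b'Godth\xc3\xa5b',
    b'\xd0\x9c\xd0\xbe\xd1\x81\xd0\xba\xd0\xb2\xd0\xb0',
    b'\xe6\x9d\xb1\xe4\xba\xac',
}
members = decode_reply(raw_members)
print(sorted(members))
```

Non-bytes replies such as the integer returned by sadd are left untouched, which matches the sessions above.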

So, in your second example where you write data retrieved from redis to a file, if you enable response decoding then you need to open the output file with the desired encoding. If this is the default encoding then you can just use open(). Otherwise you can use codecs.open() or manually encode the data before writing to the file.

import codecs

cities_tag = u'Hauptstädte'
with codecs.open('capitals.txt', 'w', encoding='utf8') as f:
    while r_server.scard(cities_tag) != 0:
        city = r_server.srandmember(cities_tag)
        f.write(city + '\n')
        r_server.srem(cities_tag, city)
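
The two options mentioned above, opening the file with an encoding or encoding by hand, should put the same bytes on disk. A sketch with the standard library only: the city names stand in for decoded values read from redis, the file names are placeholders, and Python 3 is assumed (there `io.open` is the built-in `open`).

```python
import io
import os
import tempfile

# Stand-ins for text values returned with decode_responses=True.
cities = ['D\u00fcsseldorf', '\u5929\u6d25\u5e02']  # Düsseldorf, 天津市

tmp_dir = tempfile.mkdtemp()
text_path = os.path.join(tmp_dir, 'text_mode.txt')      # placeholder name
binary_path = os.path.join(tmp_dir, 'binary_mode.txt')  # placeholder name

# Option 1: let the file object encode on write.
with io.open(text_path, 'w', encoding='utf-8', newline='\n') as f:
    for city in cities:
        f.write(city + '\n')

# Option 2: encode manually and write bytes.
with io.open(binary_path, 'wb') as f:
    for city in cities:
        f.write((city + '\n').encode('utf-8'))

# Read both files back as raw bytes and compare.
with io.open(text_path, 'rb') as f:
    text_bytes = f.read()
with io.open(binary_path, 'rb') as f:
    binary_bytes = f.read()

print(text_bytes == binary_bytes)  # True
```

Passing `newline='\n'` in text mode keeps line endings identical on every platform, so the comparison is byte for byte.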
