Python将unicode字符串转换并保存到列表 [英] Python convert and save unicode string to a list

查看:133
本文介绍了Python将unicode字符串转换并保存到列表的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要在列表中插入一系列名称(例如'Alam \ xc3 \ xa9'),然后我必须将它们保存到SQLite数据库中.

I need to insert a series of names (like 'Alam\xc3\xa9') into a list, and than I have to save them into a SQLite database.

我知道我可以通过提示正确显示这些名称:

I know that I can render these names correctly by tiping:

print eval(repr(NAME)).decode("utf-8")

但是我必须将它们插入列表中,所以我不能使用打印

But I have to insert them into a list, so I can't use the print

其他无需打印的方式吗?

Other way for doing this without the print?

推荐答案

这里有很多误解.

您引用的字符串是 not Unicode.这是一个字节字符串,以UTF-8编码.

The string you quote is not Unicode. It is a byte string, encoded in UTF-8.

您可以通过解码将其转换为Unicode:

You can convert it to Unicode by decoding it:

unicode_name = name.decode('utf-8')

在控制台上打印unicode_name的值时,您将看到以下两种情况之一:

When you print the value of unicode_name to the console, you will see one of two things:

>>> unicode_name
u'Alam\xe9'
>>> print unicode_name
Alamé

在这里,您可以看到仅键入名称并按Enter即可显示Unicode代码点的表示形式.这与键入print repr(unicode_name)相同.但是,执行print unicode_name会打印实际的字符-即在幕后,将其编码为适合终端的正确编码,然后打印结果.

Here, you can see that just typing the name and pressing enter shows a representation of the Unicode code points. This is the same as typing print repr(unicode_name). However, doing print unicode_name prints the actual characters - ie behind the scenes, it encodes it to the correct encoding for your terminal, and prints the result.

但这并不重要,因为Unicode字符串只能在内部表示.要将其存储在数据库,文件或任何位置中后,就需要对其进行编码.而且最有可能选择的编码是UTF-8-原来是这种编码.

But this is all irrelevant, because Unicode strings can only be represented internally. As soon as you want to store it in a database, or a file, or anywhere, you need to encode it. And the most likely encoding to choose is UTF-8 - which is what it was in originally.

>>> name
'Alam\xc3\xa9'
>>> print name
Alamé

如您所见,

使用名称的原始未解码版本,reprprint再次显示代码和字符.因此,并不是说将其转换为Unicode实际上会使它真正地"成为了正确的字符.

As you can see, using the original non-decoded version of the name, repr and print once again show the codes and the characters. So it's not that converting it to Unicode actually makes it any more "really" the correct character.

那么,如果要将其存储在数据库中怎么办?没有.没事Sqlite接受UTF-8输入,并将其数据以UTF-8格式存储在磁盘上.因此,完全不需要任何转换即可将name的原始值存储在数据库中.

So, what to do if you want to store it in a database? Nothing. Nothing at all. Sqlite accepts UTF-8 input, and stores its data in UTF-8 format on the disk. So there is absolutely no conversion needed to store the original value of name in the database.

这篇关于Python将unicode字符串转换并保存到列表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆