How do I check if a string is unicode or ascii?
Question
What do I have to do in Python to figure out which encoding a string has?
Recommended answer
In Python 3, all strings are sequences of Unicode characters. There is a separate bytes type that holds raw bytes.
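As a minimal sketch of that Python 3 distinction, the type check below separates text from raw bytes. The function name describe and the returned labels are illustrative choices, not part of the original answer.

```python
def describe(value):
    """Report whether a value is text, raw bytes, or neither (Python 3)."""
    if isinstance(value, str):
        # In Python 3, str always holds Unicode code points.
        return "text (str)"
    if isinstance(value, (bytes, bytearray)):
        # bytes and bytearray hold raw 8-bit values with no attached encoding.
        return "raw bytes"
    return "not a string"


print(describe("abc"))
print(describe(b"abc"))
print(describe(42))
```

Note that the check says nothing about which encoding produced a bytes value; that information is not stored in the object.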
In Python 2, a string may be of type str or of type unicode. You can tell which using code something like this:
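# Python 2 only: the print statement and the unicode type do not exist in Python 3.
def whatisthis(s):
    if isinstance(s, str):
        # In Python 2, str is a byte string.
        print "ordinary string"
    elif isinstance(s, unicode):
        print "unicode string"
    else:
        print "not a string"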
This does not distinguish "Unicode or ASCII"; it only distinguishes Python types. A Unicode string may consist purely of characters in the ASCII range, and a bytestring may contain ASCII, encoded Unicode, or even non-textual data.
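If the real question is whether the content stays within the ASCII range, that has to be tested separately from the type. The following is a rough Python 3 sketch under that assumption; the helper name is_ascii is hypothetical and not from the original answer. It works by attempting an ASCII encode or decode and catching the failure.

```python
def is_ascii(value):
    """Return True if every character (str) or byte (bytes) is in the ASCII range."""
    if isinstance(value, str):
        try:
            value.encode("ascii")  # fails on any code point above U+007F
        except UnicodeEncodeError:
            return False
        return True
    if isinstance(value, (bytes, bytearray)):
        try:
            value.decode("ascii")  # fails on any byte above 0x7F
        except UnicodeDecodeError:
            return False
        return True
    raise TypeError("expected str, bytes, or bytearray")


print(is_ascii("hello"))
print(is_ascii("h\u00e9llo"))
print(is_ascii("h\u00e9llo".encode("utf-8")))
```

A True result for a bytes value only means that every byte is below 0x80. It does not prove the data was meant as text, which is the caveat made above about non-textual data.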