r"string" b"string" u"string" Python 2 / 3 comparison


Question

I already know that r"string" is often used for regex patterns in Python 2.7. I have also seen u"string" used for, I think, Unicode strings. Now with Python 3 we see b"string".

I have searched for these in different sources / questions, such as What does a b prefix before a python string mean?, but it's difficult to see the big picture of all these strings with prefixes in Python, especially with Python 2 vs 3.

Question: would you have a rule of thumb to remember the different types of strings with prefixes in Python? (or maybe a table with a column for Python 2 and one for Python 3?)

NB: I have read a few questions and answers, but I haven't found an easy-to-remember comparison covering all prefixes in Python 2 and 3.

Solution

From the Python docs on literals: https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals

Bytes literals are always prefixed with 'b' or 'B'; they produce an instance of the bytes type instead of the str type. They may only contain ASCII characters; bytes with a numeric value of 128 or greater must be expressed with escapes.

Both string and bytes literals may optionally be prefixed with a letter 'r' or 'R'; such strings are called raw strings and treat backslashes as literal characters. As a result, in string literals, '\U' and '\u' escapes in raw strings are not treated specially. Given that Python 2.x’s raw unicode literals behave differently than Python 3.x’s the 'ur' syntax is not supported.

and

A string literal with 'f' or 'F' in its prefix is a formatted string literal; see Formatted string literals. The 'f' may be combined with 'r', but not with 'b' or 'u', therefore raw formatted strings are possible, but formatted bytes literals are not.
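To make the quoted rules concrete, here is a minimal sketch, assuming Python 3.6 or later (needed for the f prefix). The variable names are made up for illustration and do not come from the docs.

```python
# Assumes Python 3.6+; all names below are illustrative only.

# Bytes literal: ASCII characters only, so bytes >= 128 are written as escapes.
data = b"caf\xc3\xa9"            # UTF-8 encoding of "cafe" with an acute e

# Normal vs. raw string literal: raw keeps the backslash as a character.
cooked = "\n"                    # one character: a newline
raw = r"\n"                      # two characters: a backslash and "n"
raw_u_escape = r"\u00e9"         # six characters, the escape is not processed

# Raw bytes literal: 'r' and 'b' combined.
raw_bytes = rb"\x00"             # four bytes: backslash, "x", "0", "0"

# Raw formatted literal: 'r' and 'f' combined.
# Replacement fields are still evaluated, backslash escapes are not.
name = "pattern"
raw_formatted = rf"\d+{name}"

print(len(cooked), len(raw), len(raw_u_escape), len(raw_bytes))
print(data.decode("utf-8"), raw_formatted)
```

Combining f with b or u (for example fb"..." or fu"...") is rejected by the parser, which is why those forms are not shown as live code.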

So:

  • r means raw
  • b means bytes
  • u means unicode
  • f means format
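As a quick way to check this rule of thumb yourself, the sketch below (assuming Python 3.6+, with a hypothetical dictionary named `literals`) prints the type each prefix produces. Only b changes the type; r, u and f all give a plain str.

```python
# Assumes Python 3.6+; `literals` and `kinds` are illustrative names.
literals = {
    "r": r"a\b",       # raw: a str with the backslash kept
    "b": b"ab",        # bytes
    "u": u"ab",        # unicode: just a str in Python 3
    "f": f"{1 + 1}",   # formatted: a str, the expression is evaluated
}

# Map each prefix to the name of the type it produced.
kinds = {prefix: type(value).__name__ for prefix, value in literals.items()}
print(kinds)  # {'r': 'str', 'b': 'bytes', 'u': 'str', 'f': 'str'}
```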

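The r and b prefixes were already available in Python 2 (b from 2.6 on, where it is accepted but ignored, so b"..." is an ordinary str), and similar prefixes exist in many other languages (they are very handy sometimes).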

Since string literals were not Unicode in Python 2, u-strings were created to offer support for internationalization. As of Python 3, Unicode strings are the default, so "..." is semantically the same as u"...".
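A small sketch of that equivalence, assuming Python 3.3 or later (where the u prefix is accepted). The names are illustrative, and the comments about Python 2 describe its behaviour rather than something this snippet demonstrates.

```python
# Assumes Python 3.3+; names are illustrative only.
plain = "caf\u00e9"       # Python 3: str (Unicode). In Python 2 a plain "..." was a byte string.
prefixed = u"caf\u00e9"   # Python 3: also str. In Python 2 this was the separate `unicode` type.

same_value = plain == prefixed
same_type = type(plain) is type(prefixed)

# Going from text to bytes is an explicit encode step in Python 3.
encoded = plain.encode("utf-8")

print(same_value, same_type, len(plain), len(encoded))
```

The text has 4 characters but 5 bytes in UTF-8, which is the str vs. bytes distinction that the b prefix is about.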

Finally, of these, the f-string is the only one that isn't supported in Python 2 (it was added in Python 3.6).
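If code has to run on both Python 2.7 and 3, one possible workaround (a suggestion on my part, not something the docs quoted above prescribe) is str.format, which gives the same result as a simple f-string. The sketch itself needs Python 3.6+ because it includes the f-string for comparison.

```python
# Assumes Python 3.6+ to run as written; names are illustrative only.
name = "world"

with_fstring = f"hello {name}"           # f-string: Python 3.6+ only
with_format = "hello {0}".format(name)   # str.format: also valid syntax in Python 2.7

print(with_fstring == with_format)
```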
