Python expandtabs string operation

Problem description

I am learning Python and got to the expandtabs method. This is the official definition in the docs:

string.expandtabs(s[, tabsize])

Expand tabs in a string replacing them by one or more spaces, depending on the current column and the given tab size. The column number is reset to zero after each newline occurring in the string. This doesn’t understand other non-printing characters or escape sequences. The tab size defaults to 8.

What I understood from this is that the default tab size is 8, and that we can pass other values to change it.

When I tried this in the shell with the following inputs, I got:

>>> str = "this is\tstring"
>>> print str.expandtabs(0)
this isstring
>>> print str.expandtabs(1)
this is string
>>> print str.expandtabs(2)
this is string
>>> print str.expandtabs(3)
this is  string
>>> print str.expandtabs(4)
this is string
>>> print str.expandtabs(5)
this is   string
>>> print str.expandtabs(6)
this is     string
>>> print str.expandtabs(7)
this is       string
>>> print str.expandtabs(8)
this is string
>>> print str.expandtabs(9)
this is  string
>>> print str.expandtabs(10)
this is   string
>>> print str.expandtabs(11)
this is    string

So here,

  • 0 removes the tab character entirely,
  • 1 is exactly like the default 8,
  • but 2 is exactly like 1, and then
  • 3 is different,
  • and then again 4 is like using 1,

and after that it increases up to 8, which is the default, and then increases again after 8. But why the weird pattern in the numbers from 0 to 8? I know it is supposed to start from 8, but what is the reason?

Solution

str.expandtabs(n) is not equivalent to str.replace("\t", " " * n).
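To make the difference concrete, here is a small sketch that compares the two calls on the string from the question. The variable names are only illustrative.

```python
s = "this is\tstring"
n = 4

# expandtabs pads only up to the next tab stop (column 8 here),
# so the tab after the 7-character prefix becomes a single space.
expanded = s.expandtabs(n)

# replace always inserts exactly n spaces, regardless of the column.
replaced = s.replace("\t", " " * n)

print(repr(expanded))  # 'this is string'
print(repr(replaced))  # 'this is    string'
```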

str.expandtabs(n) keeps track of the current cursor position on each line, and replaces each tab character it finds with the number of spaces from the current cursor position to the next tab stop. The tab stops are taken to be every n characters.

This is fundamental to the way tabs work, and is not specific to Python. See this answer to a related question for a good explanation of tab stops.
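The tab-stop rule described above also accounts for the pattern in the question. The tab follows the 7-character prefix "this is", so the number of spaces for a tab size n should be n - 7 % n, or zero when n is 0. The sketch below checks that arithmetic against the built-in method; spaces_for is a helper name made up for this illustration.

```python
text = "this is\tstring"
prefix_len = len("this is")  # the tab sits at column 7

def spaces_for(n):
    # Distance from column 7 to the next multiple of n;
    # a tab size of 0 removes the tab entirely.
    return n - prefix_len % n if n > 0 else 0

widths = {n: spaces_for(n) for n in range(12)}

# Confirm the formula agrees with str.expandtabs for every size tried.
for n, w in widths.items():
    assert text.expandtabs(n) == "this is" + " " * w + "string"

print(widths)
# {0: 0, 1: 1, 2: 1, 3: 2, 4: 1, 5: 3, 6: 5, 7: 7, 8: 1, 9: 2, 10: 3, 11: 4}
```

Sizes 1, 2, 4 and 8 all give a single space because column 7 is exactly one position short of a multiple of each of them.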

str.expandtabs(n) is roughly equivalent to:

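def expandtabs(string, n):
    result = ""
    pos = 0  # current column on the current line
    for char in string:
        if char == "\t":
            # instead of the tab character, append the
            # number of spaces to the next tab stop
            # (no spaces at all when n <= 0)
            char = " " * (n - pos % n) if n > 0 else ""
            # the cursor is now on a tab stop, so the column is a
            # multiple of n and can be restarted from 0
            pos = 0
        elif char == "\n":
            pos = 0
        else:
            pos += 1
        result += char
    return result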

And an example of use:

>>> input = "123\t12345\t1234\t1\n12\t1234\t123\t1"
>>> print(expandtabs(input, 10))
123       12345     1234      1
12        1234      123       1

Note how each tab character ("\t") has been replaced with the number of spaces that causes it to line up with the next tab stop. In this case, there is a tab stop every 10 characters because I supplied n=10.
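As a cross-check, the built-in str.expandtabs can be run on the same input to see where each field starts. This is only a sketch; data, expanded and field_starts are illustrative names.

```python
data = "123\t12345\t1234\t1\n12\t1234\t123\t1"
expanded = data.expandtabs(10)

field_starts = []
for line in expanded.split("\n"):
    # A field starts at any non-space character that is first on the
    # line or directly follows a space.
    starts = [
        i for i, ch in enumerate(line)
        if ch != " " and (i == 0 or line[i - 1] == " ")
    ]
    field_starts.append(starts)
    print(line, starts)
```

Every field starts at a multiple of 10 on both lines, which shows the column count restarting after the newline.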
