Fast string replace

Question

After building up a potentially very large string, I'm going to change a lot of single characters in it (or bytes, if necessary) to other characters.

Actually, my script is building a crossword puzzle, so the string won't be very long, but my question is general:

How can I use the fact that I'm not altering the length of the string (or whatever data type is better) to speed things up?

I guess part of what I'm looking for is a way to send a pointer or reference to the string, or in Tcl's case the variable name.

My other question is what happens internally in the C code.

Will this call copy the entire string zero, one or even two times?

set index [expr {$row * $width + $col}]
set puzzle [string replace $puzzle $index $index "E"]

Recommended answer

The string replace operation will do an in-place change provided two conditions are satisfied:

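  1. The string being inserted must be the same length as the string being removed. I guess this is obvious to you.
  2. The string must be in an unshared reference, so that nothing else can observe the value being modified. (This is a critical part of how all Tcl references work; shared references can't be modified in place.)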

That call, as written, will copy. This is predictable from a simple examination of the reference handling for the string: the old version of the string remains in puzzle until after the string replace completes (the set needs the result to work), so the value is referenced both by the variable and by the command's argument and therefore counts as shared. To fix that, we do this slightly strange thing:

set puzzle [string replace $puzzle[set puzzle {}] $index $index "E"]

Yes, this is weird, but it works well because concatenation with a known-empty string is an explicitly optimised case, assuming you're dealing with untraced variables here. (It'll work with traced variables, but the double write is observable and traces could do tricky things, so you lose optimisation opportunities.)
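For the part of the question about passing the variable name rather than the value, a minimal sketch could combine upvar with the idiom above. The proc name setCell and its parameter names are hypothetical, chosen only for illustration; they are not part of the original answer.

```tcl
# Hypothetical helper (not from the original answer): replace one cell of a
# grid stored as a flat string in the caller's variable named by gridVar.
proc setCell {gridVar width row col ch} {
    # Link the local name "grid" to the caller's variable; this links the
    # variable itself, so the string value is not duplicated here.
    upvar 1 $gridVar grid
    set index [expr {$row * $width + $col}]
    # Empty the variable while reading it, so the value passed to
    # string replace is no longer held by the variable.
    set grid [string replace $grid[set grid {}] $index $index $ch]
    return
}

set width 5
set puzzle [string repeat "." [expr {$width * 3}]]
setCell puzzle $width 1 2 "E"
puts $puzzle
```

The in-place path can only apply when ch has the same length as the span it replaces (one character here); with any other length the call still produces the right result, just by building a new string.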

If you were doing extensive changes that sometimes change the length of things, switching to using lists and lset would be more efficient. The equivalent operations on lists all use the same general reference and in-place semantics, but work on list elements instead of characters.
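As a rough sketch of the list-based alternative, the grid could be kept as a flat list with one element per cell and updated with lset, which takes the variable name directly. The variable names cells, width and height are assumptions made for this example.

```tcl
# Assumed layout for illustration: one list element per cell, row-major.
set width 5
set height 3
set cells [lrepeat [expr {$width * $height}] "."]

# lset is given the variable name, so no unsharing trick is needed at the
# call site; the element may also be longer than one character.
set row 1
set col 2
lset cells [expr {$row * $width + $col}] "E"

# Join the cells back into a string only when one is needed, e.g. for output.
puts [join $cells ""]
```

Because each cell is a list element rather than a character, replacing a cell with a value of a different length does not shift the positions of the other cells.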

The optimisation I'm talking about is in the strcat opcode, and strreplace knows to do the replacement in place when it can, but you don't see that information at the bytecode level; virtually all operations know that.

% tcl::unsupported::disassemble lambda {{puzzle index} {
    set puzzle [string replace $puzzle[set puzzle {}] $index $index "E"]
}}
ByteCode 0x0x7fbff6021c10, refCt 1, epoch 17, interp 0x0x7fbff481e010 (epoch 17)
  Source "\n    set puzzle [string replace $puzzle[set puzzle {}]..."
  Cmds 3, src 74, inst 18, litObjs 2, aux 0, stkDepth 4, code/src 0.00
  Proc 0x0x7fbff601cc90, refCt 1, args 2, compiled locals 2
      slot 0, scalar, arg, "puzzle"
      slot 1, scalar, arg, "index"
  Commands 3:
      1: pc 0-16, src 5-72        2: pc 0-14, src 17-71
      3: pc 2-5, src 40-52
  Command 1: "set puzzle [string replace $puzzle[set puzzle {}] $inde..."
  Command 2: "string replace $puzzle[set puzzle {}] $index $index \"E..."
    (0) loadScalar1 %v0     # var "puzzle"
  Command 3: "set puzzle {}..."
    (2) push1 0     # ""
    (4) storeScalar1 %v0    # var "puzzle"
    (6) strcat 2 
    (8) loadScalar1 %v1     # var "index"
    (10) loadScalar1 %v1    # var "index"
    (12) push1 1    # "E"
    (14) strreplace 
    (15) storeScalar1 %v0   # var "puzzle"
    (17) done 
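
Since the bytecode does not show whether a copy happens, one way to check the effect on a particular Tcl build is to time both forms with the time command. The following is only a sketch: the proc names plainReplace and unsharedReplace, the string size and the iteration count are arbitrary assumptions, and no particular timing result is implied.

```tcl
# Hypothetical benchmark procs; both perform n same-length replacements.
proc plainReplace {puzzle index n} {
    for {set i 0} {$i < $n} {incr i} {
        # The variable still holds the old value during the call.
        set puzzle [string replace $puzzle $index $index "E"]
    }
    return $puzzle
}

proc unsharedReplace {puzzle index n} {
    for {set i 0} {$i < $n} {incr i} {
        # The variable is emptied first. On the first pass the value is
        # still shared with the caller's variable, so in-place modification
        # can only apply from the second pass onwards.
        set puzzle [string replace $puzzle[set puzzle {}] $index $index "E"]
    }
    return $puzzle
}

set big [string repeat "." 200000]
puts "plain:    [time {plainReplace $big 100000 100}]"
puts "unshared: [time {unsharedReplace $big 100000 100}]"
```

Both procs return the same string, so any difference in the reported times comes from how the replacement is carried out, not from what it computes.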
