How to swap two characters in a string?

Problem description

I want to write a function as follows:

  • Input: String A, int i, 0 < i < len(A)
  • Output: String A with character at (i - 1) swapped with character at i.

What is a clean solution that will achieve this? My current solution is:

// Everything before the character at (i - 1)
let mut swapped = input_str[..i - 1].to_string();
swapped.push(input_str.char_at(i));
swapped.push(input_str.char_at(i - 1));
// Everything after the character at i
swapped.push_str(&input_str[i + 1..]);

But that only works for ASCII strings. I can think of other solutions, such as converting to a vector in UTF-32, swapping there and converting back to a string, but that seems like a lot of extra work.
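
For reference, the UTF-32 idea mentioned above could look roughly like the sketch below. It is only an illustration, not code from the question: the name swap_via_char_vec is made up here, and note that i counts chars in this version rather than bytes. It allocates a temporary Vec<char>, which is the "extra work" the question refers to.

```rust
/// Hypothetical helper: swap the chars at *character* positions `i - 1` and `i`.
/// Returns `None` when `i` is 0 or past the last char.
fn swap_via_char_vec(input_str: &str, i: usize) -> Option<String> {
    // Decode the UTF-8 string into one `char` (a Unicode scalar value) per slot
    let mut chars: Vec<char> = input_str.chars().collect();
    if i == 0 || i >= chars.len() {
        return None;
    }
    chars.swap(i - 1, i);
    // Re-encode the chars as a UTF-8 `String`
    Some(chars.into_iter().collect())
}

#[test]
fn char_vec_swap() {
    assert_eq!(swap_via_char_vec("lyra", 2).as_deref(), Some("lrya"));
    // 'ç' is a single char, so the character index of 'a' is 1 here
    assert_eq!(swap_via_char_vec("ça va?", 1).as_deref(), Some("aç va?"));
    assert_eq!(swap_via_char_vec("a", 1), None);
}
```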

Solution

Here's a pretty solution:

use std::str::CharRange;

fn swap_chars_at(input_str: &str, i: usize) -> String {
    // Pre-allocate a string of the correct size
    let mut swapped = String::with_capacity(input_str.len());
    // Pluck the previous character
    let CharRange { ch: prev_ch, next: prev } = input_str.char_range_at_reverse(i);
    // Pluck the current character
    let CharRange { ch, next } = input_str.char_range_at(i);
    // Put them back
    swapped.push_str(&input_str[..prev]);
    swapped.push(ch);
    swapped.push(prev_ch);
    swapped.push_str(&input_str[next..]);
    // Done!
    swapped
}

#[test]
fn smoke_test() {
    let s = swap_chars_at("lyra", 2);
    assert_eq!(s, "lrya");
}

#[test]
fn unicode() {
    // 'ç' takes up 2 bytes in UTF-8
    let s = swap_chars_at("ça va?", 2);
    assert_eq!(s, "aç va?");
}

From the documentation:

  • fn char_range_at(&self, start: usize) -> CharRange
    • Pluck a character out of a string and return the index of the next character.
  • fn char_range_at_reverse(&self, start: usize) -> CharRange
    • Given a byte position and a str, return the previous char and its position.

Together, these two methods let us peek backwards and forwards in the string—which is exactly what we want.
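
As far as I know, char_range_at and char_range_at_reverse never made it into stable Rust, so the code above may not compile on a current toolchain. The same peek-backwards, peek-forwards idea can be sketched with stable std methods only (str::get, chars().next_back(), char::len_utf8). The name swap_chars_at_stable and the choice to return Option instead of panicking are assumptions of this sketch, not part of the original answer.

```rust
/// Hypothetical stable-Rust variant: swap the char ending at byte offset `i`
/// with the char starting at byte offset `i`.
/// Returns `None` if `i` is out of range, is not on a char boundary,
/// or has no char on one side.
fn swap_chars_at_stable(input_str: &str, i: usize) -> Option<String> {
    // `get` yields `None` instead of panicking on a bad byte offset
    let head = input_str.get(..i)?;
    let tail = input_str.get(i..)?;
    // Peek backwards and forwards from the split point
    let prev_ch = head.chars().next_back()?;
    let ch = tail.chars().next()?;

    // Pre-allocate a string of the correct size
    let mut swapped = String::with_capacity(input_str.len());
    swapped.push_str(&head[..head.len() - prev_ch.len_utf8()]);
    swapped.push(ch);
    swapped.push(prev_ch);
    swapped.push_str(&tail[ch.len_utf8()..]);
    Some(swapped)
}

#[test]
fn stable_swap() {
    assert_eq!(swap_chars_at_stable("lyra", 2).as_deref(), Some("lrya"));
    // 'ç' takes up 2 bytes in UTF-8, so 'a' starts at byte offset 2
    assert_eq!(swap_chars_at_stable("ça va?", 2).as_deref(), Some("aç va?"));
    // Byte offset 1 falls inside 'ç'
    assert_eq!(swap_chars_at_stable("ça va?", 1), None);
}
```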


But wait, there's more! DK pointed out a corner case with the above code. If the input contains any combining characters, they may become separated from the characters they combine with.
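
To make the corner case concrete, here is a small sketch using stable std methods only. The helper swap_chars is a stand-in written for this illustration, not the code from the answer. Swapping at the char level moves the combining cedilla (U+0327) from the 'c' onto the 'a':

```rust
/// Hypothetical char-level swap around byte offset `i`.
/// Panics if `i` is not on a char boundary or a neighbour is missing.
fn swap_chars(input_str: &str, i: usize) -> String {
    let (head, tail) = input_str.split_at(i);
    let prev_ch = head.chars().next_back().expect("no char before i");
    let ch = tail.chars().next().expect("no char at i");
    let mut out = String::with_capacity(input_str.len());
    out.push_str(&head[..head.len() - prev_ch.len_utf8()]);
    out.push(ch);
    out.push(prev_ch);
    out.push_str(&tail[ch.len_utf8()..]);
    out
}

#[test]
fn cedilla_ends_up_on_the_wrong_letter() {
    // "c\u{327}" is 'c' (1 byte) followed by COMBINING CEDILLA (2 bytes),
    // so 'a' starts at byte offset 3
    let s = swap_chars("c\u{327}a va?", 3);
    // The cedilla was swapped on its own and now follows 'a' instead of 'c'
    assert_eq!(s, "ca\u{327} va?");
}
```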

Now, this question is about Rust, not Unicode, so I won't go into the details of how exactly that works. All you need to know for now is that Rust provides this method:

  • fn grapheme_indices(&self, is_extended: bool) -> GraphemeIndices
    • Returns an iterator over the grapheme clusters of self and their byte offsets.

With a healthy application of .find() and .rev(), we arrive at this (hopefully) correct solution:

#![allow(unstable)]  // `GraphemeIndices` is unstable

fn swap_graphemes_at(input_str: &str, i: usize) -> String {
    // Pre-allocate a string of the correct size
    let mut swapped = String::with_capacity(input_str.len());
    // Find the grapheme at index i
    let (_, gr) = input_str.grapheme_indices(true)
        .find(|&(index, _)| index == i)
        .expect("index does not point to a valid grapheme");
    // Find the grapheme just before it
    let (prev, prev_gr) = input_str.grapheme_indices(true).rev()
        .find(|&(index, _)| index < i)
        .expect("no graphemes to swap with");
    // Put it all back together
    swapped.push_str(&input_str[..prev]);
    swapped.push_str(gr);
    swapped.push_str(prev_gr);
    swapped.push_str(&input_str[i+gr.len()..]);
    // Done!
    swapped
}

#[test]
fn combining() {
    // Ensure that "c\u{327}" is treated as a single unit
    let s = swap_graphemes_at("c\u{327}a va?", 3);
    assert_eq!(s, "ac\u{327} va?");
}

Admittedly it's a bit convoluted. First it iterates through the input, plucking out the grapheme cluster at i. Then it iterates backward (.rev()) through the input, picking the rightmost cluster with index < i (i.e. the previous cluster). Finally it goes and puts everything back together.
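
The same three steps can be tried out without any grapheme API, using a deliberately crude stand-in. The sketch below assumes that only chars in the Combining Diacritical Marks block (U+0300 to U+036F) attach to the preceding char. That is far from full Unicode grapheme segmentation, and the names is_combining, cluster_starts and swap_clusters_at are invented for this illustration. For real input, a proper segmentation library (for example the unicode-segmentation crate, which I believe offers a grapheme_indices method) would be the safer choice.

```rust
/// Crude assumption: only U+0300..=U+036F count as combining marks.
fn is_combining(c: char) -> bool {
    ('\u{300}'..='\u{36F}').contains(&c)
}

/// Byte offsets at which an (approximate) cluster starts.
fn cluster_starts(s: &str) -> Vec<usize> {
    s.char_indices()
        .filter(|&(idx, c)| idx == 0 || !is_combining(c))
        .map(|(idx, _)| idx)
        .collect()
}

/// Swap the cluster starting at byte offset `i` with the one before it.
/// Returns `None` if no cluster starts at `i` or there is no previous one.
fn swap_clusters_at(input_str: &str, i: usize) -> Option<String> {
    let starts = cluster_starts(input_str);
    // Find the cluster at index i
    let pos = starts.iter().position(|&s| s == i)?;
    if pos == 0 {
        return None;
    }
    // The cluster just before it, and the end of the current one
    let prev = starts[pos - 1];
    let end = starts.get(pos + 1).copied().unwrap_or(input_str.len());
    // Put it all back together
    let mut swapped = String::with_capacity(input_str.len());
    swapped.push_str(&input_str[..prev]);
    swapped.push_str(&input_str[i..end]);
    swapped.push_str(&input_str[prev..i]);
    swapped.push_str(&input_str[end..]);
    Some(swapped)
}

#[test]
fn combining_approximation() {
    // "c\u{327}" stays together as a single unit
    let s = swap_clusters_at("c\u{327}a va?", 3);
    assert_eq!(s.as_deref(), Some("ac\u{327} va?"));
    // Byte offset 1 points at the cedilla, which does not start a cluster
    assert_eq!(swap_clusters_at("c\u{327}a va?", 1), None);
}
```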

If you're being really pedantic, there are still more special cases to deal with. For example, if the string contains Windows newlines ("\r\n"), then we probably don't want to swap them around. And in Greek, the letter sigma (σ) is written differently when it's at the end of a word (ς), so a better algorithm should translate between them as necessary. And don't forget those bidirectional control characters...

But for the sake of our sanity, we'll stop here.
