Using streams to manipulate a String


Question

Let's say that I want to remove all the non-letters from my String.

String s = "abc-de3-2fg";

I can use an IntStream in order to do that:

s.chars().filter(ch -> Character.isLetter(ch)).  // But then what?

What can I do in order to convert this stream back to a String instance?

On a different note, why can't I treat a String as a stream of objects of type Character?

String s = "abc-de3-2fg";

// Yields a Stream of char[], therefore doesn't compile
Stream<Character> stream = Stream.of(s.toCharArray());

// Yields a stream with one member - s, which is a String object. Doesn't compile
Stream<Character> stream = Stream.of(s);

According to the javadoc, the Stream's creation signature is as follows:


Stream.of(T... values)

The only (lousy) way that I could think of is:

String s = "abc-de3-2fg";
Stream<Character> stream = Stream.of(s.charAt(0), s.charAt(1), s.charAt(2), ...)

And of course, this isn't good enough... What am I missing?

Recommended answer

Here's an answer to the second part of the question. If you have an IntStream resulting from calling string.chars(), you can get a Stream<Character> by casting each value to char inside mapToObj, which then boxes the result. For example, here's how to turn a String into a Set<Character>:

Set<Character> set = string.chars()
    .mapToObj(ch -> (char)ch)
    .collect(Collectors.toSet());

Note that casting to char is essential for the boxed result to be Character instead of Integer.
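
To make the difference concrete, here is a minimal sketch comparing the cast with plain boxing. The class name BoxingDemo and its two helper methods are made up for this illustration; only chars(), mapToObj(), boxed() and Collectors.toList() come from the JDK.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the original answer.
final class BoxingDemo {
    // The (char) cast makes the lambda return a char, which is boxed as Character.
    static List<Character> asCharacters(String s) {
        return s.chars()
                .mapToObj(ch -> (char) ch)
                .collect(Collectors.toList());
    }

    // Without the cast, the int values are boxed as Integer instead.
    static List<Integer> asIntegers(String s) {
        return s.chars()
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(asCharacters("ab")); // prints [a, b]
        System.out.println(asIntegers("ab"));   // prints [97, 98]
    }
}
```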

Now the big problem with dealing with char or Character data is that supplementary characters are represented as surrogate pairs of char values, so any algorithm that deals with individual char values will probably fail when presented with supplementary characters.

(It may seem like supplementary characters are an obscure Unicode feature that we don't need to worry about, but most emoji, though not all, are supplementary characters.)

Consider this example:

string.chars()
      .filter(Character::isAlphabetic)
      ...

This will give the wrong result for a string that contains the code point U+1D400 (Mathematical Bold Capital A): the filter silently drops it. That code point is represented as a surrogate pair in the string, and neither char value of a surrogate pair is an alphabetic character. To get the correct result, you'd need to do this instead:

string.codePoints()
      .filter(Character::isAlphabetic)
      ...

I recommend always using codePoints().
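
The difference between chars() and codePoints() can be checked with a small sketch like the following. The class name SupplementaryDemo, the SAMPLE constant and the helper methods are assumptions made up for this illustration; the sample string is "A" followed by U+1D400.

```java
// Hypothetical demo class, not part of the original answer.
final class SupplementaryDemo {
    // "A" followed by U+1D400, which occupies two char values (a surrogate pair).
    static final String SAMPLE = "A" + new String(Character.toChars(0x1D400));

    // Counts alphabetic values when the string is viewed as individual char values.
    static long alphabeticViaChars(String s) {
        return s.chars().filter(Character::isAlphabetic).count();
    }

    // Counts alphabetic values when the string is viewed as whole code points.
    static long alphabeticViaCodePoints(String s) {
        return s.codePoints().filter(Character::isAlphabetic).count();
    }

    public static void main(String[] args) {
        System.out.println(alphabeticViaChars(SAMPLE));      // prints 1
        System.out.println(alphabeticViaCodePoints(SAMPLE)); // prints 2
    }
}
```

With chars(), only the plain "A" passes the filter, because each half of the surrogate pair is tested on its own.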

Now, given an IntStream of code points, how can one reassemble it into a String? Sleiman Jneidi's answer is a reasonable one (+1), using the three-arg collect() method of IntStream.
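
For reference, a three-arg collect() call of that kind might look like the sketch below. This is an assumption about the general shape, not necessarily the exact code from the referenced answer; the class name LettersOnly and the method lettersOnly are hypothetical.

```java
// Hypothetical demo class, not part of the original answer.
final class LettersOnly {
    // Keeps only the letters of 's' and reassembles them into a String.
    static String lettersOnly(String s) {
        return s.codePoints()
                .filter(Character::isLetter)
                .collect(StringBuilder::new,             // supplier: a fresh builder
                         StringBuilder::appendCodePoint, // accumulator: add one code point
                         StringBuilder::append)          // combiner: merge partial builders
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(lettersOnly("abc-de3-2fg")); // prints abcdefg
    }
}
```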

Here's another alternative:

StringBuilder sb = ... ;
string.codePoints()
      .filter(...)
      .forEachOrdered(sb::appendCodePoint);
return sb.toString();

This might be a bit more flexible, in cases where you already have a StringBuilder that you're using to accumulate string data. You don't have to create a new StringBuilder each time, nor do you have to convert it to a String afterwards.
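
As a rough, self-contained version of the forEachOrdered approach, the sketch below fills in the elided parts with assumptions: the filter is Character::isLetter, and the class name AppendLetters and the method appendLetters are made up for illustration. It reuses one StringBuilder across two calls.

```java
// Hypothetical demo class, not part of the original answer.
final class AppendLetters {
    // Appends only the letters of 'input' to an existing builder and returns that builder.
    static StringBuilder appendLetters(StringBuilder sb, String input) {
        input.codePoints()
             .filter(Character::isLetter)          // assumed filter for this example
             .forEachOrdered(sb::appendCodePoint); // preserves encounter order
        return sb;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("letters: ");
        appendLetters(sb, "abc-de3-2fg");
        appendLetters(sb, "-h1i");
        System.out.println(sb); // prints letters: abcdefghi
    }
}
```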
