How to correctly reduce a stream to another stream


Problem description

I have a stream of strings and nulls, like

Stream<String> str1 = Stream.of("A","B","C",null,null,"D",null,"E","F",null,"G",null);

I want to reduce it to another stream in which every sequence of non-null strings is joined together, i.e. like

Stream<String> str2 = Stream.of("ABC", "", "D", "EF", "G");

The first way I found is to create a collector that first reduces the complete input stream to a single object holding a list of all joined strings, and then creates a new stream from it:

class Acc1 {
  final private List<String> data = new ArrayList<>();
  final private StringBuilder sb = new StringBuilder();

  private void accept(final String s) {
    if (s != null) 
      sb.append(s);
    else {
      data.add(sb.toString());
      sb.setLength(0);
    }
  }

  public static Collector<String,Acc1,Stream<String>> collector() {
    return Collector.of(Acc1::new, Acc1::accept, (a,b)-> a, acc -> acc.data.stream());
  }
}
...
Stream<String> str2 = str1.collect(Acc1.collector());

But in this case the input stream is processed completely before any use of str2, even for str2.findFirst(). That costs time and memory, and on an infinite stream from some generator it will not work at all.
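The eagerness described above can be observed by counting how many source elements are pulled before the first result is even requested. This is a minimal sketch, not the original code: the class name `EagerCollectDemo`, the nested `Acc`, and the helper `consumedBeforeFirstResult` are hypothetical names introduced only for illustration, and the combiner is assumed to be sequential-only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collector;
import java.util.stream.Stream;

class EagerCollectDemo {
    // Same idea as Acc1 above: a null closes the current group.
    static class Acc {
        final List<String> data = new ArrayList<>();
        final StringBuilder sb = new StringBuilder();

        void accept(String s) {
            if (s != null) {
                sb.append(s);
            } else {
                data.add(sb.toString());
                sb.setLength(0);
            }
        }
    }

    static Collector<String, Acc, Stream<String>> collector() {
        // Assumption: sequential use only, so the combiner simply refuses to merge.
        return Collector.of(Acc::new, Acc::accept,
            (a, b) -> { throw new UnsupportedOperationException("sequential only"); },
            acc -> acc.data.stream());
    }

    // Returns how many source elements were consumed by collect() alone.
    static int consumedBeforeFirstResult() {
        AtomicInteger consumed = new AtomicInteger();
        Stream<String> source =
            Stream.of("A", "B", "C", null, null, "D", null, "E", "F", null, "G", null)
                  .peek(s -> consumed.incrementAndGet());
        Stream<String> joined = source.collect(collector());
        // collect() is a terminal operation, so the whole input is already gone here,
        // before findFirst() or any other operation on the result stream runs.
        return consumed.get();
    }

    public static void main(String[] args) {
        System.out.println("consumed by collect(): " + consumedBeforeFirstResult());
    }
}
```

With the twelve-element input from the question, all twelve elements are consumed by collect() itself, which is why this approach cannot work on an unbounded source.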

Another way is to create an external object that keeps the intermediate state and use it in flatMap():

class Acc2 {
  final private StringBuilder sb = new StringBuilder();

  Stream<String> accept(final String s) {
    if (s != null) {
      sb.append(s);
      return Stream.empty();
    } else {
      final String result = sb.toString();
      sb.setLength(0);
      return Stream.of(result);
    }
  }
}
...
Acc2 acc = new Acc2();
Stream<String> str2 = str1.flatMap(acc::accept);

In this case only the elements that are actually accessed via str2 are retrieved from str1.
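To check that claim, the same counting trick can be applied to the flatMap() variant. The following is only a sketch for sequential use: `LazyFlatMapDemo`, `Joiner`, and `firstGroup` are hypothetical names, and `Joiner` mirrors Acc2 from the question.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class LazyFlatMapDemo {
    // Mirrors Acc2: emits a joined group whenever a null is seen.
    static class Joiner {
        private final StringBuilder sb = new StringBuilder();

        Stream<String> accept(String s) {
            if (s != null) {
                sb.append(s);
                return Stream.empty();
            }
            String result = sb.toString();
            sb.setLength(0);
            return Stream.of(result);
        }
    }

    // Returns the first joined group; 'consumed' counts the source elements pulled.
    static String firstGroup(AtomicInteger consumed) {
        Joiner joiner = new Joiner();
        return Stream.of("A", "B", "C", null, null, "D", null, "E", "F", null, "G", null)
                     .peek(s -> consumed.incrementAndGet())
                     .flatMap(joiner::accept)
                     .findFirst()
                     .orElse(null);
    }

    public static void main(String[] args) {
        AtomicInteger consumed = new AtomicInteger();
        System.out.println(firstGroup(consumed) + " after " + consumed.get() + " elements");
    }
}
```

On a sequential stream, findFirst() stops after the first null closes the group "ABC", so only four source elements are pulled. The mutable `Joiner` is still shared state, so this says nothing about parallel execution.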

But using an external object created outside of the stream processing looks ugly to me and can probably cause side effects that I do not see yet. Also, if str2 is later used with parallelStream(), it will produce unpredictable results.

Is there a more correct implementation of stream->stream reduction without these flaws?

Recommended answer

Reduction, or its mutable variant collect, is always an operation that processes all items. Your operation can be implemented via a custom Spliterator, e.g.

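import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public static Stream<String> joinGroups(Stream<String> s) {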
    Spliterator<String> sp=s.spliterator();
    return StreamSupport.stream(
        new Spliterators.AbstractSpliterator<String>(sp.estimateSize(), 
        sp.characteristics()&Spliterator.ORDERED | Spliterator.NONNULL) {
            private StringBuilder sb = new StringBuilder();
            private String last;

            public boolean tryAdvance(Consumer<? super String> action) {
                if(!sp.tryAdvance(str -> last=str))
                    return false;
                while(last!=null) {
                    sb.append(last);
                    if(!sp.tryAdvance(str -> last=str)) break;
                }
                action.accept(sb.toString());
                sb=new StringBuilder();
                return true;
            }
        }, false);
}

which produces the intended groups, as you can verify with

joinGroups(Stream.of("A","B","C",null,null,"D",null,"E","F",null,"G",null))
    .forEach(System.out::println);

but it also has the desired lazy behavior, which you can test via

joinGroups(
    Stream.of("A","B","C",null,null,"D",null,"E","F",null,"G",null)
          .peek(str -> System.out.println("consumed "+str))
).skip(1).filter(s->!s.isEmpty()).findFirst().ifPresent(System.out::println);
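The same laziness check can be written with a counter instead of printed output, which makes the result easier to assert. This is a sketch under the assumption that the first joinGroups variant above is used unchanged; the class name `LazySpliteratorDemo` and the helper `firstNonEmptyAfterSkip` are hypothetical.

```java
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class LazySpliteratorDemo {
    // First variant from the answer: a null (or the end of input) closes a group.
    static Stream<String> joinGroups(Stream<String> s) {
        Spliterator<String> sp = s.spliterator();
        return StreamSupport.stream(
            new Spliterators.AbstractSpliterator<String>(sp.estimateSize(),
                sp.characteristics() & Spliterator.ORDERED | Spliterator.NONNULL) {
                private StringBuilder sb = new StringBuilder();
                private String last;

                @Override
                public boolean tryAdvance(Consumer<? super String> action) {
                    if (!sp.tryAdvance(str -> last = str)) return false;
                    while (last != null) {
                        sb.append(last);
                        if (!sp.tryAdvance(str -> last = str)) break;
                    }
                    action.accept(sb.toString());
                    sb = new StringBuilder();
                    return true;
                }
            }, false);
    }

    // Skips the first group, then returns the first non-empty one.
    static String firstNonEmptyAfterSkip(AtomicInteger consumed) {
        return joinGroups(
                Stream.of("A", "B", "C", null, null, "D", null, "E", "F", null, "G", null)
                      .peek(str -> consumed.incrementAndGet()))
            .skip(1)
            .filter(str -> !str.isEmpty())
            .findFirst()
            .orElse(null);
    }

    public static void main(String[] args) {
        AtomicInteger consumed = new AtomicInteger();
        System.out.println(firstNonEmptyAfterSkip(consumed) + " after " + consumed.get() + " elements");
    }
}
```

The groups "ABC" and "" are produced and discarded, then "D" is found, at which point seven of the twelve source elements have been consumed and the rest are never touched.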

On second thought, I arrived at this slightly more efficient variant. It uses a StringBuilder only if there are at least two Strings to join; otherwise, it simply uses the sole existing String instance, or the literal "" for empty groups:

public static Stream<String> joinGroups(Stream<String> s) {
    Spliterator<String> sp=s.spliterator();
    return StreamSupport.stream(
        new Spliterators.AbstractSpliterator<String>(sp.estimateSize(), 
        sp.characteristics()&Spliterator.ORDERED | Spliterator.NONNULL) {
            private String next;

            public boolean tryAdvance(Consumer<? super String> action) {
                if(!sp.tryAdvance(str -> next=str))
                    return false;
                String string=next;
                if(string==null) string="";
                else if(sp.tryAdvance(str -> next=str) && next!=null) {
                    StringBuilder sb=new StringBuilder().append(string);
                    do sb.append(next);while(sp.tryAdvance(str -> next=str) && next!=null);
                    string=sb.toString();
                }
                action.accept(string);
                return true;
            }
        }, false);
}
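A few edge cases are worth checking against this variant: input that does not end with null, empty input, and input consisting of a single null. The sketch below repeats the method so that it is self-contained; the class name `JoinGroupsEdgeCases` and the helper `groups` are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class JoinGroupsEdgeCases {
    // Second variant from the answer: allocates a StringBuilder only when
    // a group contains at least two strings.
    static Stream<String> joinGroups(Stream<String> s) {
        Spliterator<String> sp = s.spliterator();
        return StreamSupport.stream(
            new Spliterators.AbstractSpliterator<String>(sp.estimateSize(),
                sp.characteristics() & Spliterator.ORDERED | Spliterator.NONNULL) {
                private String next;

                @Override
                public boolean tryAdvance(Consumer<? super String> action) {
                    if (!sp.tryAdvance(str -> next = str)) return false;
                    String string = next;
                    if (string == null) {
                        string = "";
                    } else if (sp.tryAdvance(str -> next = str) && next != null) {
                        StringBuilder sb = new StringBuilder().append(string);
                        do {
                            sb.append(next);
                        } while (sp.tryAdvance(str -> next = str) && next != null);
                        string = sb.toString();
                    }
                    action.accept(string);
                    return true;
                }
            }, false);
    }

    // Convenience helper: joins the given items and collects the groups into a list.
    static List<String> groups(String... items) {
        return joinGroups(Arrays.stream(items)).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(groups("A", "B", "C", null, null, "D", null, "E", "F", null, "G", null));
        System.out.println(groups("A", "B"));        // no trailing null
        System.out.println(groups());                // empty input
        System.out.println(groups((String) null));   // a single null
    }
}
```

Unlike the collector from the question, which only emits a group when it sees a null, this spliterator also emits a final group when the input ends without a trailing null.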
