Word count with Java 8

Views: 157 | Published: 2018/12/17 11:03:05 | Tags: java, java-8, java-stream

Question

I am trying to implement a word count program in Java 8, but I am unable to make it work. The method must take a string as a parameter and return a Map<String, Integer>.

When I do it the old Java way, everything works fine. But when I try to do it in Java 8 style, it returns a map whose keys are empty, although the occurrence counts are correct.

Here is my code in Java 8 style:

public Map<String, Integer> countJava8(String input) {
    return Pattern.compile("(\\w+)")
                  .splitAsStream(input)
                  .collect(Collectors.groupingBy(e -> e.toLowerCase(),
                                                 Collectors.reducing(0, e -> 1, Integer::sum)));
}

Here is the code I would normally use:

public Map<String, Integer> count(String input) {
    Map<String, Integer> wordcount = new HashMap<>();
    Pattern compile = Pattern.compile("(\\w+)");
    Matcher matcher = compile.matcher(input);

    while (matcher.find()) {
        String word = matcher.group().toLowerCase();
        if (wordcount.containsKey(word)) {
            Integer count = wordcount.get(word);
            wordcount.put(word, ++count);
        } else {
            wordcount.put(word.toLowerCase(), 1);
        }
    }
    return wordcount;
}

The main program:

public static void main(String[] args) {
    WordCount wordCount = new WordCount();
    Map<String, Integer> phrase = wordCount.countJava8("one fish two fish red fish blue fish");
    Map<String, Integer> count = wordCount.count("one fish two fish red fish blue fish");

    System.out.println(phrase);
    System.out.println();
    System.out.println(count);
}

When I run this program, I get the following output:

{ =7, =1}
{red=1, blue=1, one=1, fish=4, two=1}

I thought that splitAsStream would stream the elements matched by the regex as a Stream. How can I correct this?

Solution

The problem seems to be that you are in fact splitting by words, i.e. you are streaming over everything that is not a word, or that is in between words. Unfortunately, there seems to be no equivalent method for streaming the actual match results (hard to believe, but I did not find any; feel free to comment if you know one).
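To make the difference concrete, here is a small sketch that collects what splitAsStream yields for the question's input. The class name SplitDemo and the helper tokens are made up for this illustration and are not part of the original code.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical demo class, only meant to show what splitAsStream produces.
class SplitDemo {
    // Collects the tokens that splitAsStream yields for the given regex.
    static List<String> tokens(String regex, String input) {
        return Pattern.compile(regex)
                      .splitAsStream(input)
                      .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String input = "one fish two fish red fish blue fish";
        // Splitting ON words: only the text between the words is left over.
        System.out.println(tokens("\\w+", input));
        // Splitting on non-words: the words themselves are left over.
        System.out.println(tokens("\\W+", input));
    }
}
```

With "\\w+" the list holds one empty string (the text before the first word) followed by seven single spaces, which matches the two keys and their counts in the question's output. With "\\W+" it holds the eight words.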

Instead, you could just split by non-words, using \W instead of \w. Also, as noted in comments, you can make it a bit more readable by using String::toLowerCase instead of a lambda and Collectors.summingInt.

public static Map<String, Integer> countJava8(String input) {
    return Pattern.compile("\\W+")
                  .splitAsStream(input)
                  .collect(Collectors.groupingBy(String::toLowerCase,
                                                 Collectors.summingInt(s -> 1)));
}
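One caveat with splitting on non-words: if the input starts with a separator (whitespace or punctuation), the stream begins with an empty token, which would be counted under an empty key. The following is a minimal sketch of one way to guard against that. The class name SplitWordCount, the method name countWords and the constant NON_WORD are assumptions for this example, not part of the original answer.

```java
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical variant of the split-based approach that ignores empty tokens.
class SplitWordCount {
    private static final Pattern NON_WORD = Pattern.compile("\\W+");

    static Map<String, Integer> countWords(String input) {
        return NON_WORD.splitAsStream(input)
                       // A leading separator produces one empty token; drop it.
                       .filter(token -> !token.isEmpty())
                       .collect(Collectors.groupingBy(String::toLowerCase,
                                                      Collectors.summingInt(token -> 1)));
    }

    public static void main(String[] args) {
        System.out.println(countWords("  One fish, two FISH!"));
    }
}
```

For the question's sample sentence the filter changes nothing, since that input starts with a word.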

But IMHO this is still hard to comprehend, not least because of the "inverse" lookup, and it is difficult to generalize to other, more complex patterns. Personally, I would just go with the "old school" solution, maybe making it a bit more compact using the new getOrDefault.

public static Map<String, Integer> countOldschool(String input) {
    Map<String, Integer> wordcount = new HashMap<>();
    Matcher matcher = Pattern.compile("\\w+").matcher(input);
    while (matcher.find()) {
        String word = matcher.group().toLowerCase();
        wordcount.put(word, wordcount.getOrDefault(word, 0) + 1);
    }
    return wordcount;
}
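As a possible alternative to getOrDefault, Java 8 also added Map.merge, which folds the "insert 1 or add 1" step into a single call. This is only a sketch of the same loop. The class name MergeWordCount and the method name countWithMerge are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variant of the loop-based approach using Map.merge.
class MergeWordCount {
    static Map<String, Integer> countWithMerge(String input) {
        Map<String, Integer> wordcount = new HashMap<>();
        Matcher matcher = Pattern.compile("\\w+").matcher(input);
        while (matcher.find()) {
            // Inserts 1 for a new word, otherwise adds 1 to the stored count.
            wordcount.merge(matcher.group().toLowerCase(), 1, Integer::sum);
        }
        return wordcount;
    }

    public static void main(String[] args) {
        System.out.println(countWithMerge("one fish two fish red fish blue fish"));
    }
}
```

Whether merge or getOrDefault reads better is largely a matter of taste; both avoid the explicit containsKey check.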

The result seems to be the same in both cases.
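Regarding the missing method for streaming the actual matches: this holds for Java 8, but from Java 9 onward Matcher.results() returns a Stream<MatchResult>. Assuming a Java 9+ runtime, the stream version can then use the same \w+ pattern as the loop. The class name MatchStreamWordCount and the method name countJava9 are made up for this sketch.

```java
import java.util.Map;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical Java 9+ variant: streams the matches instead of the gaps.
class MatchStreamWordCount {
    static Map<String, Integer> countJava9(String input) {
        return Pattern.compile("\\w+")
                      .matcher(input)
                      .results()                 // requires Java 9 or later
                      .map(MatchResult::group)   // the matched word itself
                      .collect(Collectors.groupingBy(String::toLowerCase,
                                                     Collectors.summingInt(word -> 1)));
    }

    public static void main(String[] args) {
        System.out.println(countJava9("one fish two fish red fish blue fish"));
    }
}
```

Because this streams matches rather than split tokens, no empty strings can appear, and the pattern can be swapped for a more complex one without inverting it.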
