Most frequently used strings in a file

Views: 125 Published: 2017/11/3 19:56:22 Tags: perl, file, sorting

Problem description

I found a post here where someone managed to read a file, find the most commonly used words, and report how many times each word was used. That script took its input from a command-line argument, but I want the same script to prompt for the file name and then process that file. I can't find what I'm doing wrong.

print "Type the name of the file: ";
chomp(my $file = <>);

open (FILE, "$file") or die;

while (<FILE>){
    $seen{$_}++ for split /\W+/;
}

my $count = 0;
for (sort {
    $seen{$b} <=> $seen{$a}
              ||
       lc($a) cmp lc($b)
              ||
          $a  cmp  $b
} keys %seen)
{
    next unless /\w/;
    printf "%-20s %5d\n", $seen{$_}, $_;
    last if ++$count > 100;
}
close (FILE);

My result at the moment is:

15                       0
15                       0
10                       0
10                       0
10                       0
5                        1
5                        0
5                        0
5                        0
5                        0

The result I want is:

<word>             <number of occurrences>
<word>             <number of occurrences>
<word>             <number of occurrences>
<word>             <number of occurrences>
<word>             <number of occurrences>
<word>             <number of occurrences>

Solution

The line

printf "%-20s %5d\n", $seen{$_}, $_;

is the reverse of what you intended. $_ is a string, and $seen{$_} is the count of how many times $_ appears in the text (a number), so you want to say either

printf "%-20s %5d\n", $_, $seen{$_};

or

printf "%5d %-20s\n", $seen{$_}, $_;
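
For reference, here is a minimal sketch of the whole script with the first fix applied. It is one possible arrangement, not the original poster's final code. The sub names `count_words`, `top_words` and `main` are hypothetical, and the lexical filehandle, three-argument `open`, and `use strict; use warnings;` are additions that the question's code does not have. The swapped arguments also appear to explain the output in the question: `%-20s` received the count, and `%5d` received the word, which Perl converts to 0 when the string is not numeric.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count how often each word occurs in the lines read from a filehandle.
sub count_words {
    my ($fh) = @_;
    my %seen;
    while (my $line = <$fh>) {
        # Split on runs of non-word characters; drop the empty leading
        # field that split produces when a line starts with punctuation.
        $seen{$_}++ for grep { length } split /\W+/, $line;
    }
    return \%seen;
}

# Return up to $limit words, most frequent first. Ties are broken
# case-insensitively, then case-sensitively, as in the question.
sub top_words {
    my ($seen, $limit) = @_;
    my @sorted = sort {
        $seen->{$b} <=> $seen->{$a}
            || lc($a) cmp lc($b)
            || $a cmp $b
    } keys %$seen;
    $#sorted = $limit - 1 if @sorted > $limit;
    return @sorted;
}

sub main {
    print "Type the name of the file: ";
    my $file = <STDIN>;
    return unless defined $file;    # no input (EOF), nothing to do
    chomp $file;

    open(my $fh, '<', $file) or die "Cannot open '$file': $!";
    my $seen = count_words($fh);
    close($fh);

    # Word first (%-20s), then its count (%5d).
    printf "%-20s %5d\n", $_, $seen->{$_} for top_words($seen, 100);
}

# Run the prompt only when executed as a script, not when loaded by other code.
main() unless caller;
```

This sketch caps the output at 100 words. The loop in the question checks `++$count > 100` after printing, so it lets 101 lines through before stopping.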
