Use bash to count every word's occurrence in a file

Published: 2021/4/14 20:28:19. Tags: arrays, bash


Problem description

I want to count every word's occurrence in a file, but the result is wrong.

#!/bin/bash
#usage: count.sh file

declare -a dict

for word in $(cat $1)
do
    if [ ${dict[$word]} == "" ] ;then
        dict[$word]=0
    else
        dict[$word]=$[${dict[$word]} + 1]
    fi
done

for word in ${!dict[@]}
do
    echo $word: ${dict[$word]}
done

Using the following test file:

learning the bash shell
this is second line
this is the last line

Running bash -x count.sh file gives this result:

+ declare -a dict
++ cat book
+ for word in '$(cat $1)'
+ '[' '' == '' ']'
+ dict[$word]=0
+ for word in '$(cat $1)'
+ '[' 0 == '' ']'
+ dict[$word]=1
+ for word in '$(cat $1)'
+ '[' 1 == '' ']'
+ dict[$word]=2
+ for word in '$(cat $1)'
+ '[' 2 == '' ']'
+ dict[$word]=3
+ for word in '$(cat $1)'
+ '[' 3 == '' ']'
+ dict[$word]=4
+ for word in '$(cat $1)'
+ '[' 4 == '' ']'
+ dict[$word]=5
+ for word in '$(cat $1)'
+ '[' 5 == '' ']'
+ dict[$word]=6
+ for word in '$(cat $1)'
+ '[' 6 == '' ']'
+ dict[$word]=7
+ for word in '$(cat $1)'
+ '[' 7 == '' ']'
+ dict[$word]=8
+ for word in '$(cat $1)'
+ '[' 8 == '' ']'
+ dict[$word]=9
+ for word in '$(cat $1)'
+ '[' 9 == '' ']'
+ dict[$word]=10
+ for word in '$(cat $1)'
+ '[' 10 == '' ']'
+ dict[$word]=11
+ for word in '$(cat $1)'
+ '[' 11 == '' ']'
+ dict[$word]=12
+ for word in '${!dict[@]}'
+ echo 0: 12
0: 12

Recommended answer

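declare -a dict creates an indexed array, so each key is evaluated as an arithmetic expression and the result is used as a numeric index. A word such as bash is treated as the name of an unset variable, which evaluates to 0, so every word lands in dict[0]; that is why the trace ends with 0: 12. That's not what you want if you're storing things by word. Use declare -A (an associative array, available in bash 4.0 and later) instead.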
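The difference can be seen in a small experiment. This is only an illustrative sketch: the array names indexed and assoc and the two sample words are assumptions made for the demonstration, not part of the original script.

```bash
#!/usr/bin/env bash
# Indexed array: the subscript is evaluated arithmetically, so a word that is
# not a set variable evaluates to 0 and every word shares index 0.
declare -a indexed
for word in bash shell; do
  indexed[$word]=$word
done
indexed_keys="${!indexed[*]}"      # the only key is 0

# Associative array: the subscript is used as a string key.
declare -A assoc
for word in bash shell; do
  assoc[$word]=$word
done
assoc_count=${#assoc[@]}           # one entry per distinct word

echo "indexed keys: $indexed_keys, last value: ${indexed[0]}"
echo "associative entries: $assoc_count"
```

With the indexed array the second word overwrites the first, while the associative array keeps both entries.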

Also, $[ ] is an exceedingly outdated syntax for arithmetic. Even POSIX sh supports $(( )), which you should use instead:

dict[$word]=$(( ${dict[$word]} + 1 ))

Or, to take advantage of bash-only arithmetic syntax:

(( dict[$word]++ ))
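As a rough sketch of how the two forms behave on an associative array, the snippet below applies each once to the same key. The variable names after_expansion and after_increment are made up for the illustration.

```bash
#!/usr/bin/env bash
declare -A dict
word=bash

# Arithmetic expansion: an unset entry expands to nothing, so this is $(( + 1 )).
dict[$word]=$(( ${dict[$word]} + 1 ))
after_expansion=${dict[$word]}

# bash-only arithmetic command. "|| true" guards against the non-zero exit
# status that (( )) returns whenever the expression's value is 0.
(( dict[$word]++ )) || true
after_increment=${dict[$word]}

echo "$after_expansion $after_increment"
```

The exit status matters mainly in scripts that run under set -e: a post-increment from 0 evaluates to 0, so (( )) reports failure even though the counter was updated.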

Also, for word in $(cat $1) is broken in a few ways:

It doesn't quote $1, so for a filename with spaces, it will split the name into several words and try to open each word as a separate file. To fix only this, you would use $(cat "$1") or $(<"$1") (which is more efficient, as it doesn't require starting the external program cat).
It expands the words in the file as globs: if the file contains *, every file name in the current directory will be treated as a word.
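The globbing problem is easy to reproduce. The sketch below is an assumption-laden demonstration: it creates a throwaway directory with mktemp -d, and the file names a.txt, b.txt and input are invented for the test.

```bash
#!/usr/bin/env bash
workdir=$(mktemp -d)
touch "$workdir/a.txt" "$workdir/b.txt"
printf 'count * words\n' >"$workdir/input"
cd "$workdir" || exit 1

# Unquoted command substitution: the * from the file is expanded as a glob.
unquoted=()
for word in $(<input); do
  unquoted+=("$word")
done

# read splits on whitespace but never performs glob expansion.
read -r -a safe <input

cd - >/dev/null || exit 1
rm -rf "$workdir"

echo "for loop saw ${#unquoted[@]} words; read saw ${#safe[@]}"
```

The for loop sees five words because * is replaced by the three file names in the directory, whereas read keeps the literal * and sees three.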

Instead, use a while loop (note that the first occurrence of a word is recorded as 1, not 0):

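declare -A dict   # associative array (bash 4.0+)

# Read one line at a time; -a splits it into words on whitespace, with no glob expansion.
# The || test keeps a final line that lacks a trailing newline.
while read -r -a words || (( ${#words[@]} )); do
  for word in "${words[@]}"; do
    if [[ -n ${dict[$word]} ]] ; then
      dict[$word]=$(( ${dict[$word]} + 1 ))
    else
      dict[$word]=1
    fi
  done
done <"$1"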
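Putting the pieces together, a complete version might look like the following. This is a minimal sketch, assuming words are separated by whitespace; the function name count_words and the temporary sample file are illustrative and not part of the original answer. It reads each line into an array with read -r -a so that both spaces and newlines separate words.

```bash
#!/usr/bin/env bash
# count_words FILE: print "word: count" for each whitespace-separated word.
count_words() {
  local -A dict=()
  local -a words=()
  local word
  # The || test keeps a final line that has no trailing newline.
  while read -r -a words || (( ${#words[@]} )); do
    for word in "${words[@]}"; do
      dict[$word]=$(( ${dict[$word]:-0} + 1 ))
    done
  done <"$1"
  # Associative arrays have no defined key order, so sort for stable output.
  for word in "${!dict[@]}"; do
    printf '%s: %s\n' "$word" "${dict[$word]}"
  done | sort
}

# Try it on the three-line test file from the question.
sample=$(mktemp)
printf '%s\n' 'learning the bash shell' 'this is second line' \
  'this is the last line' >"$sample"
result=$(count_words "$sample")
rm -f "$sample"
printf '%s\n' "$result"
```

For the test file this prints one line per distinct word, for example the: 2 and bash: 1.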
