BASH: Finding palindromes in a .txt file


Problem description

I have been given a .txt file in which I have to find all the palindromes in the text. A palindrome must have at least 3 letters, and the letters can't all be the same (e.g. AAA does not count).

The output should have two columns: the first is the number of times the word appears and the second is the word itself, e.g.

123 kayak
3 bob
1 dad

#!/bin/bash

tmp=$(mktemp)

awk '{for(x=1;x<=NF;++x)print $x}' "${1}" | tr -d '[:punct:]' | tr -s '[:space:]' | sed -e 's/@//g' -e 's/[0-9]*//g' | sed -r '/^.{,2}$/d' | sort | uniq -c -i > tmp1

This outputs the word counts as it should, ignoring case and dropping punctuation, digits, and words shorter than 3 letters.

However, I am now stumped on how to pull the palindromes out of this. I thought a temp file might be the way, but I don't know where to take it from here.

Any help or guidance is much appreciated.

Recommended answer

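# Modify this to your needs; it should take your input on stdin and return one word per
# line on stdout, in the same order if called more than once with the same input.
preprocess() {
  # Strip punctuation and digits ('@' is already covered by [:punct:]), split into
  # one word per line, then drop words made of a single repeated letter (e.g. aaa).
  # The sed filter has to run after the split so that it sees words, not whole lines.
  tr -d '[:punct:][:digit:]' \
    | tr -s '[:space:]' '\n' \
    | sed -E -e '/^(.)\1+$/d'
}

# Column 1 is each word, column 2 is the same word reversed; a palindrome has both equal.
paste <(preprocess <"$1") <(preprocess <"$1" | rev) \
  | awk '$1 == $2 && (length($1) >= 3) { print $1 }' \
  | sort | uniq -c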

The critical thing here is to paste your preprocessed input (one word per line) together with a stream that has each of those lines reversed. This gives you two separate columns you can compare: a word is a palindrome exactly when both columns match.
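The comparison in the answer is case-sensitive, so a word such as "Bob" would not match its reverse "boB". As a minimal sketch of the same paste-and-reverse idea that also folds case (which the question asks for), something like the following could work. The function name find_palindromes and the temporary-file approach are assumptions for illustration, not part of the original answer; it also assumes `rev` (from util-linux) is installed.

```shell
# Hypothetical helper (not from the original answer): reads text on stdin and
# prints "count word" for every palindrome of 3+ letters that is not a single
# repeated letter.
find_palindromes() {
  local words
  words=$(mktemp)

  # One lower-cased word per line, with punctuation and digits removed.
  tr -d '[:punct:][:digit:]' \
    | tr '[:upper:]' '[:lower:]' \
    | tr -s '[:space:]' '\n' > "$words"

  # paste builds two columns: the word, and the same word reversed by rev.
  # In awk, "rest" is the word with every copy of its first letter removed;
  # it is empty exactly when the word is one letter repeated (e.g. aaa).
  rev "$words" \
    | paste "$words" - \
    | awk '$1 == $2 && length($1) >= 3 {
             rest = $1
             gsub(substr($1, 1, 1), "", rest)
             if (rest != "") print $1
           }' \
    | sort | uniq -c \
    | awk '{ print $1, $2 }'   # normalise uniq padding to "count word"

  rm -f "$words"
}
```

Called as `find_palindromes < input.txt`, it prints one "count word" line per distinct palindrome. Writing the word list to a temporary file means the input is only preprocessed once, so both columns are guaranteed to line up.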
