How can I append a string to a line when certain conditions are met?


Problem description

I'm handling large .txt files, and we are trying to identify which lines do not have the correct number of characters (80 characters at most).

For the sake of this example, let's say every line needs exactly 10 characters. For each line that does not have exactly 10 characters, I need to append "(+number of extra characters)" or "(-number of missing characters)".

Here is what I have so far:

while IFS='' read -r line || [[ -n "$line" ]]; do
  if [[ "${#line}" -gt 10 ]]; then
    echo "Mo dan 10 D: ${#line}"
  elif [[ "${#line}" -lt 10 ]]; then
    echo "Less dan 10 D: ${#line}"
  fi

done < "$1"

I'm stuck on finding a way to append the two strings I'm echoing to the corresponding lines so that we can identify them.

I looked into awk and sed, but I haven't been able to loop through the entire .txt file, count the characters in each line, and append the appropriate message.

I would appreciate some help with either a shell script or an awk or sed solution. Thank you.

This is an example input file (note that whitespace also counts as characters):

Line 1****
Line 2*****
Line 3*
Line 4****
Line 5****
Line 6**
Line 7****
Line 8********
Line 9****

This is the desired output:

Line 1****
Line 2*****(+1)
Line 3*(-3)
Line 4****
Line 5****
Line 6**(-2)
Line 7****
Line 8********(+4)
Line 9****

Recommended answer

For performance reasons, using a shell loop to process the lines of a file is the wrong approach (unless the file is very small).
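For the small-file case only, the loop from the question can be adapted so that the suffix is appended to the line itself instead of being echoed separately. The following is a minimal sketch in POSIX shell; the function name `annotate_lines` and its argument order (target length, then file) are assumptions made for illustration, not part of the original question or answer.

```shell
# annotate_lines TARGET FILE   (hypothetical helper name)
# Prints every line of FILE; lines whose length differs from TARGET
# get "(+N)" or "(-N)" appended. The function body runs in a subshell,
# so its variables do not leak into the caller.
annotate_lines() (
  target=$1
  # "|| [ -n "$line" ]" also handles a final line without a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    diff=$(( ${#line} - $target ))
    if [ "$diff" -gt 0 ]; then
      printf '%s(+%d)\n' "$line" "$diff"
    elif [ "$diff" -lt 0 ]; then
      printf '%s(%d)\n' "$line" "$diff"   # diff already carries the minus sign
    else
      printf '%s\n' "$line"
    fi
  done < "$2"
)
```

Called as `annotate_lines 10 input.txt` (with `input.txt` standing in for the real file), this writes the annotated lines to standard output. It is still a line-by-line shell loop, so the performance concern above applies to large files.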

A text-processing utility such as awk is a much better choice:

awk -v targetLen=10 '
  diff = length($0) - targetLen { # input line ($0) does not have the expected length
    $0 = $0 "(" (diff > 0 ? "+" : "") diff ")" # append diff (with +, if positive)
  }
  1  # Print the (possibly modified) line.
' <<'EOF'  # sample input as a here-document
1234567890
123456789
123456789012
EOF

This yields:

1234567890
123456789(-1)
123456789012(+2)
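To run the same logic against a real file rather than a here-document, the awk program can be wrapped in a small shell function that takes the target length and one or more file names. This is a sketch under assumptions: the wrapper name `annotate_with_awk` is hypothetical, and the pattern is written as an explicit `!= 0` comparison, which should behave the same as the bare assignment pattern shown above.

```shell
# annotate_with_awk TARGET FILE...   (hypothetical wrapper name)
# Prints each input line, appending "(+N)" or "(-N)" when its length
# differs from TARGET by N characters.
annotate_with_awk() (
  target_len=$1
  shift
  awk -v targetLen="$target_len" '
    (diff = length($0) - targetLen) != 0 {  # line length differs from the target
      $0 = $0 "(" (diff > 0 ? "+" : "") diff ")"
    }
    1  # print the (possibly modified) line
  ' "$@"
)
```

With the sample file from the question saved under an assumed name such as `input.txt`, `annotate_with_awk 10 input.txt > output.txt` would write the annotated copy to `output.txt` and leave the original untouched.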

Caveat: The BSD/macOS awk implementation is not locale-aware, so its length function counts bytes rather than characters. The approach therefore only works as intended there if the input consists of ASCII-range characters.
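A quick way to check how a given awk behaves is to feed it a single non-ASCII character and look at the reported length. The sketch below sends the two-byte UTF-8 encoding of "é" through awk; the function name `awk_length_mode` is hypothetical, and the result depends on both the awk implementation and the current locale settings, so no particular output is assumed here.

```shell
# awk_length_mode   (hypothetical helper name)
# Reports whether the awk found on PATH counts "é" (UTF-8 bytes, octal
# 303 251) as one character or as two bytes in the current locale.
awk_length_mode() (
  n=$(printf '\303\251\n' | awk '{ print length($0) }')
  case $n in
    1) echo "characters" ;;
    2) echo "bytes" ;;
    *) echo "unknown ($n)" ;;
  esac
)
```

If this prints `bytes`, lines containing multi-byte characters will be reported as longer than they appear, and the appended differences will be off for those lines.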
