bash script grep using variable fails to find result that actually does exist

Views: 61 Published: 2021/12/24 12:30:18 Tags: bash curl sed grep carriage-return

Problem description

I have a bash script that iterates over a list of links, curls down an HTML page per link, greps for a particular string format (syntax: CVE-####-####), removes the surrounding HTML tags (the format is consistent, so no special-case handling is necessary), searches a changelog file for the resulting string ID, and finally does stuff based on whether the string ID was found.

The found string ID is set as a variable. The issue is that grepping for the variable returns no results, even though I positively know there should be results for some of the IDs. Here is the relevant portion of the script:

for link in $(cat links.txt); do
    curl -s "$link" | grep 'CVE-' | sed 's/<[^>]*>//g' | while read cve; do
        echo "$cve"
        grep "$cve" ./changelog.txt
    done
done

If I hardcode a known ID in the grep command, the script finds the ID and returns things as expected. I've tried many variations of grepping on this variable (e.g. exporting it and doing command expansion, cat'ing the changelog and piping to grep, setting variable directly via command expansion of the curl chain, single and double quotes surrounding variables, half a dozen other things).

Am I missing something nuanced with the outputted variable from the curl | grep | sed chain? When it is echo'd to stdout or >> to a file, things look fine (a single ID with no odd characters or carriage returns etc.).

Any hints or alternate solutions would be much appreciated. Thanks!

FYI:

OSX:$ bash --version
GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin14)

Recommended answer

The HTML file that I was curling was chock full of carriage returns. Running the script with set -x was helpful because it revealed the true string being grepped: $'CVE-2011-2716\r'.

+ read -r link
+ curl -s http://localhost:8080/link1.html
+ sed -n '/CVE-/s/<[^>]*>//gp'
+ read -r cve
+ grep -q -F $'CVE-2011-2716\r' ./kernelChangelog.txt

Investigating from another angle, opening the curled file in vim showed ^M, and running printf %s "$cve" | xxd also showed the carriage return hex code 0d appended to the grepped variable. Relying on echo to stdout was the wrong way to diagnose this, because a trailing carriage return is invisible when printed to a terminal. Writing a simple HTML page with a valid CVE-####-#### and then adding a carriage return (in vim insert mode, type ctrl-v ctrl-m to insert one) creates a sample file that fails with the original script snippet above.
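The reproduction described above can be sketched without vim. This is only an illustration under stated assumptions: the temporary file names and the changelog text are made up, printf stands in for typing the carriage return by hand, and od is used in place of xxd because it is available on most systems.

```shell
# Self-contained reproduction; file names and contents are hypothetical.
tmp=$(mktemp -d)
printf 'Fixed CVE-2011-2716 in the DHCP client\n' > "$tmp/changelog.txt"
# HTML fragment with a CRLF line ending (printf replaces ctrl-v ctrl-m in vim).
printf '<td>CVE-2011-2716</td>\r\n' > "$tmp/page.html"

# Same extraction as the script: keep CVE lines, strip the tags.
# Command substitution drops the trailing newline but keeps the carriage return.
cve=$(sed -n '/CVE-/s/<[^>]*>//gp' "$tmp/page.html")

echo "$cve"    # looks like a clean ID on a terminal

# Dump the bytes as hex; the last pair is 0d, the carriage return.
hex=$(printf '%s' "$cve" | od -An -tx1 | tr -d '[:space:]')
echo "$hex"

# The changelog line has no carriage return, so the fixed-string search fails.
if grep -q -F "$cve" "$tmp/changelog.txt"; then
  result="FOUND"
else
  result="NOT FOUND"
fi
echo "$result (${#cve} characters in the variable)"
rm -rf "$tmp"
```

The ID itself is 13 characters long, so a reported length of 14 is another quick hint that something invisible is attached.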

This is pretty standard string sanitization stuff that I should have figured out. The solution is to remove the carriage returns; piping to tr -d '\r' is one way to do that. I'm not sure there is a specific duplicate on SO for this series of steps, but in any case here is my now-working script:

while read -r link; do
  curl -s "$link" | sed -n '/CVE-/s/<[^>]*>//gp' | tr -d '\r' | while read -r cve; do
    if grep -q -F "$cve" ./changelog.txt; then
      echo "FOUND: $cve";
    else
      echo "NOT FOUND: $cve";
    fi;
  done
done < links.txt
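As an alternative to an extra tr process, the carriage return could also be stripped inside the loop with a parameter expansion. The sketch below is not part of the original answer: the temporary files, the second CVE ID, and the changelog text are hypothetical, and the carriage return is built with printf so the snippet does not depend on bash's $'\r' quoting (in bash, ${cve%$'\r'} should behave the same way).

```shell
# Hypothetical sample data: one ID is in the changelog, one is not.
tmp=$(mktemp -d)
printf 'Fixed CVE-2011-2716 in the DHCP client\n' > "$tmp/changelog.txt"
printf 'CVE-2011-2716\r\nCVE-2099-0001\r\n' > "$tmp/ids.txt"   # CRLF line endings

cr=$(printf '\r')   # a literal carriage return
found=""
while IFS= read -r cve; do
  cve=${cve%"$cr"}  # drop one trailing carriage return, if present
  if grep -q -F "$cve" "$tmp/changelog.txt"; then
    found="$found $cve"
  fi
done < "$tmp/ids.txt"

echo "FOUND:$found"
rm -rf "$tmp"
```

Unlike tr -d '\r', this removes only a carriage return at the end of each line, which may matter if the input can legitimately contain one elsewhere.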
