Extract text between two strings on different lines

Views: 73 | Published: 2020/9/15 6:06:32 | Tags: bash awk sed

Question

I have a big email file with the following random hosts:

......
HOSTS: test-host,host2.domain.com,
host3.domain.com,another-testing-host,host.domain.
com,host.anotherdomain.net,host2.anotherdomain.net,
another-local-host, TEST-HOST

DATE: August 11 2015 9:00
.......

The hosts are always delimited with a comma, but they can be split across one, two, or more lines (I can't control this; unfortunately, it's what email clients do).

So I need to extract all the text between the string "HOSTS:" and the string "DATE:", unwrap it (rejoin the split lines), and replace the commas with newlines, like this:

test-host
host2.domain.com
host3.domain.com
another-testing-host
host.domain.com
host.anotherdomain.net
host2.anotherdomain.net
another-local-host
TEST-HOST

So far I came up with this, but I lose everything that's on the same line as "HOSTS:":

sed '/HOST/,/DATE/!d;//d' ${file} | tr -d '\n' | sed -E "s/,\s*/\n/g"

Recommended answer

Something like this might work for you:

sed -n '/HOSTS:/{:a;N;/DATE/!ba;s/[[:space:]]//g;s/,/\n/g;s/.*HOSTS:\|DATE.*//g;p}' "$file"

Breakdown:

-n                       # Suppress automatic printing of the pattern space
/HOSTS:/ {               # On a line containing the literal HOSTS:
  :a;                    # Label used for branching (goto)
  N;                     # Append the next line to the pattern space
  /DATE/!ba;             # Until the literal DATE appears, branch back to :a
  s/[[:space:]]//g;      # Remove all whitespace, including the embedded newlines
  s/,/\n/g;              # Replace each comma with a newline
  s/.*HOSTS:\|DATE.*//g; # Remove everything up to and including the literal HOSTS:,
                         # and everything from the literal DATE onward
                         # (\| and \n in the replacement are GNU sed extensions)
  p                      # Print the pattern space
}
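
If GNU sed is not available, the same idea can be sketched in awk. This is not the method from the answer above, just a portable alternative under a few assumptions: the markers are the literal strings "HOSTS:" and "DATE:", host names contain no whitespace, and the function name extract_hosts and the sample file are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from the original answer): prints one host per line.
# Reads the files given as arguments, or standard input if there are none.
extract_hosts() {
  awk '
    # Start marker: drop everything up to and including "HOSTS:" and start collecting.
    /HOSTS:/ { sub(/.*HOSTS:/, ""); buf = ""; collecting = 1 }

    # End marker: keep only what precedes "DATE:", then emit the hosts.
    collecting && /DATE:/ {
      sub(/DATE:.*/, "")
      buf = buf $0
      gsub(/[ \t\r]/, "", buf)        # strip spaces, tabs, and carriage returns
      n = split(buf, hosts, ",")      # one array element per comma-separated host
      for (i = 1; i <= n; i++)
        if (hosts[i] != "") print hosts[i]
      collecting = 0
      next
    }

    # Between the markers: join the wrapped lines back together.
    collecting { buf = buf $0 }
  ' "$@"
}

# Sample input mirroring the question (the surrounding lines are assumptions).
file=$(mktemp)
cat > "$file" <<'EOF'
Subject: example
HOSTS: test-host,host2.domain.com,
host3.domain.com,another-testing-host,host.domain.
com,host.anotherdomain.net,host2.anotherdomain.net,
another-local-host, TEST-HOST

DATE: August 11 2015 9:00
Body text
EOF

hosts=$(extract_hosts "$file")
printf '%s\n' "$hosts"
rm -f "$file"
```

With the sample above this should print the nine hosts from the question, one per line, with host.domain.com rejoined across the line break. Because the text on the "HOSTS:" line is trimmed rather than deleted, the first hosts are kept.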
