How to extract email headers extending on multiple lines from file
Problem description
I am trying to extract the To header from an email file using sed on Linux.
The problem is that the To header can span multiple lines.
For example:
To: name1@mydomain.org, name2@mydomain.org,
    name3@mydomain.org, name4@mydomain.org,
    name5@mydomain.org
Message-ID: <46608700.369886.1549009227948@domain.org>
I tried the following:
sed -n -e '/^[Tt]o: / { N; p; }' _message_file_ |
awk '{$1=$1;printf("%s ",$0)};NR%2==0{print ""}'
The sed command extracts the line starting with To and the line after it. I pipe the output to awk to put everything on a single line.
The full command prints one line:
To: name1@mydomain.org, name2@mydomain.org, name3@mydomain.org, name4@mydomain.org
I don't know how to keep going: how to test whether each following line starts with whitespace and, if so, add it to the result.
What I want is all of the addresses:
To: name1@mydomain.org, name2@mydomain.org, name3@mydomain.org, name4@mydomain.org, name5@mydomain.org
Any help would be appreciated.
Recommended answer
formail is a good solution, but here's how to do it with sed:
sed -e '/^$/q;/^To:/!d;n;:c;/^\s/!d;n;bc' message_file
/^$/q; - (optional) quit if we run out of headers
/^To:/!d; - if not a To: header, stop processing this line
n; - otherwise, implicitly print it, and load the next line
:c; - c is a label we can branch to
/^\s/!d; - if not a continuation, stop processing this line
n; - otherwise, implicitly print it, and load the next line
bc - branch back to label c (i.e. loop)
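The sed command above prints the To header on as many lines as it occupies in the file, while the question asks for a single line. One possible way to get there is to reuse the awk idea from the question as a second step. The following is only a sketch under a few assumptions: extract_to is a hypothetical helper name, the script is split over several -e options and uses [[:space:]] instead of \s (which is a GNU extension) so that it does not depend on GNU sed, and the file is assumed to contain a single To header.

```shell
#!/bin/sh
# extract_to FILE
# Hypothetical helper (not part of the original answer): prints the To
# header of FILE unfolded onto a single line.
extract_to() {
    # Step 1: the same sed logic as in the answer, one command per -e.
    #   /^$/q             quit at the blank line that ends the headers
    #   /^To:/!d          drop every line that is not the To header
    #   n                 print the current line and load the next one
    #   :c ... bc         loop while the line starts with whitespace
    sed -e '/^$/q' \
        -e '/^To:/!d' \
        -e 'n' \
        -e ':c' \
        -e '/^[[:space:]]/!d' \
        -e 'n' \
        -e 'bc' "$1" |
    # Step 2: join the printed lines with single spaces.
    #   NF        skips the blank line that q prints before exiting
    #   $1 = $1   rebuilds the record, trimming leading whitespace
    awk 'NF { $1 = $1; out = out (out == "" ? "" : " ") $0 }
         END { if (out != "") print out }'
}
```

Called as extract_to message_file on the example from the question, this should print the To header with all five addresses on one line.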