Separating sections of a text file with a bash script
Question
I have a list:
### To Read:
One Hundred Years of Solitude | Gabriel García Márquez
Moby-Dick | Herman Melville
Frankenstein | Mary Shelley
On the Road | Jack Kerouac
Eyeless in Gaza | Aldous Huxley
### Read:
The Name of the Wind (The Kingkiller Chronicles: Day One) | Patrick Rothfuss | 6-27-2013
The Wise Man’s Fear (The Kingkiller Chronicles: Day Two) | Patrick Rothfuss | 8-4-2013
Vampires in the Lemon Grove | Karen Russell | 12-25-2013
Brave New World | Aldous Huxley | 2-2014
I'd like to use something like Python's string.split(' | ') to separate the various fields into separate strings, but since the two sections have different numbers of fields, I think I need to treat them differently. How do I go about selecting the lines between '### To Read:' and '### Read:', and the lines after '### Read:', and splitting them? Should I use awk or sed?
Recommended answer
You don't say what the final output should look like, but here is a skeleton for an Awk solution.
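awk -F ' [|] ' '
/^### To Read:/ { s = 1; next }      # header line: switch to the To Read section
/^### Read:/    { s = 2; next }      # header line: switch to the Read section
s == 1 { print $1 "," $2 ",\"\"" }   # title, author, empty third field
s == 2 { print $1 "," $2 "," $3 }    # title, author, date
' file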
For lines in the first section, this simply prints an empty third field. The header patterns are case-sensitive, so they must match the file exactly, and the separator is written as ' [|] ' so that the pipe is always taken literally. You can adapt the actions to do anything you like, or rewrite this in Python if you are more familiar with that.
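For readers who prefer Python, the same state-machine idea might look roughly like the sketch below. It is only an illustration under a few assumptions: the function name parse_reading_list and the inline sample are made up for this example, the headers are assumed to appear exactly as '### To Read:' and '### Read:', and fields are assumed to be separated by ' | ' with a space on each side.

```python
def parse_reading_list(lines):
    """Return (title, author, date) tuples; date is "" in the To Read section.

    Hypothetical helper for illustration; it mirrors the Awk skeleton's
    section flag rather than reproducing any existing library API.
    """
    section = None
    rows = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith("### To Read:"):
            section = "to_read"
            continue
        if line.startswith("### Read:"):
            section = "read"
            continue
        if section is None:
            continue  # anything before the first header is skipped
        # Pad so that short lines still yield three fields.
        fields = (line.split(" | ") + ["", "", ""])[:3]
        if section == "to_read":
            fields[2] = ""  # this section carries no date
        rows.append(tuple(fields))
    return rows


# Small sample taken from the list in the question.
sample = """\
### To Read:
Moby-Dick | Herman Melville
### Read:
Brave New World | Aldous Huxley | 2-2014
"""

for row in parse_reading_list(sample.splitlines()):
    print(",".join(row))
```

To run it against a real file, something like parse_reading_list(open(path, encoding="utf-8")) should work, where path is whatever file holds the list.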