Create a crontab command that runs a stream editor and rewrites a file every 15 minutes
Question
Let's say I have a file with patterns to match in another file:
file_names.txt
pfg022G
pfg022T
pfg068T
pfg130T
pfg181G
pfg181T
pfg424G
pfg424T
I would like to take the names in file_names.txt and apply them to example.conf with a sed command:
example.conf
{
"ExomeGermlineSingleSample.sample_and_unmapped_bams": {
"flowcell_unmapped_bams": ["/groups/cgsd/alexandre/gatk-workflows/src/ubam/pfg022G.unmapped.bam"],
"unmapped_bam_suffix": ".unmapped.bam",
"sample_name": "pfg022G",
"base_file_name": "pfg022G.GRCh38DH.target",
"final_gvcf_base_name": "pfg022G.GRCh38DH.target"
},
The sed command would replace pfg022G in example.conf with pfg022T, which is the next item in file_names.txt (sed s/pfg022G/pfg022T/). The example.conf at this point should look like this:
{
"ExomeGermlineSingleSample.sample_and_unmapped_bams": {
"flowcell_unmapped_bams": ["/groups/cgsd/alexandre/gatk-workflows/src/ubam/pfg022T.unmapped.bam"],
"unmapped_bam_suffix": ".unmapped.bam",
"sample_name": "pfg022T",
"base_file_name": "pfg022T.GRCh38DH.target",
"final_gvcf_base_name": "pfg022T.GRCh38DH.target"
},
After 15 minutes, pfg022T should be replaced with pfg068T, and so on until all the items in file_names.txt are exhausted.
Recommended answer
The following crontab would run your script every 15 minutes:
# Example of job definition:
# .---------------- minute (0 - 59)
# | .------------- hour (0 - 23)
# | | .---------- day of month (1 - 31)
# | | | .------- month (1 - 12) OR jan,feb,mar,apr ...
# | | | | .---- day of week (0 - 6) (Sunday=0 or 7)
# | | | | |
# * * * * * command to be executed
# */15 = every 15 minutes (a bare "15" would run only at minute 15 of each hour)
*/15 * * * * /path/to/script
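where the script reads:
#!/usr/bin/env sh
# Prefer absolute paths here: cron typically starts jobs in the user's home directory.
file1="file_names.txt"
file2="example.conf"
# Build one s/old/new/g rule per consecutive pair of names, reverse the rules with tac,
# then apply them in place (sed -i and tac assume GNU tools).
sed -i -e "$(awk '(NR>1){print "s/"p"/"$1"/g"}{p=$1}' "$file1" | tac)" "$file2"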
The trick we use here is to do the substitutions in reverse. The file example.conf always contains only one string that is also in file_names.txt. So if you attempt the substitutions from the last to the first, only a single substitution will take effect.
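The assumption that example.conf holds exactly one name from the list can be checked before relying on the trick. The following is a minimal sketch, not part of the answer's script: it works on throwaway copies in a temporary directory, and the variable names (tmp, current, count) are hypothetical.

```shell
#!/usr/bin/env sh
# Hypothetical check: list which names from the list occur in the config.
tmp=$(mktemp -d)
printf '%s\n' pfg022G pfg022T pfg068T > "$tmp/file_names.txt"
printf '%s\n' '"sample_name": "pfg022T",' \
              '"base_file_name": "pfg022T.GRCh38DH.target",' > "$tmp/example.conf"

# -F: fixed strings, -f: read patterns from a file, -o: print only the matched text
current=$(grep -o -F -f "$tmp/file_names.txt" "$tmp/example.conf" | sort -u)
# Count the distinct names found (0 if none matched)
count=$(printf '%s\n' "$current" | grep -c .)

echo "names found: $count ($current)"
rm -rf "$tmp"
```

If the count is anything other than 1, the reverse-order substitution no longer guarantees a single step per run.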
We use awk here to build a sed script and tac to reverse it so that we only have a single match:
$ awk '(NR>1){print "s/"p"/"$1"/g"}{p=$1}' file_names.txt
s/pfg022G/pfg022T/g
s/pfg022T/pfg068T/g
s/pfg068T/pfg130T/g
s/pfg130T/pfg181G/g
s/pfg181G/pfg181T/g
s/pfg181T/pfg424G/g
s/pfg424G/pfg424T/g
If we run sed with the above script in this order, we will always end up with pfg424T (the last entry). Assume we are at the third entry, pfg068T: sed finds a match, replaces it with the next name, and every later rule then matches the result of the previous one. However, when we reverse the order (using tac), sed will only find a single match.
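The difference between the two orderings can be seen on a small example. This is a sketch under stated assumptions: it uses a shortened four-name list in a temporary directory, the variable names (demo_dir, rules, forward, reverse) are made up for illustration, and tac assumes GNU coreutils.

```shell
#!/usr/bin/env sh
# Demo in a throwaway directory; tac assumes GNU coreutils.
demo_dir=$(mktemp -d)
printf '%s\n' pfg022G pfg022T pfg068T pfg130T > "$demo_dir/names.txt"

# Rules in file order: s/pfg022G/pfg022T/g, s/pfg022T/pfg068T/g, ...
rules=$(awk '(NR>1){print "s/"p"/"$1"/g"}{p=$1}' "$demo_dir/names.txt")

# Forward order: each rule matches the output of the previous one.
forward=$(echo 'sample: pfg022G' | sed -e "$rules")

# Reverse order: only the rule for the current name can match.
reverse=$(echo 'sample: pfg022G' | sed -e "$(printf '%s\n' "$rules" | tac)")

echo "forward: $forward"   # forward: sample: pfg130T
echo "reverse: $reverse"   # reverse: sample: pfg022T
rm -rf "$demo_dir"
```

In forward order the name cascades to the end of the list in a single run, while in reverse order it advances by exactly one entry, which is what the cron job needs.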