Create a crontab command that uses a stream editor to rewrite a file every 15 minutes


Problem description

Let's say I have a file of patterns to match against another file:

file_names.txt

pfg022G
pfg022T
pfg068T
pfg130T
pfg181G
pfg181T
pfg424G
pfg424T

I would like to take the names in file_names.txt and apply them with sed to example.conf:

example.conf

{
  "ExomeGermlineSingleSample.sample_and_unmapped_bams": {
    "flowcell_unmapped_bams": ["/groups/cgsd/alexandre/gatk-workflows/src/ubam/pfg022G.unmapped.bam"],
    "unmapped_bam_suffix": ".unmapped.bam",
    "sample_name": "pfg022G",
    "base_file_name": "pfg022G.GRCh38DH.target",
    "final_gvcf_base_name": "pfg022G.GRCh38DH.target"
  },

The sed command would replace pfg022G in example.conf with pfg022T, the next item in file_names.txt (sed s/pfg022G/pfg022T/). At that point example.conf should look like this:

{
  "ExomeGermlineSingleSample.sample_and_unmapped_bams": {
    "flowcell_unmapped_bams": ["/groups/cgsd/alexandre/gatk-workflows/src/ubam/pfg022T.unmapped.bam"],
    "unmapped_bam_suffix": ".unmapped.bam",
    "sample_name": "pfg022T",
    "base_file_name": "pfg022T.GRCh38DH.target",
    "final_gvcf_base_name": "pfg022T.GRCh38DH.target"
  },

After 15 minutes the substitution should be pfg022T to pfg068T, and so on until all the items in file_names.txt are exhausted.

Recommended answer

The following crontab entry runs your script every 15 minutes:

# Example of job definition:
# .---------------- minute (0 - 59)
# |  .------------- hour (0 - 23)
# |  |  .---------- day of month (1 - 31)
# |  |  |  .------- month (1 - 12) OR jan,feb,mar,apr ...
# |  |  |  |  .---- day of week (0 - 6) (Sunday=0 or 7)
# |  |  |  |  |
# *  *  *  *  *   command to be executed
 */15 *  *  *  *   /path/to/script

where the script reads:

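#!/usr/bin/env sh
# cron typically runs jobs from the user's home directory, so prefer absolute paths here.
file1="file_names.txt"   # ordered list of names
file2="example.conf"     # file that is rewritten in place
# Build one s/old/new/g command per consecutive pair of names, reverse the list, and apply it.
# sed -i as written and tac assume GNU tools.
sed -i -e "$(awk '(NR>1){print "s/"p"/"$1"/g"}{p=$1}' "$file1" | tac)" "$file2"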

The trick we use here is reverse substitution. The file example.conf always contains exactly one of the strings listed in file_names.txt. So if you apply the substitutions from the last pair to the first, only a single substitution takes effect.
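The approach depends on example.conf holding exactly one name from file_names.txt at any time. A minimal sketch of how that assumption could be checked before the cron job is enabled is shown here. The temporary directory and the shortened sample files are hypothetical stand-ins created only for this check, not part of the original answer.

```shell
#!/usr/bin/env sh
# Hypothetical scratch copies of the two files, created only for this check.
workdir=$(mktemp -d)
printf '%s\n' pfg022G pfg022T pfg068T > "$workdir/file_names.txt"
cat > "$workdir/example.conf" <<'EOF'
{
  "sample_name": "pfg022T",
  "base_file_name": "pfg022T.GRCh38DH.target"
}
EOF

# -F: treat patterns as fixed strings, -f: read patterns from a file,
# -o: print only the matched text. sort -u collapses repeated matches.
current=$(grep -o -F -f "$workdir/file_names.txt" "$workdir/example.conf" | sort -u)
# Count the distinct names found (non-empty lines only).
count=$(printf '%s\n' "$current" | grep -c .)

echo "names present: $current (count: $count)"
rm -rf "$workdir"
```

If the count is anything other than 1, the reversed script may not advance the file by exactly one name per run, so it is worth resolving that before relying on the schedule.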

We use awk to build a sed script and tac to reverse it, so that only a single substitution matches. Before reversal, the generated script looks like this:

$ awk '(NR>1){print "s/"p"/"$1"/g"}{p=$1}' file_names.txt
s/pfg022G/pfg022T/g
s/pfg022T/pfg068T/g
s/pfg068T/pfg130T/g
s/pfg130T/pfg181G/g
s/pfg181G/pfg181T/g
s/pfg181T/pfg424G/g
s/pfg424G/pfg424T/g

If we ran sed with the script above in its original order, we would always end up with pfg424T (the last entry). As soon as one substitution matches (say we are at the third entry, pfg068T), its result is the pattern of the next command, so sed performs every substitution after it as well. When we reverse the order with tac, each command's result can only match a command that has already been tried, so sed makes a single substitution per run.
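The difference between the two orderings can be seen on a single line. The following is a small sketch, not part of the original answer: it uses a hypothetical temporary directory and a shortened list of four names, and it reverses the script with awk instead of tac so that it does not depend on GNU coreutils.

```shell
#!/usr/bin/env sh
# Hypothetical scratch file, created only for this demonstration.
workdir=$(mktemp -d)
printf '%s\n' pfg022G pfg022T pfg068T pfg130T > "$workdir/file_names.txt"

# Build the chain of substitutions the same way the answer's awk command does.
forward=$(awk '(NR>1){print "s/"p"/"$1"/g"}{p=$1}' "$workdir/file_names.txt")
# Reverse the command order (awk stand-in for tac).
reversed=$(printf '%s\n' "$forward" | awk '{line[NR]=$0} END{for(i=NR;i>0;i--) print line[i]}')

# Apply both scripts to a line that currently holds the second entry.
input='"sample_name": "pfg022T",'
forward_result=$(printf '%s\n' "$input" | sed -e "$forward")
reversed_result=$(printf '%s\n' "$input" | sed -e "$reversed")

echo "forward:  $forward_result"   # cascades to the last name, pfg130T
echo "reversed: $reversed_result"  # advances one step, to pfg068T
rm -rf "$workdir"
```

In forward order the line cascades through every remaining name in one run, while in reversed order it moves exactly one position down the list, which is the behavior the cron job needs.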
