STAR index generation fails at step 4
Problem description
I am trying to index a reference genome in a Snakemake pipeline, and I wrote this rule:
rule reference_faidx_star:
    input:
        "../resources/reference/Qrob_PM1N.fa"
    output:
        "../resources/reference/ref/"
    threads: 1
    log:
        "../results/logs/star/star_index.log"
    params:
        gtf="../resources/reference/gff_gtf/Qrob_PM1N_genes_20161004.gtf"
    # resources:
    #     mem_mb=25000
    message:
        """
        INDEX STAR
        """
    shell:
        "STAR --runMode genomeGenerate --runThreadN {threads} --genomeDir {output} --genomeFastaFiles {input} --sjdbGTFfile {params.gtf} --sjdbOverhang 149 --genomeSAindexNbases 12 " # Logging
At first everything works, but the run breaks at step 4. Only four files are created in my output folder (chrLength.txt, chrNameLength.txt, chrName.txt, chrStart.txt), and the terminal displays this:
[Tue Apr 20 09:09:30 2021]
Job 4:
INDEX STAR
Apr 20 09:09:30 ..... started STAR run
Apr 20 09:09:30 ... starting to generate Genome files
Apr 20 09:09:50 ... starting to sort Suffix Array. This may take a long time...
Apr 20 09:09:55 ... sorting Suffix Array chunks and saving them to disk...
/usr/bin/bash : ligne 1 : 7343 Processus arrêté STAR --runMode genomeGenerate --runThreadN 1 --genomeDir ../resources/reference/ref/ --genomeFastaFiles ../resources/reference/Qrob_PM1N.fa --sjdbGTFfile ../resources/reference/gff_gtf/Qrob_PM1N_genes_20161004.gtf --sjdbOverhang 149 --genomeSAindexNbases 12 -limitGenomeGenerateRAM 25000000000
[Tue Apr 20 09:10:01 2021]
Error in rule reference_faidx_star:
jobid: 4
output: ../resources/reference/ref/
log: ../results/logs/star/star_index.log (check log file(s) for error message)
shell:
STAR --runMode genomeGenerate --runThreadN 1 --genomeDir ../resources/reference/ref/ --genomeFastaFiles ../resources/reference/Qrob_PM1N.fa --sjdbGTFfile ../resources/reference/gff_gtf/Qrob_PM1N_genes_20161004.gtf --sjdbOverhang 149 --genomeSAindexNbases 12
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
I don't understand what is wrong with this rule. Where am I not writing valid bash?
I hope you can help me. Thank you, and have a nice day!
Recommended answer
This problem has nothing to do with either Snakemake or Python. The log shows the exact command that bash executed when Snakemake ran the pipeline:
STAR --runMode genomeGenerate --runThreadN 1 --genomeDir ../resources/reference/ref/ --genomeFastaFiles ../resources/reference/Qrob_PM1N.fa --sjdbGTFfile ../resources/reference/gff_gtf/Qrob_PM1N_genes_20161004.gtf --sjdbOverhang 149 --genomeSAindexNbases 12 -limitGenomeGenerateRAM 25000000000
Something went wrong during execution: possibly insufficient memory, a disk problem, or something similar. Try running this command directly in bash and check its return code; that may tell you more about what happened.
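As a rough illustration of checking the return code, here is a minimal Python sketch. The helper names describe_exit and run_and_report are hypothetical and not part of Snakemake or STAR. It assumes a POSIX system, where Python's subprocess reports a process killed by signal N as return code -N and a shell reports it as 128 + N. A code above 128 is only a hint, since a program may also choose such a value on its own.

```python
import signal
import subprocess
from typing import Sequence


def describe_exit(returncode: int) -> str:
    """Turn a process return code into a short description (hypothetical helper)."""
    if returncode == 0:
        return "success"
    if returncode < 0:
        # subprocess convention on POSIX: -N means "killed by signal N"
        signum = -returncode
    elif returncode > 128:
        # shell convention: 128 + N usually means "killed by signal N"
        signum = returncode - 128
    else:
        return f"failed with exit code {returncode}"
    try:
        name = signal.Signals(signum).name
    except ValueError:
        # Not a known signal number on this platform
        name = f"signal {signum}"
    return f"terminated by {name}"


def run_and_report(cmd: Sequence[str]) -> str:
    """Run a command and describe how it ended (hypothetical helper)."""
    result = subprocess.run(list(cmd))
    return describe_exit(result.returncode)
```

For example, a result of "terminated by SIGKILL" (return code 137 from a shell, or -9 from subprocess) would be consistent with the kernel's out-of-memory killer stopping the process, although other causes of SIGKILL are possible.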
One useful Snakemake trick is the --printshellcmds flag, which makes Snakemake print every shell command it runs. You can then rerun these commands by hand, keeping all temporary files, and locate the problem.
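A small sketch of how the flag might be used from a script: the function build_snakemake_cmd is a hypothetical helper that only assembles the command line, and the target name is taken from the rule in the question. Combining --printshellcmds with --dry-run is an assumption about what is wanted here: it asks Snakemake to print the commands without executing them.

```python
from typing import List


def build_snakemake_cmd(target: str, dry_run: bool = True) -> List[str]:
    """Assemble a Snakemake command line that prints shell commands (hypothetical helper)."""
    cmd = ["snakemake", "--cores", "1", "--printshellcmds"]
    if dry_run:
        # Show what would be run without running it
        cmd.append("--dry-run")
    cmd.append(target)
    return cmd


if __name__ == "__main__":
    # Print the command so it can be copied into a terminal
    print(" ".join(build_snakemake_cmd("reference_faidx_star")))
```

The printed STAR command can then be pasted into a terminal and run outside Snakemake, which makes it easier to see STAR's own error output.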