Snakemake InputFunctionException. AttributeError: 'Wildcards' object has no attribute


Problem description

I have a list object with ChIP-seq single-end fastq file names, allfiles=['/path/file1.fastq','/path/file2.fastq','/path/file3.fastq']. I'm trying to use that object, allfiles, as a wildcard: I want it to be the input of the fastqc rule (and of others such as mapping, but let's keep it simple). I tried what is shown in the code below (lambda wildcards: data.loc[(wildcards.sample),'read1']). This, however, gives me the error:

"InputFunctionException in line 118 of Snakefile:
AttributeError: 'Wildcards' object has no attribute 'sample'
Wildcards:
" 

Does someone know exactly how to define it? It seems I am close and I get the general idea, but I am failing to get the syntax right and run it. Thank you!

Code:

import pandas as pd
import numpy as np

# Read in config file parameters
configfile: 'config.yaml'
sampleFile = config['samples'] # three columns: sample ID , /path/to/chipseq_file_SE.fastq , /path/to/chipseq_input.fastq
outputDir = config['outputdir'] # output directory

outDir = outputDir + "/MyExperiment"
qcDir = outDir + "/QC"

# Read in the samples table
data = pd.read_csv(sampleFile, header=0, names=['sample', 'read1', 'inputs']).set_index('sample', drop=False)
samples = data['sample'].unique().tolist() # sample IDs
read1 = data['read1'].unique().tolist() # ChIP-treatment file single-end file
inplist= data['inputs'].unique().tolist() # the ChIP-input files
inplistUni= data['inputs'].unique().tolist() # the ChIP-input files (unique)
allfiles = read1 + inplistUni

# Target rule
rule all:
    input:
        expand(f'{qcDir}' + '/raw/{sample}_fastqc.html', sample=samples),
        expand(f'{qcDir}' + '/raw/{sample}_fastqc.zip', sample=samples),

# fastqc report generation
rule fastqc:
    input: lambda wildcards: data.loc[(wildcards.sample), 'read1']
    output:
        html=expand(f'{qcDir}' + '/raw/{sample}_fastqc.html',sample=samples) ,
        zip=expand(f'{qcDir}' + '/raw/{sample}_fastqc.zip',sample=samples)
    log: expand(f'{logDir}' + '/qc/{sample}_fastqc_raw.log',sample=samples)
    threads: 4
    wrapper: "fastqc {input} 2>> {log}"

Recommended answer

Currently, the output files of rule fastqc don't contain any wildcards once they are resolved: expand() substitutes every sample ID into {sample} when the Snakefile is parsed, which is why the Wildcards object passed to the input function has no attribute 'sample'. That is, there is currently a single job in the Snakefile, in which rule fastqc tries to produce the output files for all samples at once.
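As a rough illustration of that point, the sketch below imitates expand() with plain str.format. The helper expand_sketch and the sample IDs are made up for this example, and this is not Snakemake's actual implementation; it only shows that an expanded pattern is a list of concrete paths with no {sample} placeholder left to match against.

```python
# Simplified stand-in for Snakemake's expand(); NOT the real implementation.
def expand_sketch(pattern, **values):
    """Fill a single placeholder with every given value, returning concrete paths."""
    name, options = next(iter(values.items()))
    return [pattern.format(**{name: option}) for option in options]


samples = ["s1", "s2"]  # hypothetical sample IDs

# What the original rule declared as output: {sample} is already substituted.
expanded = expand_sketch("QC/raw/{sample}_fastqc.html", sample=samples)

# What a per-sample rule needs: the placeholder is still present in the pattern.
pattern = "QC/raw/{sample}_fastqc.html"

print(expanded)
print("{sample}" in pattern)
print(any("{sample}" in path for path in expanded))
```

With nothing left to match, there is no 'sample' value to hand to the input function.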

However, it appears you would like to run rule fastqc separately for each sample. In that case, the rule needs to be generalized as below, where {sample} is a wildcard:

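# Note: logDir is not defined in the question's code; it must be set earlier in the Snakefile.
rule fastqc:
    # the input function receives the wildcards matched from the output pattern
    input: lambda wildcards: data.loc[wildcards.sample, 'read1']
    output:
        html=qcDir + '/raw/{sample}_fastqc.html',
        zip=qcDir + '/raw/{sample}_fastqc.zip'
    log: logDir + '/qc/{sample}_fastqc_raw.log'
    threads: 4
    shell: "fastqc {input} 2>> {log}"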
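To see why keeping {sample} in the output makes wildcards.sample available, here is a minimal pure-Python sketch of the matching step. It only approximates what Snakemake does internally: match_wildcards, fastqc_input and the read1_by_sample table are hypothetical names invented for this example, and a dict stands in for the pandas lookup.

```python
import re
from types import SimpleNamespace

# Hypothetical sample table; the real workflow reads this from a CSV file.
read1_by_sample = {"s1": "/path/file1.fastq", "s2": "/path/file2.fastq"}


def match_wildcards(pattern, target):
    """Rough imitation of how an output pattern yields wildcard values."""
    parts = re.split(r"\{(\w+)\}", pattern)
    # Even indices hold literal text, odd indices hold wildcard names.
    regex = "".join(
        re.escape(part) if i % 2 == 0 else f"(?P<{part}>.+)"
        for i, part in enumerate(parts)
    )
    match = re.fullmatch(regex, target)
    return SimpleNamespace(**match.groupdict()) if match else None


def fastqc_input(wildcards):
    """Input function: look up the fastq file for the matched sample."""
    return read1_by_sample[wildcards.sample]


# Pattern with a wildcard: the requested file determines the sample.
wc = match_wildcards("QC/raw/{sample}_fastqc.html", "QC/raw/s1_fastqc.html")
print(wc.sample)
print(fastqc_input(wc))

# Pattern without a wildcard (as after expand): nothing to extract.
empty = match_wildcards("QC/raw/s1_fastqc.html", "QC/raw/s1_fastqc.html")
print(hasattr(empty, "sample"))
```

The second case mirrors the original error: the match succeeds, but the resulting object carries no 'sample' attribute.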
