IndexError: list index out of range, not sure why

What I would like the program to do is take the sequences related to a certain barcode and run the defined function on them (average length and standard deviation of the sequences, minus the barcode and non-relevant text, identified by the same barcode). I have written something similar and based it on that similar program, but I keep getting an IndexError. The idea is that all the sequences with the first barcode will be processed as barcodeCounter = 0, the second as barcodeCounter = 1, and so on. Hopefully that is enough info; sorry if it is messy.

Input:

import sys
import math

def avsterr(x):
        ave = sum(x)/len(x)
        ssq = 0.0
        for y in x:
                ssq += (y-ave)*(y-ave)
        var = ssq / (len(x)-1)
        sdev = math.sqrt(var)
        stderr = sdev / math.sqrt(len(x))

        return (ave,stderr)

barcode = sys.argv[1]
sequence = sys.argv[2]
lengths = []
toprocess = []
b = open(barcode,"r")
barcodeCounter = 0
for barcode in b:
        barcodeCounter = barcodeCounter + 1
        barcode = barcode.strip()
        print "barcode: %s" %  barcode
        handle = open(sequence, "r")
        for line in handle:
                print line
                seq = line.split(' ',1)[-1].strip()
                print "seq: %s" % seq
                potential_barcode = seq[0:len(barcode)]
                print "something"
                if potential_barcode == barcode:
                        print "Checking sequences"
                        outseq = seq.replace(potential_barcode, "", 1)
                        outseq_length = [len(outseq)]
#                       toprocess.append("")
#                       toprocess[barcodeCounter] += outseq.strip
                        toprocess[barcodeCounter].extend(outseq.strip)   #IndexError/line40
#                       toprocess[barcodeCounter] = toprocess[barcodeCounter] + outseq.strip
                        print "outseq: %s" % outseq
                        print "Barcodes to be processed: %s" % toprocess[barcodeCounter]
                        print "BC: %i" % barcodeCounter
        handle.close()
b.close()
one = len(toprocess[0])
#two = lengths[2]
#three = lengths[3]
print one
#(av,st) = avsterr(lengths)
#print "%f +/- %f" % (av,st)

Output:

barcode: ATTAG
S01 ATTAGAAAAAAA

seq: ATTAGAAAAAAA
something
Checking sequences
Traceback (most recent call last):
  File "./FinalProject.py", line 40, in <module>
    toprocess[barcodeCounter].extend(outseq.strip)
IndexError: list index out of range

This is the code I'm basing it on.

sequenceCounter = -1
for line in handle:
        if line[0] == ">":
                sequenceCounter = sequenceCounter + 1
#               print "seqid %s\n" % line
                seqidList.append(line)
                seqList.append("")
        if line[0] != ">":
                seqList[sequenceCounter] = seqList[sequenceCounter] + line.strip()

EDIT: Added the enumerate function and commented out barcodeCounter stuff.

barcode = sys.argv[1]
sequence = sys.argv[2]
lengths = []
toprocess = []
b = open(barcode,"r")
#barcodeCounter = -1
for barcodeCounter, barcode in enumerate(b):
#       barcodeCounter = barcodeCounter + 1
        barcode = barcode.strip()
        print "barcode: %s" %  barcode
        handle = open(sequence, "r")
        for line in handle:
                print line
                seq = line.split(' ',1)[-1].strip()
                print "seq: %s" % seq
                potential_barcode = seq[0:len(barcode)]
                print "something"
                if potential_barcode == barcode:
                        print "Checking sequences"
                        outseq = seq.replace(potential_barcode, "", 1)
                        outseq_length = [len(outseq)]
                        toprocess.append("")
#                       toprocess[barcodeCounter] += outseq.strip
                        toprocess[barcodeCounter].append(outseq.strip) #AttributeError line 40
#                       toprocess[barcodeCounter] = toprocess[barcodeCounter] + outseq.strip
                        print "outseq: %s" % outseq
                        print "Barcodes to be processed: %s" % toprocess[barcodeCounter]
                        print "BC: %i" % barcodeCounter
        handle.close()
b.close()

New error:

barcode: ATTAG
S01 ATTAGAAAAAAA

seq: ATTAGAAAAAAA
something
Checking sequences
Traceback (most recent call last):
  File "./FinalProject.py", line 40, in <module>
    toprocess[barcodeCounter].append(outseq.strip)
AttributeError: 'str' object has no attribute 'append'

Code without the issue:

barcode = sys.argv[1]
sequence = sys.argv[2]
lengths = []
toprocess = []
b = open(barcode,"r")
#barcodeCounter = -1
for barcodeCounter, barcode in enumerate(b):
#       barcodeCounter = barcodeCounter + 1
        barcode = barcode.strip()
        print "barcode: \n%s\n" %  barcode
        handle = open(sequence, "r")
        for line in handle:
                print line
                seq = line.split(' ',1)[-1].strip()
                print "seq: %s" % seq
                potential_barcode = seq[0:len(barcode)]
#               print "something"
                if potential_barcode == barcode:
                        print "Checking sequences"
                        outseq = seq.replace(potential_barcode, "", 1)
                        outseq_length = [len(outseq)]
                        toprocess.append("")
                        toprocess[barcodeCounter] = toprocess[barcodeCounter] + outseq

@abarnert You were helpful, thank you. I'm not the brightest when it comes to programming sometimes (most of the time). I also had to change the way I added the new sequences, because they are str, not list.

Solution

You actually have two problems here.


First, you're counting from 1 instead of 0. You start barcodeCounter at 0, then you increment it before using it. This means that if you have, say, 3 barcodes, you're trying to set toprocess[1], then toprocess[2], then toprocess[3], and the last one is going to be an IndexError.

Notice that the code you based it on starts with sequenceCounter = -1 rather than 0 to avoid this problem.

However, there's an even simpler solution to the problem: use enumerate to do the counting for you:

for barcodeCounter, barcode in enumerate(b):

No need to remember whether to start at -1, 0, or 1, or where to do the incrementing, or any of that; it just automatically gets the numbers 0, 1, 2, etc., up to one less than the number of lines in b.
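
As a small illustration (the barcode strings here are made up, and a plain list stands in for the open file b), enumerate pairs each item with a zero-based index:

```python
# Hypothetical barcode lines; in the question these come from iterating the file b.
barcode_lines = ["ATTAG\n", "CCGTA\n", "GGATC\n"]

indexes = []
for barcodeCounter, barcode in enumerate(barcode_lines):
    barcode = barcode.strip()
    indexes.append(barcodeCounter)  # collected only to show which numbers enumerate produced
    print("%d %s" % (barcodeCounter, barcode))
```

The first barcode gets index 0, so nothing has to be initialized or incremented by hand.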


Second, even if you counted correctly, toprocess is not the same size as b. In fact, it's completely empty, so toprocess[anything] is always going to raise an exception.

To append a new value to the end of a list, you call the append method:

toprocess.append(…)
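
A minimal sketch of the difference, using a made-up sequence value: assigning to an index of an empty list raises IndexError even at index 0, while append grows the list so that the index exists afterwards.

```python
toprocess = []

try:
    toprocess[0] = "AAAAAAA"   # the list is empty, so even index 0 is out of range
    raised = False
except IndexError:
    raised = True

toprocess.append("")           # now index 0 exists
toprocess[0] = toprocess[0] + "AAAAAAA"
print("%s %s" % (raised, toprocess))
```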

Again, notice that the code you're basing it on always does a seqList.append("") before doing a seqList[sequenceCounter] =. (Notice that it's a bit tricky—sometimes it appends and increments sequenceCounter, sometimes it does neither, and assigns to seqList[sequenceCounter] using the previous value of sequenceCounter.) You have to do the equivalent.
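
Putting both fixes together, one possible arrangement (a sketch, not the asker's final code) appends one empty list per barcode as soon as the barcode is read, so toprocess[barcodeCounter] exists even when no sequence matches. It uses in-memory lists instead of files so that it runs standalone; the sample data and the name group_by_barcode are invented for illustration. It also avoids passing outseq.strip without parentheses, which would refer to the method itself rather than the stripped string.

```python
def group_by_barcode(barcode_lines, sequence_lines):
    """Return one list of barcode-stripped sequences per barcode."""
    toprocess = []
    for barcodeCounter, barcode in enumerate(barcode_lines):
        barcode = barcode.strip()
        toprocess.append([])                 # one slot per barcode, created up front
        for line in sequence_lines:
            seq = line.split(' ', 1)[-1].strip()
            if seq.startswith(barcode):
                outseq = seq[len(barcode):]  # drop the leading barcode
                toprocess[barcodeCounter].append(outseq)
    return toprocess


# Made-up sample input in the "ID SEQUENCE" layout shown in the question's output.
barcodes = ["ATTAG\n", "CCGTA\n"]
sequences = ["S01 ATTAGAAAAAAA\n", "S02 CCGTATTT\n", "S03 ATTAGCC\n"]

groups = group_by_barcode(barcodes, sequences)
lengths = [[len(s) for s in group] for group in groups]
print(groups)
print(lengths)
```

Each inner list in lengths could then be passed to avsterr, provided it has at least two entries, since that function divides by len(x)-1.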
