Biopython: parse from a variable instead of a file


Question

import gzip
import io
from Bio import SeqIO

infile = "myinfile.fastq.gz"
fileout = open("myoutfile.fastq", "w+")
with io.TextIOWrapper(gzip.open(infile, "r")) as f:
    line = f.read()
fileout.write(line)
fileout.seek(0)

count = 0
for rec in SeqIO.parse(fileout, "fastq"): #parsing from file
    count += 1
print("%i reads" % count)

The code above works when "line" is written to a file and that file is fed to the parser, but the version below does not. Why can't "line" be read directly? Is there a way to feed "line" straight to the parser without writing it to a file first?

infile = "myinfile.fastq.gz"
#fileout = "myoutfile.fastq"
with io.TextIOWrapper(gzip.open(infile, "r")) as f:
    line = f.read()
#myout.write(line)

count = 0
for rec in SeqIO.parse(line, "fastq"): #line used instead of writing from file
    count += 1
print("%i reads" % count)

Recommended answer

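This is because SeqIO.parse accepts only a file handle or a filename as its first argument. A plain string such as "line" is treated as a filename, so the parser tries to open a file with that name instead of parsing the text itself.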

If you want to read a gzipped file directly into SeqIO.parse, just pass it an open handle:

import gzip
from Bio import SeqIO

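count = 0
# Open in text mode ("rt"): gzip.open defaults to binary mode, and the
# FASTQ parser expects a text-mode handle.
with gzip.open("myinfile.fastq.gz", "rt") as f:
    for rec in SeqIO.parse(f, "fastq"):
        count += 1

print("{} reads".format(count))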
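If the FASTQ text is already in a string variable, as "line" is in the question, one option is to wrap it in io.StringIO, which gives the parser a text-mode, file-like object without writing anything to disk. The following is a minimal sketch rather than the only approach. The variable name fastq_text and the two reads in it are made up for illustration and stand in for the contents of "line".

```python
import io

from Bio import SeqIO

# Hypothetical in-memory FASTQ text standing in for the `line` variable
# from the question (two made-up reads, four bases each).
fastq_text = (
    "@read1\n"
    "ACGT\n"
    "+\n"
    "IIII\n"
    "@read2\n"
    "TTGA\n"
    "+\n"
    "IIII\n"
)

# io.StringIO wraps the string in a text-mode, file-like object,
# which SeqIO.parse accepts in place of an open file.
handle = io.StringIO(fastq_text)

# Count the records by iterating over the parser.
count = sum(1 for _ in SeqIO.parse(handle, "fastq"))
print("{} reads".format(count))
```

For large gzipped files, passing the gzip handle straight to SeqIO.parse is usually preferable, because it avoids holding the whole decompressed file in memory as a single string.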
