Parsing a tab-separated file in Python
Question
I'm trying to parse a tab-separated file in Python where a number placed k tabs from the beginning of a row should be placed into the k-th array.
Is there a built-in function to do this, or a better way than reading line by line and doing all the obvious processing a naive solution would perform?
Recommended answer
You can use the csv module to parse tab-separated value files easily.
import csv

# newline="" lets the csv module handle line endings itself.
with open("tab-separated-values", newline="") as tsv:
    # delimiter="\t" also works in place of dialect="excel-tab".
    for line in csv.reader(tsv, dialect="excel-tab"):
        ...
Here, line is a list of the values on the current row for each iteration.
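To make the row-wise behaviour concrete, here is a minimal sketch that parses an in-memory string instead of a file. The sample data and the helper name read_rows are made up for illustration; only csv.reader and io.StringIO come from the standard library.

```python
import csv
import io

# Hypothetical sample data standing in for a real file on disk.
sample = "1\t2\t3\n4\t5\t6\n"


def read_rows(handle):
    """Return every row of a tab-separated stream as a list of strings."""
    return [row for row in csv.reader(handle, delimiter="\t")]


rows = read_rows(io.StringIO(sample))
print(rows)  # [['1', '2', '3'], ['4', '5', '6']]
```

Note that every value comes back as a string, so numeric fields still need an explicit int() or float() conversion.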
Edit: As suggested below, if you want to read by column rather than by row, the best thing to do is use the zip() builtin:
with open("tab-separated-values", newline="") as tsv:
    # zip(*rows) transposes rows into columns; it stops at the shortest row.
    for column in zip(*csv.reader(tsv, dialect="excel-tab")):
        ...
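One caveat: zip() stops at the shortest row, so rows with different numbers of fields lose their trailing values (itertools.zip_longest pads them instead). If the file is sparse, as the question seems to describe, an explicit loop may be a better fit. The sketch below assumes that empty cells should be skipped and that every non-empty cell is a number; the helper name numbers_by_column and the sample data are hypothetical.

```python
import csv
import io


def numbers_by_column(handle):
    """Collect the number found k tabs into a row into the k-th list."""
    columns = []
    for row in csv.reader(handle, delimiter="\t"):
        for k, cell in enumerate(row):
            if cell.strip() == "":
                continue  # assumption: empty cells carry no value
            # Grow the outer list until index k exists.
            while len(columns) <= k:
                columns.append([])
            columns[k].append(float(cell))
    return columns


# Hypothetical ragged input: some cells empty, rows of different lengths.
sample = "1\t\t3\n\t5\n6\t\t7\n"
columns = numbers_by_column(io.StringIO(sample))
print(columns)  # [[1.0, 6.0], [5.0], [3.0, 7.0]]
```

This still relies on csv.reader for the splitting, so quoting and line endings are handled the same way as in the examples above.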