Parsing a tab-separated file in Python

Question

I'm trying to parse a tab-separated file in Python where a number placed k tabs from the beginning of a row should go into the k-th array.

Is there a built-in function to do this, or a better way than reading the file line by line and doing all the obvious processing a naive solution would perform?

Recommended answer

You can use the csv module to parse tab-separated value files easily.

import csv

# newline="" lets the csv module handle line endings itself, as the csv docs recommend.
with open("tab-separated-values", newline="") as tsv:
    # delimiter="\t" also works in place of the dialect.
    for line in csv.reader(tsv, dialect="excel-tab"):
        ...

Here, line is a list of the values on the current row for each iteration.
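
Applied to the question, each field's index in that list is its tab offset k, so the values can be appended to the k-th list directly. The following is a minimal sketch under some assumptions that are not in the original answer: the helper name columns_by_tab_position is hypothetical, empty fields are skipped, every non-empty field is assumed to be numeric, and an in-memory io.StringIO stands in for a real file.

```python
import csv
import io
from collections import defaultdict


def columns_by_tab_position(tsv_file):
    """Group each non-empty field under its column index (its tab offset)."""
    columns = defaultdict(list)
    for row in csv.reader(tsv_file, delimiter="\t"):
        for k, value in enumerate(row):
            if value != "":  # assumption: an empty field means "no number here"
                columns[k].append(float(value))  # assumption: fields are numeric
    return dict(columns)


# Hypothetical sample data; a real file would be opened with open(path, newline="").
sample = io.StringIO("1\t\t3\n\t2\n4\t5\t6\n")
result = columns_by_tab_position(sample)
print(result)
```

A dictionary keyed by column index is used so that rows of different lengths need no special handling; a list of lists would work equally well if the number of columns is known in advance.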

Edit: if you want to read by column rather than by row, use the zip() builtin to transpose the rows:

with open("tab-separated-values", newline="") as tsv:
    # zip(*rows) transposes the rows into columns.
    for column in zip(*csv.reader(tsv, dialect="excel-tab")):
        ...
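
One caveat with zip() is that it stops at the shortest row, so trailing fields on longer rows are silently dropped. If the rows may have different lengths, itertools.zip_longest is one possible alternative. The sketch below uses made-up sample data and an empty-string fill value, both of which are assumptions for illustration only.

```python
import csv
import io
from itertools import zip_longest

# Hypothetical sample: the second row has one field fewer than the first.
sample = io.StringIO("a\tb\tc\nd\te\n")
rows = list(csv.reader(sample, delimiter="\t"))

# zip() truncates every column to the length of the shortest row.
truncated = list(zip(*rows))

# zip_longest() pads the missing fields with fillvalue instead.
padded = list(zip_longest(*rows, fillvalue=""))

print(truncated)
print(padded)
```

Both approaches read the whole file into memory before transposing, which is fine for small files but may matter for very large ones.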
