Extract data from a TSV file in Python


Problem description

I have a TSV file that looks like this:

A   B   C   D   D=1;E=2
S   D   F   G   H=2;B=4

I'd like to write the contents to another TSV file, with one output row per key=value pair in column 5:

A   B   C   D   D   1
A   B   C   D   E   2
S   D   F   G   H   2
S   D   F   G   B   4

I'd really appreciate any help or hints on splitting column 5 as desired.

Recommended answer

If you are absolutely sure the file contains only tabs and semicolons as separators, then you can use split.

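with open('/tmp/test.tsv') as infile, open('/tmp/test2.tsv', 'w') as outfile:
    for line in infile:
        # Split the line into its tab-separated columns.
        tsplit = line.split("\t")
        firstcolumns = tsplit[:-1]
        # Drop the trailing newline, then split the last column on semicolons.
        lastitems = tsplit[-1].strip().split(";")
        for item in lastitems:
            # "D=1" becomes ["D", "1"], appended to the leading columns.
            allcolumns = firstcolumns + item.split("=")
            outfile.write("\t".join(allcolumns) + "\n")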

(Updated to make it easier to compare with the other answer.)

This will work regardless of the number of semicolon-separated items in the last column. However, it is sensitive to small changes in the format: for example, a space added around a semicolon or an equals sign ends up in the output fields.
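If the format might vary slightly, a somewhat more tolerant variant can be sketched with the standard csv module. This is only an illustration, not part of the original answer: the function names split_last_column and convert are hypothetical, and it assumes that stray spaces around ";" and "=" and a trailing semicolon should simply be ignored.

```python
import csv
import io


def split_last_column(rows):
    """Yield one output row per key=value pair found in the last column."""
    for row in rows:
        if not row:
            continue  # csv.reader yields [] for blank lines; skip them
        *first_columns, last = row
        for item in last.split(";"):
            item = item.strip()
            if not item:
                continue  # tolerate a trailing or doubled semicolon
            # partition splits on the first "=" only
            key, _, value = item.partition("=")
            yield first_columns + [key.strip(), value.strip()]


def convert(infile, outfile):
    """Read tab-separated rows from infile and write the expanded rows to outfile."""
    reader = csv.reader(infile, delimiter="\t")
    writer = csv.writer(outfile, delimiter="\t", lineterminator="\n")
    writer.writerows(split_last_column(reader))


if __name__ == "__main__":
    # Small in-memory demo; note the extra spaces and the trailing semicolon.
    source = io.StringIO("A\tB\tC\tD\tD=1; E = 2;\nS\tD\tF\tG\tH=2;B=4\n")
    target = io.StringIO()
    convert(source, target)
    print(target.getvalue(), end="")
```

To use it on real files, open both with newline="" as the csv documentation recommends, for example convert(open('/tmp/test.tsv', newline=''), open('/tmp/test2.tsv', 'w', newline='')), ideally inside a with statement as in the answer above.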
