How to extract specific columns from a space-separated file in Python?

Views: 413 | Published: 2020/11/2 21:46:25 | Tags: python, extract, pdb

Problem description

I'm trying to process a file from the Protein Data Bank whose fields are separated by spaces (not \t). I have a .txt file from which I want to extract specific rows and, from those rows, only a few columns.

I need to do this in Python. I first tried on the command line with awk and had no problem, but I have no idea how to do the same in Python.

Here is an excerpt of my file:


[...]
SEQRES   6 B   80  ALA LEU SER ILE LYS LYS ALA GLN THR PRO GLN GLN TRP          
SEQRES   7 B   80  LYS PRO                                                      
HELIX    1   1 THR A   68  SER A   81  1                                  14    
HELIX    2   2 CYS A   97  LEU A  110  1                                  14    
HELIX    3   3 ASN A  122  SER A  133  1                                  12    
[...]

For example, I'd like to take only the 'HELIX' rows and then the 4th, 6th, 7th and 9th columns. I started by reading the file line by line with a for loop and extracting the rows that start with 'HELIX'... and that's all.

This is the code I have right now, but the print doesn't work properly: it only prints the first line of each block (HELIX, SHEET and DBREF).

#!/usr/bin/python
import sys

for line in open(sys.argv[1]):
 if 'HELIX' in line:
   helix = line.split()
 elif 'SHEET'in line:
   sheet = line.split()
 elif 'DBREF' in line:
   dbref = line.split()

print (helix), (sheet), (dbref)

Recommended answer

If you have already extracted the line, you can split it with line.split(). This gives you a list from which you can pick the elements you need by index (indices start at 0, so the 4th column is index 3):

>>> test='HELIX 2 2 CYS A 97'
>>> test.split()
['HELIX', '2', '2', 'CYS', 'A', '97']
>>> test.split()[3]
'CYS'
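
Putting the pieces together, the following is a minimal sketch of how the split-and-index idea could be applied to every HELIX row rather than a single string. The function name extract_columns and its parameters are hypothetical, chosen only for illustration; the sample lines are taken from the excerpt in the question. It assumes the columns are counted awk-style (starting at 1) and that every matching line has at least as many whitespace-separated fields as the highest column requested.

```python
def extract_columns(lines, record="HELIX", columns=(4, 6, 7, 9)):
    """Return the chosen 1-based columns of every line starting with `record`.

    `extract_columns` is a hypothetical helper, not part of any library.
    """
    rows = []
    for line in lines:
        # Keep only the records of interest, e.g. lines beginning with HELIX.
        if not line.startswith(record):
            continue
        fields = line.split()  # split on any run of whitespace
        # Convert awk-style 1-based column numbers to 0-based list indices.
        rows.append([fields[c - 1] for c in columns])
    return rows


# Sample lines copied from the file excerpt in the question.
sample = [
    "SEQRES   7 B   80  LYS PRO",
    "HELIX    1   1 THR A   68  SER A   81  1                                  14",
    "HELIX    2   2 CYS A   97  LEU A  110  1                                  14",
]

helix_rows = extract_columns(sample)
for row in helix_rows:
    print(" ".join(row))
```

To run this on a real file, an open file object can be passed in place of the list, for example extract_columns(open(sys.argv[1])). Appending each match to a list keeps every HELIX row, whereas the loop in the question assigns to the same variable each time, so only one row per record type survives to the print. Note that PDB records are defined by fixed column positions, so whitespace splitting may misbehave on lines where two fields touch or an optional field is blank; for lines like those in the excerpt it works.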
