Sort a file by first (or second, or else) column in python


Question

This seems a very basic question, but I am new to python, and after spending a long time trying to find a solution on my own, I thought it's time to ask some more advanced people!

So, I have a file (sample):

ENSMUSG00000098737  95734911    95734973    3   miRNA
ENSMUSG00000077677  101186764   101186867   4   snRNA
ENSMUSG00000092727  68990574    68990678    11  miRNA
ENSMUSG00000088009  83405631    83405764    14  snoRNA
ENSMUSG00000028255  145003817   145032776   3   protein_coding
ENSMUSG00000028255  145003817   145032776   3   processed_transcript
ENSMUSG00000028255  145003817   145032776   3   processed_transcript
ENSMUSG00000098481  38086202    38086317    13  miRNA
ENSMUSG00000097075  126971720   126976098   7   lincRNA
ENSMUSG00000097075  126971720   126976098   7   lincRNA

and I need to write a new file with all the same information, but sorted by the first column.

What I use so far is:

from operator import itemgetter

lines = open(my_file, 'r').readlines()
output = open("intermediate_alphabetical_order.txt", 'w')

for line in sorted(lines, key=itemgetter(0)):
    output.write(line)

output.close()

It doesn't raise any error, but just writes the output file exactly as the input file.

I know it is certainly a very basic mistake, but it would be amazing if some of you could tell me what I'm doing wrong!

Thanks a lot!

I am having trouble with the way I open the file, so the answers concerning already opened arrays don't really help.

Recommended answer

The problem you're having is that you're not turning each line into a list. When you read in the file, you're just getting the whole line as a string. You're then sorting by the first character of each line, and this is always the same character in your input, 'E'.
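To see the difference concretely, here is a small sketch using one line from the sample above (the variable names are just for illustration). Indexing a string with itemgetter(0) gives a single character, while splitting first gives the whole first column. Since sorted() is stable, lines whose keys all compare equal come back in their original order, which would explain why the output looked identical to the input.

```python
from operator import itemgetter

# One line as readlines() would return it: a plain string, not a list of fields.
line = "ENSMUSG00000098737  95734911    95734973    3   miRNA\n"

# itemgetter(0) on a string is the same as line[0]: just the first character.
first_char = itemgetter(0)(line)

# Splitting on whitespace first gives a list of fields; [0] is the first column.
first_column = line.split()[0]

print(repr(first_char))    # 'E'
print(repr(first_column))  # 'ENSMUSG00000098737'
```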

To just sort by the first column, you need to split the first block off and just read that section. So your key should be this:

for line in sorted(lines, key=lambda line: line.split()[0]):

split will turn your line into a list, and then the first column is taken from that list.
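Putting it together, a minimal sketch of the whole read, sort, write cycle might look like the following. The function names sort_lines_by_column and sort_file_by_column, and the column and numeric parameters, are made up for this example and are not from the question. It assumes whitespace-separated columns and that every non-blank line has at least column + 1 fields. The numeric flag is there because columns like the second one in the sample hold numbers, and comparing them as strings would put "101186764" before "95734911".

```python
def sort_lines_by_column(lines, column=0, numeric=False):
    """Return the lines sorted by the whitespace-separated field at `column`."""
    def key(line):
        field = line.split()[column]
        # Compare as integers when asked, otherwise as plain strings.
        return int(field) if numeric else field
    return sorted(lines, key=key)


def sort_file_by_column(in_path, out_path, column=0, numeric=False):
    """Read in_path, sort its non-blank lines by one column, write to out_path."""
    with open(in_path) as src:
        # Skip blank lines; make sure every kept line ends with a newline.
        lines = [l if l.endswith("\n") else l + "\n" for l in src if l.strip()]
    with open(out_path, "w") as dst:
        dst.writelines(sort_lines_by_column(lines, column, numeric))
```

For the file in the question, something like sort_file_by_column(my_file, "intermediate_alphabetical_order.txt") would sort by the first column, and passing column=1, numeric=True would sort by the second column as numbers.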
