Python - Sorting in ascending order in a txt file

Problem description

I had a huge document that I parsed using regex to give a txt file (json.dump) similar to the following:

{
    "stuff": [
        {
            "name": [
                "frfer", 
                "niddsi", 
            ], 
            "number": 11300, 
            "identifier": "Tsdsad"
        }, 
        {
            "name": [
                "Fast", 
                "Guard", 
                "Named", 
            ], 
            "number": 117900, 
            "identifier": "Pdfms"
        }, 
        {
            name: [
                "Fast", 
            ], 
            "number": 660, 
            "identifier": "Unnamed"
        },    
    ]
}    

Now I would like to sort this document in ascending order based on the number (i.e. "Pdfms" first, "Tsdsad" second, "Unnamed" third). I am unsure how to start on this in Python. Could anyone point me in the right direction? Thanks in advance.

Recommended answer

First problem: That's not legitimate JSON. You have extra commas (JSON doesn't like [a,b,c,]; it insists on [a,b,c]) in the source, and you have some identifiers (the third instance of name, e.g.) that are not quoted. Ideally, you will improve your initial text file parsing and JSONification to fix those issues. Or you can handle those fixups on the fly, like this:

json_source = """
    ... your text data from above ...
"""

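import re

# Remove a trailing comma that sits directly before a closing bracket.
BADCOMMA = re.compile(r',\s+\]')
json_source = BADCOMMA.sub(']', json_source)

# Quote the bare identifier `name` so it becomes a valid JSON key.
BADIDENTIFIER = re.compile(r'\s+name:\s*')
json_source = BADIDENTIFIER.sub('"name":', json_source)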
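
As a quick check of the two fixups, here is a minimal sketch that applies the same patterns to a small malformed sample and confirms the result parses. The `malformed` string is a hypothetical, shortened stand-in for the real data, not the actual file.

```python
import json
import re

# Hypothetical, shortened sample with the same two defects as the question:
# trailing commas before "]" and an unquoted `name` key.
malformed = """
{
    "stuff": [
        {
            name: [
                "Fast",
            ],
            "number": 660
        },
    ]
}
"""

BADCOMMA = re.compile(r',\s+\]')
BADIDENTIFIER = re.compile(r'\s+name:\s*')

fixed = BADCOMMA.sub(']', malformed)
fixed = BADIDENTIFIER.sub('"name":', fixed)

# json.loads raises ValueError on invalid input, so reaching the print
# means both fixups were enough for this sample.
parsed = json.loads(fixed)
print(parsed["stuff"][0]["name"])
```

Note that these patterns only cover the two defects seen here; a trailing comma before a closing `}` would need its own rule.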

Beware: assuming you can fix every possible problem on the fly is a fragile pattern, and so is editing structured data files with regular expressions. Better to generate valid JSON from the get-go.
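
One way to get valid JSON from the start, sketched under the assumption that the parsing step can build ordinary Python dicts and lists: hand those to `json.dumps` and let the serializer handle quoting and commas. The `records` list below is hypothetical sample data standing in for whatever the regex parsing produces.

```python
import json

# Hypothetical records standing in for the output of the regex parsing step.
records = [
    {"name": ["frfer", "niddsi"], "number": 11300, "identifier": "Tsdsad"},
    {"name": ["Fast"], "number": 660, "identifier": "Unnamed"},
]

# json.dumps quotes every key and never emits trailing commas,
# so the text can be read back without any regex fixups.
json_text = json.dumps({"stuff": records}, indent=4)

# Round-trip check: the text parses back to the same structure.
round_tripped = json.loads(json_text)
print(round_tripped == {"stuff": records})
```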

Now, how to sort:

import json
data = json.loads(json_source)

data['stuff'].sort(key=lambda item: item['number'], reverse=True)

That does an in-place sort of the "stuff" array, by the "number" value, and reverses it (because your example of how you want the output suggests a descending rather than the typical ascending sort).
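
If a true ascending order is wanted instead, dropping `reverse=True` is enough. The sketch below compares both directions using `sorted()`, which returns a new list rather than sorting in place. The `data` dict is a hypothetical, trimmed copy of the structure from the question.

```python
# Hypothetical sample mirroring the structure in the question.
data = {
    "stuff": [
        {"identifier": "Tsdsad", "number": 11300},
        {"identifier": "Pdfms", "number": 117900},
        {"identifier": "Unnamed", "number": 660},
    ]
}


def by_number(item):
    """Sort key: the numeric "number" field of each entry."""
    return item["number"]


# sorted() builds new lists and leaves data["stuff"] untouched.
ascending = sorted(data["stuff"], key=by_number)
descending = sorted(data["stuff"], key=by_number, reverse=True)

print([item["identifier"] for item in ascending])
# ['Unnamed', 'Tsdsad', 'Pdfms']
print([item["identifier"] for item in descending])
# ['Pdfms', 'Tsdsad', 'Unnamed']
```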

To demonstrate that the sort has done what you want, the pprint module can be handy:

from pprint import pprint
pprint(data)

This yields (Python 2 output, hence the u'' prefixes):

{u'stuff': [{u'identifier': u'Pdfms',
             u'name': [u'Fast', u'Guard', u'Named'],
             u'number': 117900},
            {u'identifier': u'Tsdsad',
             u'name': [u'frfer', u'niddsi'],
             u'number': 11300},
            {u'identifier': u'Unnamed', u'name': [u'Fast'], u'number': 660}]}
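
Since the goal is a sorted txt file, the sorted structure can be written back with `json.dump`. This is a minimal sketch: the sample `data` and the output file name `sorted.txt` are assumptions for illustration, and a temporary directory is used so the example does not touch any real file.

```python
import json
import os
import tempfile

# Hypothetical sample; in practice this is the parsed and sorted data.
data = {
    "stuff": [
        {"identifier": "Unnamed", "number": 660},
        {"identifier": "Pdfms", "number": 117900},
    ]
}
data["stuff"].sort(key=lambda item: item["number"], reverse=True)

# Write to a throwaway location; substitute your own path as needed.
out_path = os.path.join(tempfile.mkdtemp(), "sorted.txt")
with open(out_path, "w") as fh:
    json.dump(data, fh, indent=4)

# Read the file back to confirm the order survived the round trip.
with open(out_path) as fh:
    reloaded = json.load(fh)
print([item["identifier"] for item in reloaded["stuff"]])
```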
