Python - Parsing a text file into a csv file


Question

I have a text file that is output from a command that I ran with Netmiko to retrieve data from a Cisco WLC of things that are causing interference on our WiFi network. I stripped out just what I needed from the original 600k lines of code down to a couple thousand lines like this:

AP Name.......................................... 010-HIGH-FL4-AP04
Microwave Oven      11       10      -59         Mon Dec 18 08:21:23 2017   
WiMax Mobile               11       0       -84         Fri Dec 15 17:09:45 2017   
WiMax Fixed                11       0       -68         Tue Dec 12 09:29:30 2017   
AP Name.......................................... 010-2nd-AP04
Microwave Oven             11       10      -61         Sat Dec 16 11:20:36 2017   
WiMax Fixed                11       0       -78         Mon Dec 11 12:33:10 2017   
AP Name.......................................... 139-FL1-AP03
Microwave Oven             6        18      -51         Fri Dec 15 12:26:56 2017   
AP Name.......................................... 010-HIGH-FL3-AP04
Microwave Oven             11       10      -55         Mon Dec 18 07:51:23 2017   
WiMax Mobile               11       0       -83         Wed Dec 13 16:16:26 2017   

The goal is to end up with a csv file that strips out the 'AP Name ...' prefix and puts what's left on the same line as the rest of the information that follows it. The problem is some APs have two lines below the AP name and some have one or none. I have been at it for 8 hours and cannot find the best way to make this happen.

This is the latest version of the code I was trying to use; any suggestions for making this work? I just want something I can load up in Excel and create a report with:

with open(outfile_name, 'w') as out_file:
    with open('wlc-interference_raw.txt', 'r') as in_file:
        #Variables
        _ap_name = ''
        _temp = ''
        _flag = False
        for i in in_file:
            if 'AP Name' in i:
                #write whatever was put in the temp file to disk because new ap now
                #add another temp variable in case an ap has more than 1 interferer and check if new AP name
                out_file.write(_temp)
                out_file.write('\n')
                #print(_temp)
                _ap_name = i.lstrip('AP Name.......................................... ')
                _ap_name = _ap_name.rstrip('\n')
                _temp = _ap_name
                #print(_temp)
            elif '----' in i:
                pass
            elif 'Class Type' in i:
                pass
            else:
                line_split = i.split()
                for x in line_split:
                    _temp += ','
                    _temp += x
                _temp += '\n'
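A side observation on this attempt, not part of the original question: the final AP's rows are never written, because a write only happens when the next 'AP Name' line appears, and there is no such line after the last block. A minimal patch, assuming everything else stays the same, is to flush _temp once more after the loop, still inside both with blocks:

        # Flush the last AP's accumulated rows; nothing else triggers
        # this write after the final 'AP Name' block.
        out_file.write(_temp)
        out_file.write('\n')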


Answer

I think your best option is to read all lines of the file, then split into sections starting with AP Name. Then you can work on parsing each section.

s = """AP Name.......................................... 010-HIGH-FL4-AP04
Microwave Oven      11       10      -59         Mon Dec 18 08:21:23 2017   
WiMax Mobile               11       0       -84         Fri Dec 15 17:09:45 2017   
WiMax Fixed                11       0       -68         Tue Dec 12 09:29:30 2017   
AP Name.......................................... 010-2nd-AP04
Microwave Oven             11       10      -61         Sat Dec 16 11:20:36 2017   
WiMax Fixed                11       0       -78         Mon Dec 11 12:33:10 2017   
AP Name.......................................... 139-FL1-AP03
Microwave Oven             6        18      -51         Fri Dec 15 12:26:56 2017   
AP Name.......................................... 010-HIGH-FL3-AP04
Microwave Oven             11       10      -55         Mon Dec 18 07:51:23 2017   
WiMax Mobile               11       0       -83         Wed Dec 13 16:16:26 2017"""

import re

class AP:
    """ 
    A class holding each section of the parsed file
    """
    def __init__(self):
        self.header = ""
        self.content = []

sections = []
section = None
for line in s.split('\n'):  # Or 'for line in file:'
    # Starting new section
    if line.startswith('AP Name'):
        # If previously had a section, add to list
        if section is not None:
            sections.append(section)  
        section = AP()
        section.header = line
    else:
        if section is not None:
            section.content.append(line)
sections.append(section)  # Add last section outside of loop


for section in sections:
    ap_name = section.header.lstrip("AP Name.")  # lstrip takes all the characters given, not a literal string
    for line in section.content:
        print(ap_name + ",", end="") 
        # You can extract the date separately, if needed
        # Splitting on more than one space using a regex
        line = ",".join(re.split(r'\s\s+', line))
        print(line.rstrip(','))  # Remove trailing comma from imperfect split
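As the inline comment notes, lstrip("AP Name.") strips any of those characters from the left rather than the literal string; it works here only because none of the AP names start with one of them. A more defensive variant, my own sketch rather than part of the original answer, removes the literal prefix and the run of dots with a regex:

import re

# Drop the literal 'AP Name', the padding dots, and any following spaces
ap_name = re.sub(r'^AP Name\.*\s*', '', section.header)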



Output

010-HIGH-FL4-AP04,Microwave Oven,11,10,-59,Mon Dec 18 08:21:23 2017
010-HIGH-FL4-AP04,WiMax Mobile,11,0,-84,Fri Dec 15 17:09:45 2017
010-HIGH-FL4-AP04,WiMax Fixed,11,0,-68,Tue Dec 12 09:29:30 2017
010-2nd-AP04,Microwave Oven,11,10,-61,Sat Dec 16 11:20:36 2017
010-2nd-AP04,WiMax Fixed,11,0,-78,Mon Dec 11 12:33:10 2017
139-FL1-AP03,Microwave Oven,6,18,-51,Fri Dec 15 12:26:56 2017
010-HIGH-FL3-AP04,Microwave Oven,11,10,-55,Mon Dec 18 07:51:23 2017
010-HIGH-FL3-AP04,WiMax Mobile,11,0,-83,Wed Dec 13 16:16:26 2017

Tip:

You don't need Python to write the CSV; you can redirect the output to a file from the command line:

python script.py > output.csv
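If you would rather have Python write the CSV itself instead of redirecting stdout, here is a minimal standalone sketch using the standard csv module; the filenames are placeholders and the parsing simply mirrors the answer above:

import csv
import re

with open('wlc-interference_raw.txt') as in_file, \
     open('wlc-interference.csv', 'w', newline='') as out_file:
    writer = csv.writer(out_file)
    ap_name = None
    for line in in_file:
        line = line.rstrip()
        if not line:
            continue                      # skip blank lines
        if line.startswith('AP Name'):
            # Keep only the AP name after the dots
            ap_name = re.sub(r'^AP Name\.*\s*', '', line)
        elif ap_name is not None:
            # Columns are separated by runs of two or more spaces
            fields = [f for f in re.split(r'\s\s+', line) if f]
            writer.writerow([ap_name] + fields)

csv.writer also takes care of quoting, in case a class type or AP name ever contains a comma.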
