Separate the .txt file contents into multiple cells in a .csv file

Views: 319 Published: 2020/7/12 3:10:19 Tags: python csv parsing text


Question

I'm using Python 2.7, and I have a text file like this one, which I open with Python:

TIME    FLIGHT  FROM    AIRLINE AIRCRAFT        STATUS
8:40 AM LH1334  
Frankfurt (FRA)
Lufthansa   A320 (D-AIPP)   
Landed 8:40 AM
8:45 AM OK786   
Prague (PRG)
Czech Airlines  AT45 (OK-KFP)   
Landed 8:32 AM

I want to export it to CSV with six columns (Time, Flight, From, Airline, Aircraft, Status). The result should look like this:

TIME            FLIGHT  FROM            AIRLINE         AIRCRAFT      STATUS
Jul 21 8:40 AM  LH1334  Frankfurt (FRA) Lufthansa   A320 (D-AIPP) Landed 8:40 AM
...

This is a bit hard for me because a single field can contain several words, so I don't have a useful idea of how to get to this form.

My code:

import unicodecsv as csv
import os
import sys
import io
import time
import datetime
import pandas as pd

def to_2d(l,n):
    return [l[i:i+n] for i in range(0, len(l), n)]

f = open('proba.txt', 'r')
x = f.read()

filename=r'output.csv'

resultcsv=open(filename,"wb")
output=csv.writer(resultcsv, delimiter=';',quotechar = '"', quoting=csv.QUOTE_NONNUMERIC, encoding='latin-1')

maindatatable = to_2d(x, 6)
print maindatatable
output.writerows(x)

resultcsv.close()

Recommended answer

It looks like the records are grouped as four lines each.

We can process the first line

8:40 AM LH1334

originally with a regular expression (superseded by the edit that follows):

import re
matches = re.match(r'(\d{1,2}:\d{2} [AP]M)\s+(\w+\d+)', line)
time = matches.group(1)
flight = matches.group(2)

Edit: the regular expression is overkill. A tab separates the two fields, so it is actually very easy:

time, flight = line.split('\t')
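One caveat: a line read from a file still carries its newline, and the sample looks as if some lines may also end with a trailing tab, in which case a bare `split('\t')` returns a third, empty item and the two-name unpacking fails. The following is a minimal sketch of a more forgiving split, assuming the fields really are tab-separated; `split_fields` is a hypothetical helper name, not part of the original answer.

```python
def split_fields(line):
    """Split one tab-separated line into its non-empty, trimmed fields."""
    # strip() drops the trailing newline and any trailing tab;
    # the filter discards empty fields left behind by doubled tabs.
    parts = line.strip().split('\t')
    return [part.strip() for part in parts if part.strip()]


# Example line with a trailing tab and newline (assumed layout).
time, flight = split_fields('8:40 AM\tLH1334\t\n')
print(time, flight)
```

The same helper works for the third line of each record, which holds the airline and the aircraft.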

The second line:

Frankfurt (FRA)

is trivial:

from_ = line

The third line:

Lufthansa   A320 (D-AIPP)

can be handled with:

airline, aircraft = line.split('\t')

The fourth line:

Landed 8:40 AM

is also trivial:

status = line
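The four per-line steps can be combined into one small function that turns a single record into a six-item row. This is only a sketch: `parse_record` is a hypothetical name, and it assumes every record has exactly four lines, with a tab inside the first and third.

```python
def parse_record(lines):
    """Turn the four lines of one flight record into a six-item row."""
    # Remove newlines and stray trailing tabs before splitting.
    first, from_, third, status = [line.strip() for line in lines]
    time, flight = first.split('\t')
    airline, aircraft = third.split('\t')
    return [time, flight, from_, airline, aircraft, status]


# One record from the sample, with tabs assumed between the fields.
record = [
    '8:40 AM\tLH1334\t\n',
    'Frankfurt (FRA)\n',
    'Lufthansa\tA320 (D-AIPP)\t\n',
    'Landed 8:40 AM\n',
]
row = parse_record(record)
print(row)
```

Returning a plain list keeps the result ready to pass straight to a CSV writer's `writerow`.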

Altogether, you can process them in batches of four lines each:

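from itertools import islice

with open('my.txt') as f:
    header = f.readline()  # skip the header line

    while True:
        # read the next four lines (one record); strip() removes the
        # newline and any trailing tab so the splits below are clean
        lines = [line.strip() for line in islice(f, 4)]
        if len(lines) < 4:
            break  # end of file, or an incomplete trailing record

        time, flight = lines[0].split('\t')
        from_ = lines[1]
        airline, aircraft = lines[2].split('\t')
        status = lines[3]

        # Output a row into your csv file here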
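To fill in the output step, the standard `csv` module can write each parsed record as one row. The following is a minimal sketch rather than the original answer's code: it is written for Python 3, `convert` and `COLUMNS` are hypothetical names, the `;` delimiter is borrowed from the question's code, and the in-memory sample assumes the fields are tab-separated as the answer states.

```python
import csv
import io
from itertools import islice

COLUMNS = ['TIME', 'FLIGHT', 'FROM', 'AIRLINE', 'AIRCRAFT', 'STATUS']


def convert(src, dst):
    """Read four-line records from src and write six-column rows to dst."""
    writer = csv.writer(dst, delimiter=';')
    next(src, None)  # skip the header line of the input
    writer.writerow(COLUMNS)
    count = 0
    while True:
        lines = [line.strip() for line in islice(src, 4)]
        if len(lines) < 4:
            break  # end of input, or an incomplete trailing record
        time, flight = lines[0].split('\t')
        airline, aircraft = lines[2].split('\t')
        writer.writerow([time, flight, lines[1], airline, aircraft, lines[3]])
        count += 1
    return count


# Demonstration on an in-memory copy of the sample data.
sample = io.StringIO(
    'TIME\tFLIGHT\tFROM\tAIRLINE\tAIRCRAFT\tSTATUS\n'
    '8:40 AM\tLH1334\t\n'
    'Frankfurt (FRA)\n'
    'Lufthansa\tA320 (D-AIPP)\t\n'
    'Landed 8:40 AM\n'
    '8:45 AM\tOK786\t\n'
    'Prague (PRG)\n'
    'Czech Airlines\tAT45 (OK-KFP)\t\n'
    'Landed 8:32 AM\n'
)
result = io.StringIO(newline='')
rows_written = convert(sample, result)
print(result.getvalue())
```

With real files, the same function could be called as `convert(src, dst)` inside `with open('proba.txt') as src, open('output.csv', 'w', newline='') as dst:`. On Python 2.7 the output file would instead be opened in `'wb'` mode.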
