How to transpose lines to columns, 7 rows at a time, in a file

Problem description

Please help, I have a text file that looks something like this:

ID: 000001
Name: John Smith
Email: jsmith@ibm.com
Company: IBM
blah1: a
blah2: b
blah3: c
ID: 000002
Name: Jane Doe
Email: jdoe@ibm.com
Company: IBM
blah1: a
blah2: b
blah3: c
ID:000003
.
.
.
etc.

Notice that each customer's info spans 7 rows. The line starting with ID: 000002 marks the start of the next customer, ID: 000003 the one after that, and so on.

I would like my output file to look like this: instead of each customer's data running down consecutive rows, each block of 7 rows, starting at an ID, should be transposed into the columns of a single row:

ID: 000001,Name: John Smith,Email: jsmith@ibm.com,Company: IBM,blah1: a,blah2: b,blah3: c
ID: 000002,Name: Jane Doe,Email: jdoe@ibm.com,Company: IBM,blah1: a,blah2: b,blah3: c

I am not sure if this is the easiest technique. I tried using a list, but that doesn't seem to work for my purpose. I know my code is not elegant, but this is just for automating some data manipulation for myself and one other person. I don't really need anything stylish, as long as it works.

#!/usr/bin/python
# open file
input = open ("C:\Documents\Customer.csv","r")

#write to a new file
output = open("C:\Documents\Customer1.csv","w")

#Read whole file into data
data = input.readlines()
list = []
for line in data:
    if "User Id:" in line:
        list.append(line)
    if "User Email:" in line:
        list.append(line)
    if "Company:" in line:
        list.append(line)
    if "Contact Id:" in line:
        list.append(line)
    if "Contact Name:" in line:
        list.append(line)
    if "Contact Email:" in line:
        list.append(line)
        print list
        import os
        output.write("\n".join(list))
# Close the file
input.close()
output.close()

My output file contains escape characters and some customers are added more than once.

Recommended answer

Why do your code and your input file differ? You have "ID:" vs. "User Id:", "Email:" vs. "User Email:", and so on. Anyway, you can do it like this:

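#!/usr/bin/python

# open file (raw strings keep the backslashes in the path literal)
input = open(r"C:\Documents\Customer.csv", "r")

#write to a new file
output = open(r"C:\Documents\Customer1.csv", "w")

# read the whole text, split it on 'ID:' and turn each record's newlines
# into commas; rstrip(',') drops the comma left by the record's last newline
lines = [part.replace('\n', ',').rstrip(',') for part in input.read().split('ID:')]
# put 'ID:' back in front of every record; [1:] drops the leading newline
output.write("\nID:".join(lines)[1:])

# Close files
input.close()
output.close()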
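
Since the question states that every record spans exactly 7 lines, another option is to group the lines in fixed chunks of 7 instead of splitting on 'ID:'. The following is only a minimal sketch under that assumption; the function name `transpose_records`, its `size` parameter, and the in-memory `sample` list are hypothetical and not part of the original answer:

```python
def transpose_records(lines, size=7):
    """Join every `size` consecutive lines into one comma-separated row.

    Assumes each record occupies exactly `size` lines (7 in the question).
    """
    # Strip line endings and skip blank lines so they don't shift the grouping
    cleaned = [line.strip() for line in lines if line.strip()]
    rows = []
    for start in range(0, len(cleaned), size):
        rows.append(",".join(cleaned[start:start + size]))
    return rows


# Hypothetical sample data mirroring the layout shown in the question
sample = [
    "ID: 000001\n", "Name: John Smith\n", "Email: jsmith@ibm.com\n",
    "Company: IBM\n", "blah1: a\n", "blah2: b\n", "blah3: c\n",
    "ID: 000002\n", "Name: Jane Doe\n", "Email: jdoe@ibm.com\n",
    "Company: IBM\n", "blah1: a\n", "blah2: b\n", "blah3: c\n",
]
for row in transpose_records(sample):
    print(row)
```

With a real file, passing the result of `readlines()` as `lines` should behave the same way. A record with a missing line would shift every later row, so this approach only fits strictly regular input.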

Or, if you want to filter strictly for specific fields in case something else shows up in the file, like this:

#!/usr/bin/python

#import regex module
import re

# open input file (raw string keeps the backslashes in the path literal)
input = open(r"C:\Documents\Customer.csv", "r")

#open output file
output = open(r"C:\Documents\Customer1.csv", "w")

#create search string
search = re.compile(r"""
                        ID:\s\d+|
                        Name:\s\w+\s\w+|
                        Email:\s\w+\@\w+\.\w+|
                        Company:\s\w+|
                        blah1:\s\w+|
                        blah2:\s\w+|
                        blah3:\s\w+
                        """, re.X)

#write to output joining parts with ',' and adding Newline before IDs
output.write(",".join(search.findall(input.read())).replace(',ID:','\nID:'))

# Close files
input.close()
output.close()

Note that the last example does not require exactly 7 fields per person :)
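
To illustrate that point, here is a small sketch that runs the same pattern on an in-memory string in which the second record has only three fields. The sample `text` is made up for illustration, and the variable names `text` and `result` are not from the original answer:

```python
import re

# Same verbose pattern as in the answer: one alternative per field
search = re.compile(r"""
    ID:\s\d+|
    Name:\s\w+\s\w+|
    Email:\s\w+\@\w+\.\w+|
    Company:\s\w+|
    blah1:\s\w+|
    blah2:\s\w+|
    blah3:\s\w+
    """, re.X)

# Hypothetical input: the second record lacks Email and the blah fields
text = (
    "ID: 000001\nName: John Smith\nEmail: jsmith@ibm.com\nCompany: IBM\n"
    "blah1: a\nblah2: b\nblah3: c\n"
    "ID: 000002\nName: Jane Doe\nCompany: IBM\n"
)

# Join all matched fields with ',' and start a new line before each ID
result = ",".join(search.findall(text)).replace(",ID:", "\nID:")
print(result)
```

One caveat of the pattern as written: the `Name:` alternative expects two words, so a one-word name would likely swallow the label on the following line. Loosening that part of the pattern may be needed for real data.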

And now with duplicates removed (the original order is not kept, and records are compared in full):

#!/usr/bin/python

#import regex module
import re

# open input file (raw string keeps the backslashes in the path literal)
input = open(r"C:\Documents\Customer.csv", "r")

#open output file
output = open(r"C:\Documents\Customer1.csv", "w")

#create search string
search = re.compile(r"""
                        ID:\s\d+|
                        Name:\s\w+\s\w+|
                        Email:\s\w+\@\w+\.\w+|
                        Company:\s\w+|
                        blah1:\s\w+|
                        blah2:\s\w+|
                        blah3:\s\w+
                        """, re.X)

# create data joining parts with ',' and adding Newline before IDs    
data = ",".join(search.findall(input.read())).replace(',ID:','\nID:')

# split data into list 
# removing duplicates out of strings with set() and joining result back
# together for the output

output.write("\n".join(set(data.split('\n'))))

# Close files
input.close()
output.close()
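
Because `set()` does not keep the original order, the records may come out shuffled. If the order matters, one possible variation is to track already-seen records in a set while appending to a list. This is a sketch, not part of the original answer; the helper name `unique_in_order` and the sample `data` string are hypothetical:

```python
def unique_in_order(records):
    """Return records with later duplicates dropped, keeping first-seen order."""
    seen = set()
    result = []
    for record in records:
        if record not in seen:
            seen.add(record)
            result.append(record)
    return result


# Hypothetical data in the same shape as the `data` string built above
data = (
    "ID: 000001,Name: John Smith\n"
    "ID: 000002,Name: Jane Doe\n"
    "ID: 000001,Name: John Smith"
)
deduped = "\n".join(unique_in_order(data.split("\n")))
print(deduped)
```

As in the `set()` version, complete records are compared, so two rows with the same ID but different field values are both kept.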
