Use Python to manipulate txt file presentation of key-value grouping
Question
I am trying to use Python to convert a text file from Format A:
Key1
Key1value1
Key1value2
Key1value3
Key2
Key2value1
Key2value2
Key2value3
Key3...
to Format B:
Key1 Key1value1
Key1 Key1value2
Key1 Key1value3
Key2 Key2value1
Key2 Key2value2
Key2 Key2value3
Key3 Key3value1...
Specifically, here is a brief look at the file itself (only one key shown, thousands more in the full file):
chr22:16287243: PASS
patientID1 G/G
patientID2 G/G
patientID3 G/G
And here is the desired output:
chr22:16287243: PASS patientID1 G/G
chr22:16287243: PASS patientID2 G/G
chr22:16287243: PASS patientID3 G/G
I've written the following code, which can detect and display the keys, but I am having trouble writing the code to store the values associated with each key and then print these key-value pairs. Can anyone please assist me with this task?
import sys
import re

records = []
with open('filepath', 'r') as infile:
    for line in infile:
        variant = re.search(r"\Achr\d", line, re.I)  # all variants start with "chr"
        if variant:
            records.append(line.replace("\n", ""))
        # parse lines until a new variant is encountered
for r in records:
    print(r)
Recommended answer
Do it in one pass, without storing the lines:
with open("input") as infile, open("output", "w") as outfile:
    for line in infile:
        if line.startswith("chr"):
            key = line.strip()  # remember the current key
        else:
            # prefix each value line with the most recent key
            print(key, line.rstrip("\n"), file=outfile)
This code assumes the first line contains a key. If a value line comes first, key has not been assigned yet and the code fails with a NameError.
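If the input might not start with a key, one option is to wrap the same one-pass idea in a small generator that reports the problem explicitly. The following is only a sketch: the function name pair_keys_with_values and the key_prefix parameter are hypothetical names chosen for illustration, and it assumes blank lines should be skipped and that keys are recognized by a case-sensitive "chr" prefix (the regex in the question is case-insensitive, so adjust if needed).

```python
import io


def pair_keys_with_values(lines, key_prefix="chr"):
    """Yield 'key value' strings from lines grouped as a key followed by its values.

    pair_keys_with_values and key_prefix are illustrative names, not from the answer.
    """
    key = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # assumption: blank lines carry no data
        if line.startswith(key_prefix):
            key = line  # a new key starts a new group
        elif key is None:
            # a value appeared before any key: fail with a clear message
            raise ValueError(f"value line before any key: {line!r}")
        else:
            yield f"{key} {line}"


# Small in-memory sample standing in for the real file
sample = io.StringIO(
    "chr22:16287243: PASS\n"
    "patientID1 G/G\n"
    "patientID2 G/G\n"
)
result = list(pair_keys_with_values(sample))
for row in result:
    print(row)
```

Because the function accepts any iterable of lines, an open file object can be passed in place of the sample, and the rows can be written to an output file as they are produced, so the whole input is never held in memory.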