Reading data from specially formatted text file
Problem description
I am using this method, kindly suggested by Ashwini Chaudhary, to assign data to a dictionary from a text file that is in a specific format.
    keys = map(str.strip, next(f).split('Key\t')[1].split('\t'))
    words = map(str.strip, next(f).split('Word\t')[1].split('\t'))
The text file has the row title followed by values, separated by a \t character.
Example 1:
    Key     a 1  b 2  c 3  d 4
    Word    as   box  cow  dig
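For reference, here is a minimal sketch of how the two lines above behave on Example 1. It assumes Python 3 (hence the `list(...)` wrappers around `map`) and uses an in-memory `io.StringIO` object as a stand-in for the real file; the names `sample` and `pairs` are only illustrative.

```python
import io

# In-memory stand-in for the real file; fields are separated by tab characters.
sample = io.StringIO("Key\ta 1\tb 2\tc 3\td 4\nWord\tas\tbox\tcow\tdig\n")

# Each next() call consumes exactly one line, so the 'Key' row must come
# first and the 'Word' row must follow it immediately.
keys = list(map(str.strip, next(sample).split('Key\t')[1].split('\t')))
words = list(map(str.strip, next(sample).split('Word\t')[1].split('\t')))

pairs = dict(zip(keys, words))
print(pairs)
```

Because the rows are consumed strictly in order, any extra line before or between them breaks this approach, which is the limitation the rest of the question is about.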
How would I change my code so that it does not read all the lines in a file, but only specific ones? Extra lines that I do not want to read should simply be ignored:
Example 2 - ignore LineHere and OrHere rows:
    LineHere    w    x    y    z
    Key         a 1  b 2  c 3  d 4
    OrHere      00   01   10   11
    Word        as   box  cow  dig
Or what if I wanted to be able to read a line titled 'Word' XOR 'Letter', whichever one happens to be in the file? The code that scans Examples 1 or 2 would then also be valid for:
Example 3 - I want to read Key and Letter lines:
    LineHere    w    x    y    z
    Key         a 1  b 2  c 3  d 4
    OrHere      00   01   10   11
    Letter      A    B    C    D
Please feel free to comment with question criticisms and I'll be happy to rephrase/clarify the question.
As a reference, the precursor question is linked here
Many thanks,
Alex
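Solution
Something like this:
    import re

    with open('abc') as f:
        for line in f:
            if line.startswith('Key'):
                # everything after the 'Key' title, split on tabs
                keys = re.search(r'Key\s+(.*)', line).group(1).split("\t")
            elif line.startswith(('Word', 'Letter')):
                # group(2) holds the values after either title
                vals = re.search(r'(Word|Letter)\s+(.*)', line).group(2).split("\t")
        # all other lines are ignored; print() with parentheses runs on Python 2 and 3
        print(dict(zip(keys, vals)))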
abc:
    LineHere    w    x    y    z
    Key         a 1  b 2  c 3  d 4
    OrHere      00   01   10   11
    Word        as   box  cow  dig
output is:
    {'d 4': 'dig', 'b 2': 'box', 'a 1': 'as', 'c 3': 'cow'}
abc:
    LineHere    w    x    y    z
    Key         a 1  b 2  c 3  d 4
    OrHere      00   01   10   11
    Letter      A    B    C    D
output is:
    {'d 4': 'D', 'b 2': 'B', 'a 1': 'A', 'c 3': 'C'}
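If you would rather avoid the regular expressions, the same idea can be sketched as a small function that splits each line on tabs and compares the first field with the wanted titles. This is only a sketch under assumptions: it targets Python 3, the function name `read_rows` and its parameters are hypothetical, and the sample text is an in-memory copy of Example 3 rather than a real file.

```python
import io


def read_rows(lines, key_title="Key", value_titles=("Word", "Letter")):
    """Pair the key row with the value row; rows with other titles are ignored."""
    keys, vals = None, None
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        title = fields[0].strip()
        rest = [field.strip() for field in fields[1:]]
        if title == key_title:
            keys = rest
        elif title in value_titles:
            # If more than one value row is present, the last one wins.
            vals = rest
    if keys is None or vals is None:
        raise ValueError("missing key row or value row")
    return dict(zip(keys, vals))


# In-memory copy of Example 3 (tab-separated), standing in for the real file.
example_3 = (
    "LineHere\tw\tx\ty\tz\n"
    "Key\ta 1\tb 2\tc 3\td 4\n"
    "OrHere\t00\t01\t10\t11\n"
    "Letter\tA\tB\tC\tD\n"
)
result = read_rows(io.StringIO(example_3))
print(result)
```

Comparing the whole first field is slightly stricter than `startswith`, which would also accept a row titled, say, 'Keyboard'. On Python 3.7 and later a dict keeps insertion order, so the printed key order will differ from the output shown above even though the contents are the same.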