Finding a value within a dictionary of ranges - python
I'm comparing two files. The first has an identifier column, a start value, and an end value. The second contains corresponding identifiers and a single value column.
Ex.
File 1:
A 200 900
A 1000 1200
B 100 700
B 900 1000
File 2:
A 103
A 200
A 250
B 50
B 100
B 150
I would like to find all values from the second file that are contained within the ranges found in the first file so that my output would look like:
A 200
A 250
B 100
B 150
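For now I have created a dictionary from the first file that maps each identifier to a list of every value in its ranges. Ex.
    # list(...) keeps this working on Python 3, where range() is not a list
    if Identifier in Dictionary:
        Dictionary[Identifier].extend(range(Start, (End+1)))
    else:
        Dictionary[Identifier] = list(range(Start, (End+1)))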
I then go through the second file and search for the value within the dictionary of ranges. Ex.
    if Identifier in Dictionary:
        if Value in Dictionary[Identifier]:
            OutFile.write(Line + "\n")
While not optimal, this works for relatively small files. However, I have several large files, and this program is proving terribly inefficient. I need to optimize my program so that it runs much faster.
    from collections import defaultdict

    # Map each identifier to a list of (start, end) tuples instead of
    # expanding every range into its individual values.
    ident_ranges = defaultdict(list)
    with open('file1.txt', 'r') as f1:
        for row in f1:
            ident, start, end = row.split()
            start, end = int(start), int(end)
            ident_ranges[ident].append((start, end))

    with open('file2.txt', 'r') as f2, open('out.txt', 'w') as output:
        for line in f2:
            ident, value = line.split()
            value = int(value)
            # Keep the line if the value falls inside any range for this identifier.
            if any(start <= value <= end for start, end in ident_ranges[ident]):
                output.write(line)
Notes: Using a defaultdict allows you to add ranges to your dictionary without first checking for the existence of a key. Using any allows for short-circuiting of the range check. Using chained comparison is a nice Python syntactic shortcut (start <= value <= end).
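To try the approach without creating files, here is a minimal sketch that applies the same defaultdict, any, and chained-comparison idea to the sample data from the question, held in memory. The function names load_ranges and filter_values are hypothetical, introduced only for this illustration. The use of .get instead of direct indexing is an added assumption, meant to avoid creating empty entries for identifiers that never appear in the first file.

```python
from collections import defaultdict


def load_ranges(lines):
    """Build {identifier: [(start, end), ...]} from 'ident start end' lines."""
    ident_ranges = defaultdict(list)
    for row in lines:
        if not row.strip():
            continue  # skip blank lines
        ident, start, end = row.split()
        ident_ranges[ident].append((int(start), int(end)))
    return ident_ranges


def filter_values(ident_ranges, lines):
    """Yield each 'ident value' line whose value lies in some range for ident."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        ident, value = line.split()
        value = int(value)
        # .get returns an empty tuple for unknown identifiers, so any() is False
        # and no empty list is added to the dictionary.
        if any(start <= value <= end for start, end in ident_ranges.get(ident, ())):
            yield line


# Sample data from the question (hypothetical in-memory stand-ins for the files).
file1 = ["A 200 900", "A 1000 1200", "B 100 700", "B 900 1000"]
file2 = ["A 103", "A 200", "A 250", "B 50", "B 100", "B 150"]

matches = list(filter_values(load_ranges(file1), file2))
print("\n".join(matches))
```

On the sample data this prints the four lines the question asks for: A 200, A 250, B 100, and B 150. Each lookup still scans the ranges for one identifier linearly, so if a single identifier has a very large number of ranges, sorting them and using binary search (for example with the bisect module) may be worth considering.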