Data structure for custom indexing
Question
I am looking to write a data structure to represent some genetic data. This data can be represented as a list of size n, where each entry also has a "genetic position", which is a real number between 0 and 1. To make the nomenclature clear, I'll call the position in the list id and the genetic position gpos. I implemented this as a class:
    class Coords(object):
        def __init__(self, *args, **kwargs):
            self.f = list(*args, **kwargs)  # forward map: id -> gpos
            self.r = dict()                 # reverse map: gpos -> id
            for i, e in enumerate(self.f):
                self.r[e] = i

        def __setitem__(self, x, y):
            # Note: the old gpos key stays in self.r after an overwrite.
            self.f.__setitem__(x, y)
            self.r.__setitem__(y, x)

        def __getitem__(self, x):
            return self.f.__getitem__(x)

        def __len__(self):
            return self.f.__len__()
Now, I have two issues with this. The first is that the keys of self.r are floats, which is obviously a bad idea. I was thinking of converting them to strings (with a fixed number of digits), but is there a better idea? The other issue is that I want to access entries via gpos, so if, for example, I would like to access everything with a gpos between 0.2 and 0.4, I would like to be able to do that using
    import numpy as np
    c = Coords(np.arange(1, 0, -.1))
    c.r[0.2:0.4]  # desired syntax
Is there an easy way to define that? I was thinking of finding the correct id of the starting and ending positions using binary search and then accessing self.f using these ids, but is there a way to implement the above syntax?
Recommended answer
When you index an object with a slice, Python creates a slice object with the inputs you provide. For example, if you do c[0.2:0.4], then the argument passed to c.__getitem__ will be slice(0.2, 0.4). So you could have something like this code in your __getitem__ method:
    def __getitem__(self, x):
        if isinstance(x, slice):
            start = x.start
            stop = x.stop
            step = x.step
            # Do whatever you want to do to define your return
            ...
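To see what actually reaches __getitem__, here is a minimal sketch. ShowIndex is a hypothetical helper class, not part of the original code, that simply returns whatever index it receives:

```python
class ShowIndex:
    """Hypothetical helper: returns the index object it is given, unchanged."""

    def __getitem__(self, x):
        return x


s = ShowIndex()
sl = s[0.2:0.4]          # Python builds a slice object from the float bounds
print(type(sl).__name__)  # the class name of the object that was passed in
print(sl.start, sl.stop, sl.step)
print(s[3])               # a plain index is passed through as-is
```

The slice bounds do not have to be integers; it is up to __getitem__ to decide what they mean.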
If you want to use this fancy indexing not on the Coords object but on the self.r dictionary, I think the easiest approach would be to create a FancyIndexDict that is a subclass of dict, modify its __getitem__ method, and then have self.r be a FancyIndexDict rather than a plain dict.
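One possible shape for such a FancyIndexDict is sketched below. It uses the binary search idea from the question via the standard bisect module. Several choices here are assumptions rather than anything the answer specifies: both slice endpoints are treated as inclusive, a step is rejected, the keys are re-sorted on every range query, and the result is a list of (gpos, id) pairs.

```python
import bisect


class FancyIndexDict(dict):
    """Sketch: a dict keyed by gpos where a slice selects a range of keys."""

    def __getitem__(self, key):
        if isinstance(key, slice):
            if key.step is not None:
                raise ValueError("step is not supported in this sketch")
            # Sorting on each query is O(n log n); cache the sorted keys
            # if range lookups are frequent and the dict rarely changes.
            keys = sorted(self)
            lo = 0 if key.start is None else bisect.bisect_left(keys, key.start)
            hi = len(keys) if key.stop is None else bisect.bisect_right(keys, key.stop)
            # Assumption: both endpoints are inclusive.
            return [(k, dict.__getitem__(self, k)) for k in keys[lo:hi]]
        # Any non-slice key behaves like a normal dict lookup.
        return dict.__getitem__(self, key)


r = FancyIndexDict({0.1: 0, 0.25: 1, 0.3: 2, 0.5: 3})
print(r[0.2:0.4])  # [(0.25, 1), (0.3, 2)]
print(r[0.5])      # 3
```

In Coords, replacing `self.r = dict()` with `self.r = FancyIndexDict()` would then allow `c.r[0.2:0.4]`. Exact lookups such as `r[0.5]` still depend on float equality, so the first concern raised in the question remains for single-key access.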