Data structure for custom indexing


Problem description

I am looking to write a data structure to represent some genetic data. This data can be represented as a list of size n, where each entry also has a "genetic position", which is a real number between 0 and 1. To make the nomenclature clear, I'll call the position in the list id and the genetic position gpos. I implemented this as the following class:

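class Coords(object):
    """List of gpos values (self.f) plus a reverse map gpos -> id (self.r)."""

    def __init__(self, *args, **kwargs):
        self.f = list(*args, **kwargs)  # forward: id -> gpos
        self.r = dict()                 # reverse: gpos -> id
        for i, e in enumerate(self.f):
            self.r[e] = i

    def __setitem__(self, x, y):
        # Drop the stale reverse entry before overwriting position x.
        self.r.pop(self.f[x], None)
        self.f.__setitem__(x, y)
        self.r.__setitem__(y, x)

    def __getitem__(self, x):
        return self.f.__getitem__(x)

    def __len__(self):
        return self.f.__len__()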

Now, I have two issues with this. The first is that the keys of self.r are floats, which is obviously a bad idea. I was thinking of converting them to strings (with a fixed number of digits), but is there a better idea? The other issue is that I want to access entries via gpos, so if I would like, for example, to access everything between gpos 0.2 and 0.4, I would like to be able to do that using

import numpy as np
c = Coords(np.arange(1, 0, -.1))
c.r[0.2:0.4]  # desired syntax; a plain dict does not support this

Is there an easy way to define that? I was thinking of finding the ids of the starting and ending positions using binary search and then accessing self.f using these ids, but is there a way to implement the above syntax?

Recommended answer

When you index an object with a slice, Python creates a slice object with the inputs you provide. For example, if you do c[0.2:0.4], then the argument passed to c.__getitem__ will be slice(0.2, 0.4). So you could have something like this code in your __getitem__ method:

def __getitem__(self, x):
    if isinstance(x, slice):
        start = x.start
        stop = x.stop
        step = x.step
        # Do whatever you want to do to define your return
    ...
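
To make the placeholder comment above concrete, here is a minimal sketch of one way the slice branch could be filled in, using the binary search idea from the question. The class name GposIndex, its attributes, and the half-open [start, stop) convention are assumptions made for illustration; they are not part of the original answer.

```python
from bisect import bisect_left


class GposIndex:
    """Hypothetical helper: look up list positions (ids) by genetic position."""

    def __init__(self, gpos_values):
        # Sort (gpos, id) pairs once so range queries can use binary search.
        pairs = sorted((g, i) for i, g in enumerate(gpos_values))
        self._gpos = [g for g, _ in pairs]
        self._ids = [i for _, i in pairs]

    def __getitem__(self, x):
        if isinstance(x, slice):
            if x.step is not None:
                raise ValueError("step is not supported for gpos ranges")
            # Assumed convention: half-open interval start <= gpos < stop.
            lo = 0 if x.start is None else bisect_left(self._gpos, x.start)
            hi = len(self._gpos) if x.stop is None else bisect_left(self._gpos, x.stop)
            return self._ids[lo:hi]
        # Scalar lookup: exact match on the gpos value.
        pos = bisect_left(self._gpos, x)
        if pos < len(self._gpos) and self._gpos[pos] == x:
            return self._ids[pos]
        raise KeyError(x)
```

With this sketch, GposIndex([0.9, 0.1, 0.5, 0.3])[0.2:0.6] returns the ids ordered by gpos. Keeping a sorted list alongside the ids also avoids relying on exact float keys for range queries, although the scalar lookup still compares floats exactly.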

If you want to use this fancy indexing not on the Coords object, but in the self.r dictionary, I think the easiest would be to create a FancyIndexDict that is a subclass of dict, modify its __getitem__ method, and then have self.r be a FancyIndexDict, not a dict.
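
A minimal sketch of such a dict subclass might look like the following. The half-open [start, stop) range and the choice to return matching ids in sorted order are assumptions for illustration, and the linear scan over all keys is used only to keep the example short; for large n, a sorted key list with binary search would likely be preferable.

```python
class FancyIndexDict(dict):
    """Sketch: a dict mapping gpos -> id that also accepts slices of gpos."""

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Assumed convention: half-open interval start <= gpos < stop.
            lo = float("-inf") if key.start is None else key.start
            hi = float("inf") if key.stop is None else key.stop
            # Linear scan over all keys; returns the matching ids, sorted.
            return sorted(v for k, v in self.items() if lo <= k < hi)
        # Anything that is not a slice behaves like a normal dict lookup.
        return super().__getitem__(key)
```

In Coords.__init__, self.r = dict() would then become self.r = FancyIndexDict(), after which c.r[0.2:0.4] is routed through the slice branch instead of a plain dict lookup.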
