将特殊字符串转换为整数的方法 [英] Ways to convert special-purpoes-strings to Integers

查看:102
本文介绍了将特殊字符串转换为整数的方法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要具有键-值对的内存中数据结构(价值400 MB的数据)。我对键有以下约束:

I need to have a in-memory data-structure of Key-Value pair (400 MB worth of data). I have following constraints on keys:


  1. 这两个键和值分别是长度为256和1024
    的文本字符串。

  2. 任何密钥通常看起来像k1k2k3k4k5,每个k(i)本身就是4-8个字节的字符串。某些k(i)可能存在或可能不存在。

  3. 每个k(i)都有6-8种可能性。但是k3和k4有256000种可能性。

  4. 一个人可能会用prefix_key迭代DS。 DS应该为此操作进行优化。此操作分配一个迭代器,即迭代整个DS,并返回与prefix_key匹配的键值列表(例如,如上定义的 k1k2k3。*,k(i))。每次迭代都在此迭代器(列表)上进行迭代。释放迭代器可以释放列表。

  1. Both key and values are text strings of length 256 and 1024 respectively.
  2. Any key generally looks like k1k2k3k4k5, each k(i) being 4-8 byte string in itself. Some k(i) may or may not be there in the keys.
  3. Every k(i) has 6-8 possibilities. However k3 and k4 have 256000 possibilities.
  4. One might iterate the DS with prefix_key. DS should be optimized for this operation. This operation allocates an iterator i.e it iterates the whole DS and returns list of key-values that match prefix_key (e.g. "k1k2k3.*", k(i) defined as above). Every iteration iterates on this iterator(list). Freeing the iterator frees the list.

使用DS来获取字符串键会使键比较过于昂贵。这样就排除了DS的某些选项(哈希,B +树)。

Coming up with DS for string keys makes key comparisons too expensive. And thus certain option for the DS (Hash, B+Tree) get ruled out.

我的问题是我们如何创造性地将字符串键转换为整数键?具有以下属性:

My question is how creatively can we convert String keys to integer keys?The solution needs to have following property:

对于键模式 k1k2k3。*,它应该对整数的上下限进行加权,以便基于这些界限,只有少数几个条目可以在DS中查询。

For a key pattern "k1k2k3.*" it should genarate upper and lower bound on integers so that based on these bounds only handful number of entries are looked up in the DS.

我是在

I am asking this question in context of solution towards this

推荐答案

每个k(i)有6-8种可能性。但是k3和k4有256000种可能性。

Every k(i) has 6-8 possibilities. However k3 and k4 have 256000 possibilities.

如果可以在k1 k2 k3 k4 k5中拆分键,则可以这样编码:

If you can split key in k1 k2 k3 k4 k5, you could encode it like this:

 3 bits for k1  
 3 bits for k2  
18 bits for k3  
18 bits for k4  
 3 bits for k5

,这产生了45位。
因此,您可以将密钥简化为0到2 ^ 45-1之间的整数。
这个接缝特别多,尤其是如果您仅使用k3和k4的一些可能值。

this makes 45bits. So you can boil down your key to an integer between 0 and 2^45-1. This seams to be a lot especially if you only use a few of the possible values for k3 and k4.

所以我将k1的6位k2可以精确地映射到索引,而不是依赖于k3 k4的密度,k3和k4可以采用某种树结构,而不是k5可以精确地映射到索引。

So I would take the 6 bits of k1 k2 for a exact mapping to an index and than depending how dense the k3 k4 is, some kind of tree structue for k3 and k4 and than an exact mapping to an index for k5 again.

这篇关于将特殊字符串转换为整数的方法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆