存储百万电话号码 [英] Storing 1 million phone numbers

查看:159
本文介绍了存储百万电话号码的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

什么是最有效的方式,内存的角度来看,存储百万的电话号码?

What is the most efficient way, memory-wise, to store 1 million phone numbers?

显然,这是在谷歌的面试问题,请给你的想法。

Apparently this is an interview question at Google, please give your ideas.

推荐答案

如果记忆是我们最大的考虑因素,那么我们就需要存储的数量在所有的,只是我和i + 1之间的增量。

If memory is our biggest consideration, then we don't need to store the number at all, just the delta between i and i+1.

现在,如果号范围从200 0000 - 999 9999,则有7999999可能的电话号码。因为我们有100万的数字,并且如果我们假设它们是均匀分布的,我们有E的预期距离= n_i + 1 - n_i〜连续数字n_i和n_i + 1之间8(3比特)。因此,对于一个32位int,我们可能最多存储10个连续的偏移量(〜400KB最佳的总内存占用),但它有可能,我们还会有一些情况下,我们需要一个偏移大于8(也许我们有400,或1500 ??)。在这种情况下,我们可以简单地保留在第一2位整型的作为标题,它告诉我们什么帧大小,我们使用读它存储各比特。例如,也许我们使用:00 = 3×10,01 = 5×6,10 = 7x4,11 = 1 * 30

Now if numbers range from 200 0000 - 999 9999, then there are 7,999,999 possible phone numbers. Since we have 1-million numbers, and if we assume that they are uniformly distributed, we have an Expected distance of E = n_i+1 - n_i ~ 8 (3 bits) between sequential numbers n_i and n_i+1. So for a 32-bit int we could potentially store up to 10 sequential offsets (~400kb optimal total memory footprint), however its likely that we'll have some cases where we need an offset greater than 8 (Maybe we have 400, or 1500??). In which case, we can simply reserve the first 2 bits of the int as a header which tells us what frame size we use to read the bits it stores. For example, maybe we use: 00 = 3x10, 01 = 5x6, 10 = 7x4, 11 = 1*30.

这篇关于存储百万电话号码的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆