Saving IP addresses in MongoDB

Question

Currently, to save an IP address, I convert it to a number and store it in the collection. I am doing this for logging purposes, which means I want to store the information as fast as possible and in the smallest amount of space.

I will rarely use it for querying.

My thoughts:

  • Storing it as a string is definitely inefficient.
  • Storing it as four separate numbers would be slower and would take more space.

Nonetheless, I think this is an adequate method, but is there a better one for my purpose?

Recommended answer

Definitely save IP addresses as numbers, if you don't mind the small amount of extra work it takes, especially if you need to query the addresses and you have large tables/collections.

Here's why:

Storage

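  • An IPv4 address is 4 bytes if stored as an unsigned integer.
  • An IPv4 address varies from 10 bytes to 18 bytes when written out as a string in dotted octet form. (Let's assume the average is 14 bytes.)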

That is 7-15 bytes for the characters, plus 2-3 bytes of overhead if you're using a variable-length string type; the exact overhead depends on the database you're using. If only a fixed-length string representation is available, then you must use a 15-character fixed-width field.
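The character counts above can be checked with a short sketch. It uses Python's standard `ipaddress` module; the helper name `ipv4_sizes` is made up for illustration, and it counts raw bytes only, ignoring whatever per-field overhead a particular database adds.

```python
import ipaddress

def ipv4_sizes(text):
    """Return (string_bytes, integer_bytes) for one dotted IPv4 address.

    Hypothetical helper: counts raw bytes only, not database overhead.
    """
    addr = ipaddress.IPv4Address(text)
    string_bytes = len(text.encode("ascii"))  # characters only, no length prefix
    integer_bytes = len(addr.packed)          # packed form is always 4 bytes
    return string_bytes, integer_bytes

for sample in ("1.1.1.1", "192.168.1.50", "255.255.255.255"):
    print(sample, ipv4_sizes(sample))
```

The shortest and longest dotted forms give the 7 and 15 character bounds quoted above, while the packed form stays at 4 bytes.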

Disk storage is cheap, so that's not a factor in most use cases. Memory, however, is not as cheap, and if you have a large table/collection and you want to do fast queries, then you need an index. The 2-3x storage penalty of string encoding drastically reduces the number of records you can index while still keeping the index resident in memory.

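  • An IPv6 address is 16 bytes if stored as an unsigned integer. (Likely as multiple 4- or 8-byte integers, depending on your platform.)
  • An IPv6 address ranges from 6 bytes to 42 bytes when encoded as a string in abbreviated hex notation.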

On the low end, a loopback address (::1) is 3 bytes plus the variable-length string overhead. On the high end, an address like 2002:4559:1FE2:1FE2:4559:1FE2:4559:1FE2 uses 39 bytes plus the variable-length string overhead.
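A small sketch of the same comparison for IPv6, again using the standard `ipaddress` module. The helper name `ipv6_sizes` is an assumption for illustration, and string overhead is left out.

```python
import ipaddress

def ipv6_sizes(addr):
    """Return (compressed_string_bytes, packed_bytes) for an IPv6Address.

    Hypothetical helper: raw bytes only, no variable-length string overhead.
    """
    return len(addr.compressed), len(addr.packed)

loopback = ipaddress.IPv6Address("::1")
full = ipaddress.IPv6Address("2002:4559:1FE2:1FE2:4559:1FE2:4559:1FE2")

print(ipv6_sizes(loopback))  # shortest abbreviated form vs. 16 packed bytes
print(ipv6_sizes(full))      # no zero runs to compress vs. 16 packed bytes
```

Whatever the textual length, the packed form is always 16 bytes, which is what makes the integer encoding predictable.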

Unlike with IPv4, it's not safe to assume the average IPv6 string length will be the mean of 6 and 42, because addresses with a significant number of consecutive zeroes make up a very small fraction of the overall IPv6 address space. Only some special addresses, like loopback and autoconf addresses, are likely to be compressible in this way.

Again, this is a storage penalty of >2x for string encoding versus integer encoding.

Network math

Do you think routers store IP addresses as strings? Of course they don't.

If you need to do network math on IP addresses, the string representation is a hassle. For example, if you want to write a query that searches for all addresses on a specific subnet ("return all records with an IP address in 10.7.200.104/27"), you can easily do this by masking an integer address with an integer subnet mask. (Mongo doesn't support this particular query, but most RDBMS do.) If you store addresses as strings, then your query will need to convert each row to an integer and then mask it, which is several orders of magnitude slower. (Bitwise masking for an IPv4 address can be done in a few CPU cycles using 2 registers. Converting a string to an integer requires looping over the string.)
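The masking step described above can be sketched in a few lines. This is an illustration in application code, not a database query; the function names `to_int` and `in_subnet` are invented for the example.

```python
import ipaddress

def to_int(text):
    """Convert a dotted IPv4 string to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(text))

def in_subnet(addr, network, prefix_len):
    """Return True if the integer address `addr` lies in network/prefix_len."""
    # Build a 32-bit mask with `prefix_len` leading one bits.
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    return (addr & mask) == (network & mask)

base = to_int("10.7.200.104")
print(in_subnet(to_int("10.7.200.100"), base, 27))  # same /27 block
print(in_subnet(to_int("10.7.200.128"), base, 27))  # next /27 block
```

With a /27 prefix, the mask keeps the top three bits of the last octet, so 10.7.200.104/27 covers 10.7.200.96 through 10.7.200.127.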

Similarly, range queries ("return all records between 192.168.1.50 and 192.168.50.100") with integer addresses will be able to use indexes, whereas range queries on string addresses will not, because strings sort lexicographically rather than numerically.
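A minimal sketch of why the integer form suits range queries. The filter below uses MongoDB's `$gte` and `$lte` comparison operators, but the field name `ip` and the helper `range_filter` are assumptions; the sketch only builds the filter document and does not talk to a server.

```python
import ipaddress

def to_int(text):
    """Convert a dotted IPv4 string to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(text))

def range_filter(field, low, high):
    """Build a MongoDB-style filter for low <= field <= high (hypothetical field)."""
    return {field: {"$gte": low, "$lte": high}}

lo = to_int("192.168.1.50")
hi = to_int("192.168.50.100")
query = range_filter("ip", lo, hi)

# 192.168.9.1 is numerically inside the range, but string comparison disagrees
# because "9" sorts after "5" character by character.
candidate = "192.168.9.1"
in_range_numeric = lo <= to_int(candidate) <= hi
in_range_string = "192.168.1.50" <= candidate <= "192.168.50.100"
print(query, in_range_numeric, in_range_string)
```

The numeric comparison matches the intended meaning of the range, while the plain string comparison excludes an address that belongs in it.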

Bottom line

It takes a little more work, though not much (there are a million aton() and ntoa() functions out there). If you're building something serious and solid, and you want to future-proof it against new requirements and the possibility of a large dataset, you should store IP addresses as integers, not strings.
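As one example of such a conversion pair, here is a minimal sketch built on Python's standard `ipaddress` module. The names `aton` and `ntoa` simply echo the functions mentioned above; the `version` parameter is an assumption added so the same helper can render IPv6 values.

```python
import ipaddress

def aton(text):
    """Convert an IPv4 or IPv6 address string to an integer."""
    return int(ipaddress.ip_address(text))

def ntoa(number, version=4):
    """Convert an integer back to an address string (IPv4 by default)."""
    if version == 4:
        return str(ipaddress.IPv4Address(number))
    return str(ipaddress.IPv6Address(number))

stored = aton("192.168.1.50")
print(stored, ntoa(stored))
```

The integer is what would go into the collection; `ntoa` is only needed when the value has to be shown to a person.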

If you're doing something quick and dirty and don't mind the possibility of remodeling in the future, then use strings.

For the OP's purpose, if you are optimizing for speed and space and you don't expect to query the data often, then why use a database at all? Just print IP addresses to a file. That would be faster and more storage-efficient than storing them in a database (with its associated API and storage overhead).
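A minimal sketch of the file-based alternative, assuming one address per line in a plain text file. The function names `log_ip` and `read_ips` and the file layout are illustrative choices, not something the answer specifies.

```python
def log_ip(path, ip_text):
    """Append one IP address per line to a plain text log file."""
    with open(path, "a", encoding="ascii") as log:
        log.write(ip_text + "\n")

def read_ips(path):
    """Read the logged addresses back, skipping blank lines."""
    with open(path, "r", encoding="ascii") as log:
        return [line.strip() for line in log if line.strip()]
```

Appending is a single sequential write, which is where the speed advantage over a database insert would come from. Reading the addresses back is a full scan, which fits a log that is rarely queried.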
