在MySQL中为搜索引擎存储记录时处理日期时间的最佳方法? [英] Best way to deal with datetime when storing records for a search engine in MySQL?

查看:98
本文介绍了在MySQL中为搜索引擎存储记录时处理日期时间的最佳方法?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一张桌子:

URL | last_crawled | worker_id | worker_assign_date

该网址显然是任何网址. last_crawled是一个日期字段,用于存储上次对该字段进行爬网的时间.工作人员ID是当前分配给此URL的工作人员的ID.而worker_assign_date是将URL分配给搜寻器的时间.这样一来,如果工作人员X花很长时间来爬网URL,我将把URL分配回另一个工作人员.

The url is obviously any url. The last_crawled is a date field that stores when the field was last crawled. Worker ID is a id for a worker that is currently assigned to this URL. And the worker_assign_date is when the URL was assigned to a crawler. This is so that if worker X takes too long to crawl a URL, I will just assign the URL back to another worker.

我的问题是,看到我将按worker_assign_date进行很多排序,并且last_crawled进行了很多排序,那么该类型应该是什么?

My question is, seeing as I will be sorting by worker_assign_date a lot and last_crawled a lot, what should that type be?

第一要务是速度,排序速度更快吗?是Int还是DateTime?

First priority is speed, which sorts faster? Int or DateTime?

第二个优先级是大小.一个int占用的空间是否比DateTime少或更多?

Second priority is size. Does an int take up less space than DateTime or more?

请注意:互联网大约有50亿页.该数据库将容纳所有500万个URL并进行更新等.我将使用InnoDB,因此我只能锁定单行.

Please note: The internet has roughly 5 billion pages. This database is to hold all 5 million URLs and make updates etc. I will use InnoDB so I can lock single rows only.

更新

在2099年之前可以工作"的unix时间戳为4099770061,因此将其存储为INT(10)绰绰有余.根据mysql 文档所述,这将占用4个个字节.日期时间字段将占用8个字节.因此,时间戳似乎至少更短.是真的吗?

A unix timestamp, that will "work" until 2099, would be 4099770061, so storing it as INT(10) would be more than sufficient. According to the mysql documentation, that would take up 4 bytes. A datetime field will take up 8 bytes. So, it seems that timestamps are at least smaller. Is that true?

然后,最后一个问题仍然存在,在排序过程中哪个更快?有什么区别吗?

Also, the last question then remains, which is faster during a sort? Is there any difference?

推荐答案

我找到了一篇文章,可以解决您的确切问题. Int明显更快.这个线程中有一个人进行了性能测试,并在日期时间上大力支持int.

I found an article that addresses your exact question. Int is significantly faster. This thread has a guy that ran performance tests that are hugely in favor of int over datetime.

MySQL Integer vs DateTime索引

这篇关于在MySQL中为搜索引擎存储记录时处理日期时间的最佳方法?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆