badoo.com user search - how can this be done?

Question

Badoo.com has 56,000,000 user profiles. Profiles can be searched by sex, age, hair color, zodiac sign, education and so on, plus distance from my hometown, online status and date of registration. So far this seems doable: even though it is a heavy query against huge tables (56 million members), the results could be cached in a general way.

The interesting part is that they also have an individual "exclude list" (with every profile you look at, you can say that you don't want to meet this person). Plus, your friends don't show up either.

The second interesting part is the OR clauses of the query. You can search for someone who is a woman, 25-35, blonde OR brunette, non-smoker, hetero OR bisexual, Virgo OR Gemini OR Cancer, living within a 50 km radius of Paris, who is not your friend, is not on your exclude list, and is online now. Many ORs, a heavy query, sort options, and no way of caching or pre-calculating all this, yet the search returns 11,298 results in milliseconds.

How do they do such a thing with 56 million records and 250K people using it at the same time? Full-text search indexes? Relational databases? Key-value stores? Does anyone have an idea about the concept or architecture?

Recommended answer

They are most likely built using an inverted indexing technology like Lucene or Sphinx. If you are looking to build a solution, my recommendation would be Apache Solr (a search server built on Lucene). It is very popular, has an active OSS community, and is used by sites such as Netflix and CNET.
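To make the inverted-index idea concrete, here is a minimal in-memory sketch in Python. It is not how Badoo, Lucene, or Solr are actually implemented; the class name `ProfileIndex`, the field names, and the sample profiles are all hypothetical. It only shows why OR within a field, AND across fields, and a per-user exclude list reduce to cheap set operations once each attribute value maps to a posting list of profile IDs. Age ranges and the distance filter are left out; real engines handle those with numeric and spatial index structures.

```python
from collections import defaultdict


class ProfileIndex:
    """Toy inverted index: (field, value) -> set of profile IDs (hypothetical)."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.all_ids = set()

    def add(self, profile_id, **attrs):
        # Register the profile under every (field, value) pair it has.
        self.all_ids.add(profile_id)
        for field, value in attrs.items():
            self.postings[(field, value)].add(profile_id)

    def search(self, criteria, exclude=frozenset()):
        """criteria maps field -> accepted values.

        Values of one field are ORed (set union); different fields are
        ANDed (set intersection). IDs in `exclude` (friends, exclude
        list) are subtracted at the end.
        """
        result = None
        for field, values in criteria.items():
            matches = set()
            for value in values:
                matches |= self.postings.get((field, value), set())
            result = matches if result is None else result & matches
            if not result:
                return set()  # no profile can satisfy the remaining fields
        if result is None:
            result = set(self.all_ids)  # no criteria: everything matches
        return result - set(exclude)


index = ProfileIndex()
index.add(1, sex="f", hair="blonde", zodiac="virgo", online=True)
index.add(2, sex="f", hair="brunette", zodiac="gemini", online=True)
index.add(3, sex="f", hair="red", zodiac="cancer", online=True)
index.add(4, sex="m", hair="blonde", zodiac="virgo", online=True)
index.add(5, sex="f", hair="blonde", zodiac="cancer", online=False)

hits = index.search(
    {
        "sex": ["f"],
        "hair": ["blonde", "brunette"],
        "zodiac": ["virgo", "gemini", "cancer"],
        "online": [True],
    },
    exclude={2},  # e.g. a friend or someone on the exclude list
)
print(hits)
```

In this sketch the exclude list is applied as a final set difference, so it does not prevent the attribute-only part of the query from being shared or cached across users.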
