Strategy for heavy workloads that can't happen in real time? (Webapp, User-matching, Caching)


Question

I'm currently writing a web app in Ruby on Rails that matches users based on the questions they have answered. A user can search for a range of users; the system matches the searcher against every user in that range and returns an ordered list with the highest match first.

The problem is that this operation is so heavy that I don't think I can do it on the fly. I've already optimized my SQL as far as I can and implemented my matching algorithm entirely in one SQL query, which takes about 8.2 ms to calculate the match percentage between two users (on my local machine). But when a search covers 5,000 users, Rails takes that array of users, iterates through it, and runs the query 5,000 times, which takes about 50 seconds on my local machine. Could I reduce this by moving to PostgreSQL and turning the query into a stored procedure?

My question now is: what options are there, e.g. background processes or caches, so that the results show up within a few seconds of the user pressing search? Or is that not possible at this scale, so that I have to precompute the matches and store them in a NoSQL store or something similar, since for 50k users there would already be 2.5 billion rows?

Recommended answer


  1. One way is to use a single SQL query. Right now you run one query per user; I mean one query over all of them, so that this single query does the work you currently do in the loop over the users.
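
As a rough illustration of the single-query idea, the sketch below builds one set-based statement that scores every candidate in a single round trip. The `answers(user_id, question_id, choice)` table and the scoring rule (share of shared questions answered identically) are assumptions made up for this example; the asker's real schema and matching algorithm are not shown in the question, and `batch_match_sql` is a hypothetical helper name.

```ruby
# Hypothetical schema: answers(user_id, question_id, choice).
# The score below (percentage of shared questions answered identically)
# is only a stand-in for the asker's real matching algorithm.
def batch_match_sql(searcher_id, candidate_ids)
  # Coerce to integers so the interpolated values cannot carry SQL.
  searcher = Integer(searcher_id)
  candidates = candidate_ids.map { |id| Integer(id) }
  raise ArgumentError, "no candidates" if candidates.empty?

  <<~SQL
    SELECT them.user_id,
           100.0 * SUM(CASE WHEN them.choice = me.choice THEN 1 ELSE 0 END) / COUNT(*) AS match_pct
    FROM answers AS me
    JOIN answers AS them ON them.question_id = me.question_id
    WHERE me.user_id = #{searcher}
      AND them.user_id IN (#{candidates.join(", ")})
    GROUP BY them.user_id
    ORDER BY match_pct DESC
  SQL
end

# One statement for the whole candidate set instead of one per candidate.
puts batch_match_sql(1, [2, 3, 4])
```

With thousands of candidates it would probably be cleaner to express the search range itself as a condition or join in the query rather than passing an explicit ID list, but that depends on how the range is defined.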

You can also cache in the database: compute the results for each user once a day and store them. You don't need a NoSQL data store for this, just a cron job that writes the results to the database.
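
A minimal sketch of what such a daily job might do, in plain Ruby. Everything here is hypothetical: `precompute_matches` and `stored_match` are made-up names, the hash stands in for a database table, and the block stands in for the real matching query. It also assumes the match score is symmetric (A to B equals B to A), which the question does not state; under that assumption each pair is stored once under a canonical key.

```ruby
# Hypothetical nightly job body. The block passed in stands in for the
# real matching query; `store` stands in for a database table keyed by
# user pair. Assumes the score is symmetric, so each pair is stored once.
def precompute_matches(user_ids, store, now: Time.now)
  user_ids.combination(2) do |a, b|
    low, high = [a, b].minmax # canonical key: lower ID first
    store[[low, high]] = { pct: yield(low, high), computed_at: now }
  end
  store
end

# Look up a stored score regardless of the order the IDs are given in.
def stored_match(store, a, b)
  store[[a, b].minmax]
end
```

If the score really is symmetric, storing unordered pairs means n(n-1)/2 rows instead of n²: for 50,000 users that is 1,249,975,000 rows rather than 2.5 billion, which is still large enough that the single-query approach may be worth trying first.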

You could also store the results in memcached. The cache would be shared between the Rails instances of your web app, so a single copy is available to all of them. I would access the results through a method that checks expiration conditions to decide whether the data needs to be refreshed.
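
The "method that checks expiration conditions" could look roughly like the sketch below. `ExpiringCache` is a hypothetical in-process stand-in written for illustration; unlike memcached it is not shared between Rails instances. In a Rails app, `Rails.cache.fetch(key, expires_in: ...)` with a block follows the same read-or-recompute pattern against whatever cache store is configured.

```ruby
# Hypothetical in-process stand-in for a shared cache such as memcached.
# fetch returns the cached value while it is fresh; otherwise it runs the
# block, stores the result with a timestamp, and returns it.
class ExpiringCache
  def initialize(ttl_seconds, clock: -> { Time.now })
    @ttl = ttl_seconds
    @clock = clock # injectable so expiry can be tested without sleeping
    @entries = {}
  end

  def fetch(key)
    entry = @entries[key]
    now = @clock.call
    return entry[:value] if entry && now - entry[:stored_at] < @ttl

    value = yield # the expensive matching work goes here
    @entries[key] = { value: value, stored_at: now }
    value
  end
end

cache = ExpiringCache.new(24 * 60 * 60)
results = cache.fetch("matches:user:1") { [[2, 87.5], [3, 60.0]] } # placeholder data
```

The block only runs on a miss or after the entry has expired, so the slow matching work is paid at most once per time-to-live window for each key.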
