How to filter a huge list of IDs from Solr at runtime

Question

I have a Solr index of products. I need to serve a customized list of products to each customer, which means excluding some specific products per customer. Currently I store this relationship between customers and excluded products in a SQL database and then filter the products in Solr using a terms query. Is there a way to store this relationship in Solr itself, so that I don't have to compute the exclusion list from SQL first on every request?

I am looking for something very similar to what Elasticsearch offers with its terms query: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-terms-query.html

Possible ways I could think of doing this in Solr:


  1. Keep a list of customers in the products index itself and filter on that. But this would be a real pain if I had to reindex all the documents, and the list can be huge.

Another way I could think of is to maintain a separate core that holds documents per customer and excluded product_id, and to perform a join using {!join} to filter out products for a customer. Is that a scalable solution?

What would be the ideal approach for storing this kind of data in Solr?

Recommended answer

Are there any performance issues with the SQL DB? It is perfectly fine to query the DB, get the IDs, and send them to Solr. That way you avoid extra complexity and data duplication. Even if the relationship lived in Solr, you would still have to do some work to collect those IDs and send them along with the product query.

But to answer your question: yes, you could indeed store the excluded product IDs per customer in a separate index. You would use a multi-valued field and modify it with atomic updates. If you do that, keep the index schema simple, with no analyzer on the ID field (just use the string type, without any tokenizer or filter).
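As a rough illustration of the atomic updates mentioned above, the sketch below builds the JSON body for one update request. The field name excluded_product_ids and the customer_<id> document ID scheme are assumptions made for this example, not something given in the question; the add, remove and set modifiers are the standard Solr atomic update operations. The body would be posted as JSON to the update handler of the exclusions core.

```python
import json

# Hypothetical names, chosen only for this sketch.
EXCLUSION_FIELD = "excluded_product_ids"  # multi-valued string field
ALLOWED_OPS = {"add", "remove", "set"}    # Solr atomic update modifiers


def build_exclusion_update(customer_id, op, product_ids):
    """Return the body of an atomic update for one customer's exclusion list.

    op is "add" (append values), "remove" (drop values) or
    "set" (replace the whole list).
    """
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported atomic update operation: {op}")
    doc = {
        "id": f"customer_{customer_id}",  # assumed ID scheme
        EXCLUSION_FIELD: {op: list(product_ids)},
    }
    # Solr accepts a JSON array of documents on its update handler.
    return [doc]


if __name__ == "__main__":
    body = build_exclusion_update("42", "add", ["p1", "p2"])
    print(json.dumps(body))
```

Only the changed IDs travel over the wire, so a customer's list can be maintained without resending it in full.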

You do not need a Solr join query. You only have to look up the product IDs for the customer (1st query), join them into a comma-separated list, and then run the terms query against the products index with the IDs you retrieved (2nd query).
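The two-query flow can be sketched as plain parameter builders, with no HTTP client involved. Everything named here is an assumption for illustration: the excluded_product_ids field, the customer_<id> document ID, the products' id field, and the exact shape of the negated filter. The {!terms} parser and $param dereferencing are standard Solr features, but the negated form should be verified against your Solr version.

```python
def exclusion_lookup_params(customer_id):
    """1st query: fetch the customer's exclusion document (assumed schema)."""
    return {
        "q": f'id:"customer_{customer_id}"',
        "fl": "excluded_product_ids",
        "rows": 1,
    }


def extract_excluded_ids(solr_response):
    """Pull the ID list out of a standard Solr JSON response."""
    docs = solr_response.get("response", {}).get("docs", [])
    if not docs:
        return []
    return list(docs[0].get("excluded_product_ids", []))


def product_query_params(user_query, excluded_ids):
    """2nd query: search products, filtering out the excluded IDs."""
    params = {"q": user_query}
    if excluded_ids:
        # Assumed filter form: match everything except the listed IDs.
        # The CSV is passed in its own parameter and dereferenced via $excluded.
        # Assumes the IDs themselves contain no commas.
        params["fq"] = "*:* -{!terms f=id v=$excluded}"
        params["excluded"] = ",".join(excluded_ids)
    return params


if __name__ == "__main__":
    fake_response = {"response": {"docs": [{"excluded_product_ids": ["p1", "p2"]}]}}
    ids = extract_excluded_ids(fake_response)
    print(product_query_params("shoes", ids))
```

With a very large list, sending the second request as a POST avoids URL length limits.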
