How to do a join in Elasticsearch -- or at the Lucene level


Question





What's the best way to do the equivalent of an SQL join in Elasticsearch?

I have an SQL setup with two large tables: Persons and Items. A Person can own many items. Both Person and Item rows can change (i.e. be updated). I have to run searches which filter by aspects of both the person and the item.

In Elasticsearch, it looks like you could make Person a nested document of Item, then use has_child.

But: if you then update a Person, I think you'd need to update every Item they own (which could be a lot).

Is that correct? Is there a nice way to solve this query in Elasticsearch?

Solution

As already mentioned, the way to go is parent/child. Nested documents are extremely performant, but to update them you need to re-submit the whole structure (parent + nested documents). Although nested documents are implemented internally as separate Lucene documents, they are neither visible nor directly accessible. In fact, when using nested documents you need to use the dedicated queries to access them (nested query, nested filter, nested facet, etc.).
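To make the nested trade-off concrete, here is a minimal sketch in Python that only builds request bodies as plain dicts. It assumes a recent Elasticsearch version (typeless mappings and the `nested` query; the answer above predates these and mentions the older filter/facet API), and the field names `name`, `items`, `title` and `color` are hypothetical, chosen for illustration.

```python
import copy

# Assumed layout: one document per person, with the items embedded as nested objects.
# All field names here are illustrative, not taken from the question.
NESTED_MAPPING = {
    "mappings": {
        "properties": {
            "name": {"type": "keyword"},
            "items": {
                "type": "nested",
                "properties": {
                    "title": {"type": "text"},
                    "color": {"type": "keyword"},
                },
            },
        }
    }
}


def nested_search_body(person_name, item_color):
    """Build a search body matching persons by name who own an item of the given color."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"name": person_name}},
                    {
                        # Nested fields are only reachable through the nested query.
                        "nested": {
                            "path": "items",
                            "query": {"term": {"items.color": item_color}},
                        }
                    },
                ]
            }
        }
    }


def with_updated_item(person_doc, index, changes):
    """Return the complete person document that must be re-indexed after one item changes."""
    updated = copy.deepcopy(person_doc)
    updated["items"][index].update(changes)
    return updated


if __name__ == "__main__":
    person = {"name": "alice", "items": [{"title": "bike", "color": "red"}]}
    print(nested_search_body("alice", "red"))
    # Changing a single item still means sending the whole person document again.
    print(with_updated_item(person, 0, {"color": "blue"}))
```

The point of `with_updated_item` is that a change to one item yields a whole new person document; with many items per person, every item edit re-submits all of them.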

On the other hand, parent/child lets you have separate documents that refer to each other and can be updated independently. It has a cost in performance and memory, but it is far more flexible than nested documents.
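A comparable sketch for parent/child, again as plain Python dicts. It assumes the `join` field type found in recent Elasticsearch versions (older versions, including the one this answer was written for, configured the relation through a `_parent` mapping instead). The field name `relation` and the other field names are hypothetical.

```python
# Assumed layout: persons and items live in one index, linked by a join field.
JOIN_MAPPING = {
    "mappings": {
        "properties": {
            "name": {"type": "keyword"},
            "color": {"type": "keyword"},
            "relation": {"type": "join", "relations": {"person": "item"}},
        }
    }
}


def person_doc(name):
    """Body of a parent document."""
    return {"name": name, "relation": "person"}


def item_doc(color, owner_id):
    """Body of a child document pointing at its parent.

    The index request must also be routed to the parent's shard
    (routing=owner_id), which is not part of the body built here.
    """
    return {"color": color, "relation": {"name": "item", "parent": owner_id}}


def persons_owning(item_color):
    """Search body: persons that have at least one matching item."""
    return {
        "query": {
            "has_child": {
                "type": "item",
                "query": {"term": {"color": item_color}},
            }
        }
    }


def items_owned_by(person_name):
    """Search body: items whose owner matches the person filter."""
    return {
        "query": {
            "has_parent": {
                "parent_type": "person",
                "query": {"term": {"name": person_name}},
            }
        }
    }


if __name__ == "__main__":
    print(person_doc("alice"))
    print(item_doc("red", "person-1"))
    print(persons_owning("red"))
    print(items_owned_by("alice"))
```

With this layout, updating a person touches only the person document, and updating an item touches only that item, which is the flexibility the answer describes.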

As mentioned in this article, though, the fact that Elasticsearch helps you manage relations doesn't mean that you must use those features. In many complex use cases it is simply better to have some custom logic in the application layer that handles relations. In fact, there are limitations with parent/child too: for instance, you can never get back both parent and children at the same time, whereas nested documents don't let you get back only the matching children (for now).
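One possible shape for such application-layer logic is a two-step lookup: first find the matching persons, then filter items by the returned IDs. The sketch below is only an illustration under assumptions: persons and items are taken to live in separate indices, each item carries a hypothetical `owner_id` field, and the helper names are invented for this example. It builds request bodies and parses a response dict without contacting a cluster.

```python
def person_ids_body(person_filter, size=1000):
    """Step 1 body: fetch only the IDs of persons matching the filter."""
    return {
        "_source": False,  # IDs are enough; skip the document bodies
        "size": size,
        "query": {"bool": {"filter": [person_filter]}},
    }


def extract_ids(search_response):
    """Pull the document IDs out of a search response dict."""
    return [hit["_id"] for hit in search_response["hits"]["hits"]]


def items_body(owner_ids, item_filter):
    """Step 2 body: items matching the item filter and owned by one of the persons."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"terms": {"owner_id": owner_ids}},
                    item_filter,
                ]
            }
        }
    }


if __name__ == "__main__":
    # A hand-written response standing in for the result of step 1.
    fake_response = {"hits": {"hits": [{"_id": "p1"}, {"_id": "p2"}]}}
    ids = extract_ids(fake_response)
    print(person_ids_body({"term": {"city": "Paris"}}))
    print(items_body(ids, {"term": {"color": "red"}}))
```

This keeps persons and items fully independent for updates, at the price of two round trips; when step 1 matches a very large number of persons, the ID list would likely need to be split into batches.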
