Hybrid search and indexing: words and token metadata in Solr

Views: 68 Published: 2020/5/9 2:01:23 solr metadata token

Problem description

I am building a set of plugins for Solr to enable a "hybrid" search that matches either words or token (not document!) metadata (specific ID numbers). The same word may have different ID numbers in different contexts; the IDs are generated at indexing time by an external application. For example, "run" may have 12345 in one case and 54321 in another, depending on the context. The ID numbers should carry more weight in the search. (They will be provided in the query at search time by the same external application.)
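To make the weighting requirement concrete, here is a minimal Python sketch of how the external application might assemble a query string in which ID matches outweigh plain word matches. It relies only on the `term^boost` syntax of the standard Lucene/Solr query parser. The field names `text` and `token_ids`, the boost value, and the function names are assumptions made for illustration; they are not part of the setup described above.

```python
from typing import List, Optional, Tuple


def escape_phrase(value: str) -> str:
    """Escape the two characters that are special inside a quoted phrase."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_hybrid_query(
    tokens: List[Tuple[str, Optional[int]]],
    word_field: str = "text",       # hypothetical field holding the words
    id_field: str = "token_ids",    # hypothetical field holding the token IDs
    id_boost: float = 5.0,          # arbitrary example boost
) -> str:
    """Build an OR query in which ID clauses carry a boost."""
    clauses = []
    for word, token_id in tokens:
        # Every word contributes an unboosted clause.
        clauses.append(f'{word_field}:"{escape_phrase(word)}"')
        if token_id is not None:
            # A known ID contributes an extra clause with a higher weight.
            clauses.append(f'{id_field}:"{token_id}"^{id_boost:g}')
    return " OR ".join(clauses)


if __name__ == "__main__":
    print(build_hybrid_query([("run", 12345), ("fast", None)]))
```

The resulting string could be sent as the `q` parameter of a normal select request. Whether a separate ID field or a single shared field works better depends on how the IDs end up being indexed.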

I read about custom fields for documents and wondered whether we could store a blob with these IDs there, but I am not sure how to include it in the search.

Or should I just pretend these IDs are "synonyms" (perhaps wrapping them in some kind of unique marking, like [:12345:]) and use Solr's synonym filter factory?
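As a rough illustration of the "synonym" idea, the sketch below shows in plain Python what a synonym-style token stream looks like: the marked ID is emitted at the same position as its word, which in Lucene terms corresponds to a position increment of 0. This is not Solr plugin code. The `Token` type and the function names are hypothetical, and the input is assumed to be (word, ID) pairs already produced by the external application.

```python
from typing import List, NamedTuple, Optional, Tuple


class Token(NamedTuple):
    term: str
    position_increment: int  # 1 = next position, 0 = same position as previous token


def mark_id(token_id: int) -> str:
    """Wrap an ID in the unique marking suggested in the question."""
    return f"[:{token_id}:]"


def stack_id_tokens(annotated: List[Tuple[str, Optional[int]]]) -> List[Token]:
    """Emit each word, followed by its marked ID stacked at the same position."""
    stream: List[Token] = []
    for word, token_id in annotated:
        stream.append(Token(word.lower(), 1))
        if token_id is not None:
            # Increment 0 places the ID "on top of" the word, like a synonym.
            stream.append(Token(mark_id(token_id), 0))
    return stream


if __name__ == "__main__":
    for token in stack_id_tokens([("Run", 12345), ("fast", None)]):
        print(token)
```

With tokens stacked this way, a search for either `run` or `[:12345:]` would hit the same position, so phrase and proximity matching keep working. The analysis chain would still need to preserve the marked token unchanged at both index and query time.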

I am new to Solr, but I have read the relevant documentation, so I think I understand conceptually how it all works. Performance does not matter at this stage; this is a PoC. It looks somewhat similar to "Search different tokens on different fields in Solr", but not exactly. Oh, and I want to tokenise the text myself, too, but that's not an issue.

[Removed the bit about payloads; it is irrelevant here. Sorry about the confusion.]
