Apache Solr string field or text field?

Question

In Apache Solr, why should we always prefer a string field over a text field if both serve the purpose?

How does choosing string or text affect parameters such as index size, index reads, and index creation?

Recommended answer

The string and text fields, as defined by default in the Solr schema, are vastly different.

String stores a word or sentence as an exact string, without tokenization or other analysis. It is commonly useful for exact matches, e.g. for faceting.

Text typically performs tokenization and secondary processing (such as lowercasing). It is useful whenever we want to match part of a sentence.
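To make the difference concrete, here is a minimal Python sketch of the terms each field type might place in the index. It does not use Solr: the function names `string_field_terms` and `text_field_terms` are hypothetical, and the text analysis is assumed to be only a word split followed by lowercasing, which is simpler than a real Solr analyzer chain.

```python
import re


def string_field_terms(value: str) -> list[str]:
    """Mimic a string field: the whole value is kept as one verbatim term."""
    return [value]


def text_field_terms(value: str) -> list[str]:
    """Mimic a simple text field (assumed chain): split into words, then lowercase."""
    return [token.lower() for token in re.findall(r"\w+", value)]


doc = "This is a sample sentence"
print(string_field_terms(doc))  # ['This is a sample sentence']
print(text_field_terms(doc))    # ['this', 'is', 'a', 'sample', 'sentence']
```

The string field yields a single term that only an identical query can match, while the text field yields one term per word, each of which can be matched on its own.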

If the sample "This is a sample sentence" is indexed into both fields, we must search for exactly the text This is a sample sentence to get a hit from the string field, while searching for sample (or even samples, with stemming enabled) may be enough to get a hit from the text field.
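The matching behaviour in this example can be simulated with a short Python sketch. Everything here is an assumption made for illustration: `naive_stem` is a toy stemmer that only strips a trailing "s" (not the algorithm Solr uses), and `matches_text_field` requires every query term to be present, which in Solr would depend on the configured query parser and default operator.

```python
import re


def naive_stem(token: str) -> str:
    """Toy stemmer (assumption): drop a trailing 's' from tokens longer than 3 chars."""
    return token[:-1] if len(token) > 3 and token.endswith("s") else token


def analyze_text(value: str) -> set[str]:
    """Tokenize on word characters, lowercase, then apply the toy stemmer."""
    return {naive_stem(token.lower()) for token in re.findall(r"\w+", value)}


def matches_string_field(indexed: str, query: str) -> bool:
    """A string field only matches when the query equals the stored value exactly."""
    return indexed == query


def matches_text_field(indexed: str, query: str) -> bool:
    """A text field matches when every analyzed query term was indexed."""
    query_terms = analyze_text(query)
    return bool(query_terms) and query_terms <= analyze_text(indexed)


doc = "This is a sample sentence"
print(matches_string_field(doc, "This is a sample sentence"))  # True
print(matches_string_field(doc, "sample"))                     # False
print(matches_text_field(doc, "sample"))                       # True
print(matches_text_field(doc, "samples"))                      # True
```

Because the same analysis is applied to both the indexed value and the query, "samples" and "sample" reduce to the same term and match in the text field.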
