What is the correct Solr fieldType to use for sorting integer values?
Question
I am using Solr 3.6.1. What is the correct field type to use for a Solr sort field containing integer values? I need this field only for sorting and will never do range queries on it. Should I use integer or sint?
I see that in schema.xml the sint type is declared as:
<!-- Numeric field types that manipulate the value into
a string value that isn't human-readable in its internal form,
but with a lexicographic ordering the same as the numeric ordering,
so that range queries work correctly. -->
<fieldType name="sint" class="solr.SortableIntField" sortMissingLast="true" omitNorms="true"/>
while integer is declared as:
<!-- numeric field types that store and index the text
value verbatim (and hence don't support range queries, since the
lexicographic ordering isn't equal to the numeric ordering) -->
<fieldType name="integer" class="solr.IntField" omitNorms="true"/>
The main reason I am asking is that every Solr sort I do on an sint field (I have lots of them declared as dynamic fields) populates the (unconfigurable) Lucene fieldCache. On the stats page (http://HOST:PORT/solr/CORE/admin/stats.jsp), under fieldCache, I see that sint sorts are stored as
org.apache.lucene.search.FieldCache$StringIndex
and integer sorts are stored as
org.apache.lucene.search.FieldCache.DEFAULT_INT_PARSER
which I believe consumes less space?
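A rough back-of-the-envelope sketch of why the integer cache could be smaller. It assumes that a StringIndex entry holds one int ordinal per document plus one String per distinct value, while the int-parser entry holds only one int per document. The per-String size (40 bytes) and the function names are hypothetical, chosen only for illustration; real JVM overhead varies.

```python
INT_BYTES = 4               # size of one Java int
ASSUMED_STRING_BYTES = 40   # assumption: rough JVM cost of one short String


def int_cache_bytes(max_doc: int) -> int:
    """Estimated size of an int-per-document cache entry."""
    return max_doc * INT_BYTES


def string_index_bytes(max_doc: int, unique_values: int,
                       string_bytes: int = ASSUMED_STRING_BYTES) -> int:
    """Estimated size of an ordinal-per-document array plus the distinct Strings."""
    return max_doc * INT_BYTES + unique_values * string_bytes


# Example: 1,000,000 documents with 50,000 distinct values in the sort field.
docs, distinct = 1_000_000, 50_000
print(int_cache_bytes(docs))               # bytes for the int[] style cache
print(string_index_bytes(docs, distinct))  # bytes for the StringIndex style cache
```

Under these assumptions the StringIndex entry is never smaller than the int entry, and the gap grows with the number of distinct values.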
UPDATE: The Solr 3.6.1 schema.xml has int declared as TrieIntField, i.e. as
<fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>
The declarations above were from an older Solr version.
Recommended answer
If you don't need range queries, use "integer", since sorts work correctly on both types.
From the documentation:
Sortable FieldTypes like sint and sdouble are a bit of a misnomer. They are not needed for sorting in the sense described above, but are needed when doing RangeQuery queries. "Sortable", in fact, refers to the notion of making the numbers sort correctly lexicographically as strings. That is, if this is not done, the numbers 1..10 sort lexicographically as 1, 10, 2, 3... Using an sint, however, remedies this. If, however, you don't need to do RangeQuery queries and only need to sort on the field, then just use an int or double or the equivalent appropriate class. You will save yourself time and memory.
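The lexicographic-versus-numeric distinction in the quoted passage can be illustrated with a minimal sketch. The `encode_sortable` helper is hypothetical: it zero-pads non-negative integers so that string order matches numeric order, which only illustrates the idea and is not the encoding SortableIntField actually uses.

```python
# Values 1..10 as they would look if stored verbatim as text terms.
values = list(range(1, 11))
as_text = [str(v) for v in values]

# Plain string ordering: "10" sorts before "2".
lexicographic = sorted(as_text)

# Numeric ordering: parse each term as an integer before comparing.
numeric = sorted(as_text, key=int)


def encode_sortable(n: int, width: int = 10) -> str:
    """Hypothetical order-preserving encoding: zero-pad to a fixed width.

    Works for non-negative integers only; not Solr's real encoding.
    """
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    return str(n).zfill(width)


# After encoding, string order and numeric order agree.
encoded = sorted(encode_sortable(v) for v in values)
decoded = [int(s) for s in encoded]

print(lexicographic)
print(numeric)
print(decoded)
```

The encoding matters when terms must compare correctly as strings, as in a range query; a sort that parses the values as integers does not need it.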