SOLR is slow at first facet query but quite fast for later queries

Question

I'm trying to figure out why my SOLR (4.1) instance is extremely slow for facet queries. The index has approximately 200M documents and the server has 64GB of RAM.

My query looks like this:

q=CampaignId:1462%0ASourceDateUtc:[2014-01-01T00:00:00.000Z TO 2014-01-30T00:00:00.000Z]
&wt=xml&indent=true&rows=0
&facet=true&facet.field=UserName&facet.mincount=10&facet.method=fc

The first hit takes about 6 minutes, but once the result comes back, searching again with the same query, or with a slightly different range in SourceDateUtc, runs quite fast.

Here is my solrconfig.xml (query section):

<query>
  <!-- Cache used by SolrIndexSearcher for filters (DocSets),
         unordered sets of *all* documents that match a query.
         When a new searcher is opened, its caches may be prepopulated
         or "autowarmed" using data from caches in the old searcher.
         autowarmCount is the number of items to prepopulate.  For LRUCache,
         the autowarmed items will be the most recently accessed items.
       Parameters:
         class - the SolrCache implementation (currently only LRUCache)
         size - the maximum number of entries in the cache
         initialSize - the initial capacity (number of entries) of
           the cache.  (see java.util.HashMap)
         autowarmCount - the number of entries to prepopulate from
           an old cache.

    <filterCache
      class="solr.LRUCache"
      size="1024"
      initialSize="512"
      autowarmCount="0"/>-->

   <!-- queryResultCache caches results of searches - ordered lists of
         document ids (DocList) based on a query, a sort, and the range
         of documents requested.  -->
    <queryResultCache
      class="solr.LRUCache"
      size="10000"
      initialSize="512"
      autowarmCount="0"/>

  <!-- documentCache caches Lucene Document objects (the stored fields for each document).
       Since Lucene internal document ids are transient, this cache will not be autowarmed.  -->
    <documentCache
      class="solr.LRUCache"
      size="1024"
      initialSize="512"
      autowarmCount="0"/>

    <!-- Example of a generic cache.  These caches may be accessed by name
         through SolrIndexSearcher.getCache().cacheLookup(), and cacheInsert().
         The purpose is to enable easy caching of user/application level data.
         The regenerator argument should be specified as an implementation
         of solr.search.CacheRegenerator if autowarming is desired.  -->
    <!--
    <cache name="myUserCache"
      class="solr.LRUCache"
      size="4096"
      initialSize="1024"
      autowarmCount="1024"
      regenerator="org.mycompany.mypackage.MyRegenerator"
      />
    -->

    <!-- An optimization that attempts to use a filter to satisfy a search.
         If the requested sort does not include a score, then the filterCache
         will be checked for a filter matching the query.  If found, the filter
         will be used as the source of document ids, and then the sort will be
         applied to that.
      -->
    <useFilterForSortedQuery>true</useFilterForSortedQuery>

    <!-- An optimization for use with the queryResultCache.  When a search
         is requested, a superset of the requested number of document ids
         are collected.  For example, of a search for a particular query
         requests matching documents 10 through 19, and queryWindowSize is 50,
         then documents 0 through 50 will be collected and cached. Any further
         requests in that range can be satisfied via the cache.
    -->
    <queryResultWindowSize>100</queryResultWindowSize>

    <!-- This entry enables an int hash representation for filters (DocSets)
         when the number of items in the set is less than maxSize. For smaller
         sets, this representation is more memory efficient, more efficient to
         iterate over, and faster to take intersections.
     -->
    <HashDocSet maxSize="3000" loadFactor="0.75"/>


    <!-- boolToFilterOptimizer converts boolean clauses with zero boost
         cached filters if the number of docs selected by the clause exceeds the
         threshold (represented as a fraction of the total index)
    -->
    <boolTofilterOptimizer enabled="true" cacheSize="32" threshold=".05"/>

    <!-- Lazy field loading will attempt to read only parts of documents on disk that are
         requested.  Enabling should be faster if you aren't retrieving all stored fields.
    -->
    <enableLazyFieldLoading>false</enableLazyFieldLoading>

    <!-- Use Cold Searcher

         If a search request comes in and there is no current
         registered searcher, then immediately register the still
         warming searcher and use it.  If "false" then all requests
         will block until the first searcher is done warming.
    -->
    <useColdSearcher>true</useColdSearcher>

</query>

I also tried enabling the filterCache, but it didn't help.

Thanks.

Recommended answer

This is likely a warm-up issue. Warming the field cache (used by facet.method=fc) is very important for Solr to work effectively: the first facet request on UserName has to build that cache across the whole index, and later requests against the same searcher reuse it. If you haven't configured warm-up queries, consider adding a facet query like the one in your example to the newSearcher and firstSearcher sections in solrconfig.xml.

http://wiki.apache.org/solr/SolrConfigXml#A.22Query.22_Related_Event_Listeners

<listener event="firstSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst> <str name="q">*:*</str>
              <str name="start">0</str>
              <str name="rows">10</str>
              <str name="facet">true</str>
              <str name="facet.field">UserName</str>
              <str name="facet.mincount">10</str>
              <str name="facet.method">fc</str>
        </lst>
      </arr>
</listener>
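
The firstSearcher listener above only fires when Solr starts. A matching newSearcher listener, which fires when a commit opens a new searcher, might look like the sketch below. It reuses the UserName facet from the question; setting rows to 0 is an assumption made here, since the warm-up only needs the facet counts and not any documents.

```xml
<!-- Sketch: same warm-up facet query, registered for newSearcher so the
     field cache is populated again after each new searcher is opened.
     rows=0 is an assumption; adjust to match your real queries. -->
<listener event="newSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst> <str name="q">*:*</str>
              <str name="rows">0</str>
              <str name="facet">true</str>
              <str name="facet.field">UserName</str>
              <str name="facet.mincount">10</str>
              <str name="facet.method">fc</str>
        </lst>
      </arr>
</listener>
```

Both listeners belong inside the query section of solrconfig.xml. With frequent commits on a large index, each new searcher pays the warm-up cost again, so the commit interval is worth checking as well.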

You may also want to turn off useColdSearcher, so that requests wait for the warming searcher to finish instead of hitting a cold one:

<useColdSearcher>false</useColdSearcher>

Further reading:

What makes good autowarming queries in Solr, and how do they work?

http://wiki.apache.org/solr/SolrFacetingOverview
