Solr configuration on Heroku


Question


I am using WebSolr Cobalt on Heroku. Search works if I enter either the first letter or the full word, but not if I enter only part of a word.

Any help?

Solution

To enable partial word searching

You must edit your local schema.xml file, usually under solr/config, to add either:

  1. NGramFilterFactory
  2. EdgeNGramFilterFactory

Here's what mine looks like - sample schema.xml

EdgeNGram

I went with the EdgeNGram option. It doesn't allow for searching in the middle of words, but it does allow partial word search starting from the beginning of the word. This cuts way down on false positives / matches you don't want, performs better, and is usually not missed by users. Also, I like minGramSize=2, so you must enter a minimum of 2 characters. Some folks set this to 3.
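
As a rough sketch (not the author's actual file), an EdgeNGram setup in schema.xml might look something like the following. The field type name "text", the choice of tokenizer, and maxGramSize="15" are assumptions for illustration; only minGramSize="2" comes from the answer itself. In this sketch the n-gram filter is applied only at index time, so the prefix a user types is matched as-is against the indexed prefixes.

```xml
<!-- Illustrative sketch; the name "text" and maxGramSize="15" are assumptions -->
<fieldType name="text" class="solr.TextField" omitNorms="false">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Index prefixes of each token: "solr" becomes "so", "sol", "solr" -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <!-- No n-gram filter at query time: the typed prefix is matched as-is -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Swapping in solr.NGramFilterFactory with the same attributes would also match in the middle of words, at the cost of a larger index.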

Once your local setup is working, you must edit the schema.xml used by websolr. Otherwise you will get the default behavior, which requires the full word to be entered even if you have full text searching configured for your models.

To edit the websolr schema.xml

  1. Go to the Heroku online dashboard for your app
  2. Go to the resources tab, then click on the Websolr add-on
  3. Click the default link under Indexes
  4. Click on the Advanced Configuration link
  5. Paste in your local schema.xml, including the config for your NGram filter of choice (mentioned above). Save.
  6. Copy the link in the "Configure your Heroku application" box, then paste it into your terminal to set WEBSOLR_URL in your heroku config.
  7. Click the Index Status link to get nifty stats and see if you are running fast or slow.
  8. Reindex everything

heroku run rake sunspot:reindex[5000]

  • Don't use heroku run rake sunspot:solr:reindex - it is deprecated, accepts no parameters and is WAY slower
  • Default batch size is 50. Most people suggest using 1000, but I've seen significantly faster results (about 1000 rows per second as opposed to around 500) by bumping it up to 5000+

Take it to the next level

5 ways to speed up indexing
