Can solr serve individual documents like a CMS would?


Question


    I am looking to build a real estate search engine. The specs are:
    Approx. 500,000 listings
    Daily updates of potentially 50,000 listings
    Data supplied in clean(ish) CSVs; I need to remove characters, encode to UTF-8, the usual
    50+ fields of data (30 images, various property specs, etc.)

    I'm having a lot of problems with Drupal 7, and Joomla cannot handle it. That's just the data import.

    I want to have Solr index the data and serve as the search engine. I have a few questions.

    1. Can Solr serve the listings directly from its index? (If so, do I still need a data store such as MySQL, or even a CMS?)
    2. Would I be better off putting the data in a simple single-table MySQL DB and using that to push documents to Solr for indexing, then loading listings either from the DB or from the Solr index?

    Given the data difficulties, it seems I could avoid a lot of complications by not trying to figure out the inner workings of D7/Joomla/any other CMS, and just put up a few simple PHP files as the front end.

    I don't need anything fancy looking; I was going to use the basic Drupal template for this project.

    I need speed and reliability and excellent search results.

    Solution

    IMHO it should be possible to use Solr exclusively for your purpose. 50000 listings is not very much for Solr, even on a single server, but 500000 updates in about 10 hours is, I would say, indeed a lot. That works out to about 50000 updates per hour, which is equivalent to a full reindex every hour.
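To make "Solr exclusively" a bit more concrete: a listing page could be served by asking the index for a single document by its unique key. The following is only a sketch under assumptions that are not in the question: a Solr instance at `localhost:8983`, a core named `listings`, a unique key field called `id`, and every field needed for display being stored in the index.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed location and core name; adjust to your own installation.
SOLR_BASE = "http://localhost:8983/solr/listings"


def listing_url(listing_id, base=SOLR_BASE):
    """Build a /select URL that asks for exactly one document by id."""
    # Escape backslashes and quotes so the id is safe inside a phrase query.
    escaped = listing_id.replace("\\", "\\\\").replace('"', '\\"')
    params = {"q": f'id:"{escaped}"', "rows": 1, "wt": "json"}
    return f"{base}/select?{urlencode(params)}"


def fetch_listing(listing_id, base=SOLR_BASE):
    """Return the stored fields of one listing as a dict, or None if absent."""
    with urlopen(listing_url(listing_id, base)) as resp:
        docs = json.load(resp)["response"]["docs"]
    return docs[0] if docs else None
```

Only stored fields come back in the response, so anything the listing page has to show must be stored, not just indexed.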

    We use Solr for our enterprise, too, with about 40-120 fields. 40000 items take about 5 minutes to index completely. If you want to autowarm caches, you have to add perhaps a few minutes to that.
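As a rough back-of-the-envelope check, the figure of 40000 items in 5 minutes can be extrapolated to other batch sizes. This assumes indexing time scales linearly with document count, which is only an assumption: real throughput depends on hardware, field count, and commit settings.

```python
def docs_per_minute(docs, minutes):
    """Indexing rate implied by one observed run."""
    return docs / minutes


def minutes_to_index(docs, rate):
    """Estimated time for a batch, assuming the rate stays constant."""
    return docs / rate


rate = docs_per_minute(40_000, 5)            # rate from the figure quoted above
update_batch = minutes_to_index(50_000, rate)    # a 50000-document update batch
full_reindex = minutes_to_index(500_000, rate)   # all 500000 listings

print(f"{rate:.0f} docs/min, batch {update_batch} min, full {full_reindex} min")
```

Under that assumption a 50000-document batch takes a little over 6 minutes and a full pass over 500000 listings takes about an hour, before any cache warming.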

    As far as I can see, your problem will be the short update intervals. If you want to update individual documents instead of all 50000 listings once per hour, your Solr cannot use caching, or you will have to use multiple Solr servers. (Perhaps with Solr 4.0 you could even consider scaling up your Solr server hardware, but I doubt 3.x would benefit from that.) Not using caches could lead to slower search performance, but it does not have to.

    Since Solr offers the dynamic fields functionality, you can add a different structure per document. This should match your "various property specs" requirement.
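One way to picture the dynamic fields idea is to name each property with a type suffix, so listings with different property sets can share one schema. The sketch below assumes the schema declares dynamic field patterns such as `*_i`, `*_f`, `*_b` and `*_s` (Solr's example schemas define patterns like these, but check your own). The helper `to_solr_doc` and the sample property names are hypothetical.

```python
def to_solr_doc(listing_id, specs):
    """Map arbitrary property specs onto suffix-typed dynamic field names."""
    doc = {"id": listing_id}
    for name, value in specs.items():
        if isinstance(value, bool):      # check bool first: bool is a subclass of int
            suffix = "_b"
        elif isinstance(value, int):
            suffix = "_i"
        elif isinstance(value, float):
            suffix = "_f"
        else:                            # anything else is indexed as a string
            suffix = "_s"
            value = str(value)
        doc[name + suffix] = value
    return doc


# Two listings with different property sets fit the same schema.
house = to_solr_doc("A-100", {"bedrooms": 3, "pool": True, "lot_acres": 0.25})
flat = to_solr_doc("B-200", {"floor": 7, "style": "loft"})
```

The resulting dicts can be sent to Solr's update handler as JSON documents; no schema change is needed when a new kind of property appears, as long as its suffix matches a declared pattern.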
