How do I index documents in SOLR?


Problem description



I'm running Solr 1.4 on Ubuntu 10.04 (installed via apt-get solr-tomcat) and it seems to be working fine. I'm having some difficulty finding any coherent info on how to index documents, though. I'm new to SOLR, so bear with me!

I have a folder (/mnt/folder) that is a mounted Windows share, which contains Word and PDF files that I would like indexed. What's the easiest way to get SOLR to index the entire folder?

The documentation for SOLR is pretty poor, and it's impossible to find any decent tutorials on getting things done with it, so any help is greatly appreciated!

S

Solution

Take a look at the Solr wiki; it's pretty thorough documentation.

In particular see the ExtractingRequestHandler, which allows you to index binary files like Word and PDF documents. Here's an introduction to the topic.
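As a rough illustration of what that looks like for a whole folder, here is a minimal Python sketch that walks a directory and posts each Word or PDF file to the extract handler. Several things here are assumptions rather than facts from the question: that the handler is enabled in solrconfig.xml and mapped to `/update/extract`, that Solr is reachable at `http://localhost:8080/solr` (a common default for Tomcat installs), and that the schema's unique key field is called `id`. The function names are made up for this example.

```python
import mimetypes
import os
import urllib.parse
import urllib.request

# Assumed endpoint: adjust host, port and path to match your own install.
SOLR_EXTRACT_URL = "http://localhost:8080/solr/update/extract"
EXTENSIONS = (".doc", ".docx", ".pdf")


def find_documents(root, extensions=EXTENSIONS):
    """Return sorted paths under root whose extension matches (case-insensitive)."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(extensions):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def build_extract_url(doc_id, base_url=SOLR_EXTRACT_URL, commit=False):
    """Build the request URL; literal.id fills the (assumed) unique key field 'id'."""
    params = {"literal.id": doc_id}
    if commit:
        params["commit"] = "true"
    return base_url + "?" + urllib.parse.urlencode(params)


def index_file(path, base_url=SOLR_EXTRACT_URL, commit=False):
    """POST one file's raw bytes to the extract handler and return the HTTP status."""
    content_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
    with open(path, "rb") as handle:
        body = handle.read()
    request = urllib.request.Request(
        build_extract_url(path, base_url, commit),
        data=body,
        headers={"Content-Type": content_type},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def index_folder(root, base_url=SOLR_EXTRACT_URL):
    """Index every matching file, committing only on the last request."""
    paths = find_documents(root)
    for position, path in enumerate(paths):
        index_file(path, base_url, commit=(position == len(paths) - 1))
    return len(paths)
```

Calling `index_folder("/mnt/folder")` would then send every matching file in the share, using the file path as the document id. Committing only once at the end is a design choice to avoid a commit per file; whether that suits you depends on how many files there are and how soon they need to be searchable.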

If the wiki isn't enough for you, there's also a great book about Solr.
