Can Solr retain the formatting of the HTML documents that were fed to it in its results?

Views: 66 Posted: 2021/4/8 20:33:32 Tags: solr solrj apache-tika solr-cell

Problem description

How do I maintain the original formatting of an HTML document in the results returned by Solr?

I am trying to provide search functionality on one of my company's websites, which holds millions of documents. The documents do not share a common format, so it is hard to format each one individually.

I am using a Solr 4.1 nightly build from the Apache site, which has built-in support for solr-cell and Tika, i.e. I do not need to configure them separately.

Do solr-cell or Tika retain this formatting anywhere?

If they do not retain the formatting, I will need to fetch each document from its physical file location using Solr's resourcename field and then apply highlighting and Solr's other ready-made functionality myself, but this process is too tedious.
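To make the fallback described above concrete, here is a minimal Java sketch of what "fetch the file and apply the highlights yourself" might look like. It is only an illustration under assumptions: `RawHtmlHighlighter`, `highlight`, and `highlightFile` are hypothetical names that are not part of Solr or SolrJ, the resourcename value is assumed to be a readable local path to a UTF-8 file, and the tag detection is a naive regular expression that does not handle `<script>` bodies, comments, or attribute values containing `>`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Solr or SolrJ.
class RawHtmlHighlighter {

    // Wraps case-insensitive matches of `term` in <em>...</em>, but only in
    // the text between tags, so the original markup is left untouched.
    // `term` is assumed to be non-empty.
    static String highlight(String html, String term) {
        // Alternation: either a whole tag (group 1) or the literal term (group 2).
        // Tags are consumed first, so a term inside a tag is never matched.
        Pattern p = Pattern.compile(
                "(<[^>]*>)|(" + Pattern.quote(term) + ")",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = (m.group(1) != null)
                    ? m.group(1)                        // copy the tag unchanged
                    : "<em>" + m.group(2) + "</em>";    // highlight the text match
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Reads the file that a document's resourcename value points to
    // (assumed here to be a local UTF-8 file) and highlights the term in it.
    static String highlightFile(String resourceName, String term) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(resourceName));
        return highlight(new String(bytes, StandardCharsets.UTF_8), term);
    }

    public static void main(String[] args) {
        String html = "<p class=\"solr\">Solr indexes <b>HTML</b> documents.</p>";
        System.out.println(highlight(html, "solr"));
    }
}
```

Running `main` prints `<p class="solr"><em>Solr</em> indexes <b>HTML</b> documents.</p>`: the visible word is wrapped while the `class="solr"` attribute is left alone. Doing this per result, for millions of files, is the tedious part the question refers to.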

What can I use as a request handler if I have to use "HTMLStripCharFilterFactory", as suggested by Jayendra in the answer? Also, can I extract metadata tags in that case?

Can anyone guide me?

Thank you for all your support!
