How to improve xdmp:document-filter() performance in MarkLogic?


Question

I am using xdmp:document-filter(doc()) to extract metadata from documents (doc, docx, pdf, etc.). We use it because it works for all document formats and generates XHTML for every one of them. The major drawback is that it slows the query down. With one or two documents in the database the query runs fine, but with more documents (e.g. 10 or 15) it becomes slow. We want to extract and show information from the metadata of all the documents in the database.

We are using this query:

for $d in fn:doc()
return xdmp:document-filter(doc(fn:base-uri($d)))

Is there any way to make this query run faster, or is there an alternative to xdmp:document-filter()?

Recommended answer

xdmp:document-filter() is typically used at ETL time, when content is loaded, rather than at query time. If you use Information Studio to load your content, you can add a 'Filter documents' transform. You can choose between storing the extracted metadata as separate XHTML documents or as document properties. That way the metadata doesn't need to be calculated on the fly at each request.
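If you are not loading through Information Studio, the same idea can be approximated by hand. The following is only a minimal sketch, not what the 'Filter documents' transform does internally: it filters each binary document once and saves the extracted XHTML meta elements as a document property. The `ex` namespace and the `ex:filtered-meta` wrapper element are hypothetical names chosen for illustration, and the `[binary()]` predicate assumes the source files were loaded as binary documents.

```xquery
xquery version "1.0-ml";

(: Namespace of the XHTML that xdmp:document-filter() produces. :)
declare namespace xh = "http://www.w3.org/1999/xhtml";
(: Hypothetical namespace for the stored property; not a MarkLogic built-in. :)
declare namespace ex = "http://example.com/filtered";

(: Run once at load time (or as a batch job), not on every request. :)
for $d in fn:doc()[binary()]
let $uri := xdmp:node-uri($d)
(: The expensive step: convert the binary document to XHTML and keep its meta elements. :)
let $meta := xdmp:document-filter($d)//xh:meta
return
  (: Store the metadata under one wrapper property, replacing any earlier copy. :)
  xdmp:document-set-property($uri, element ex:filtered-meta { $meta })
```

At request time the query would then read the stored property instead of calling the filter again:

```xquery
xquery version "1.0-ml";

declare namespace ex = "http://example.com/filtered";

(: Cheap lookup of the metadata saved by the load-time pass above. :)
for $d in fn:doc()[binary()]
return xdmp:document-properties(xdmp:node-uri($d))//ex:filtered-meta
```

For a large database the first pass would probably need to be split into batches rather than run as a single transaction.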

HTH!
