How to upload docx, xlsx & txt files to MarkLogic Server?


Problem description

I have a folder that contains doc, docx, xlsx, pdf and txt files. I am uploading all of these files into MarkLogic with this XQuery:

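xquery version "1.0-ml";
(: Declare the dir prefix used by xdmp:filesystem-directory results. :)
declare namespace dir = "http://marklogic.com/xdmp/directory";

(: Load each regular file in the folder as a binary document under /documents/. :)
for $d in xdmp:filesystem-directory("C:\uploads")/dir:entry[dir:type eq "file"]
return
  xdmp:document-load(string($d/dir:pathname),
    <options xmlns="xdmp:document-load">
      <uri>{concat("/documents/", string($d/dir:filename))}</uri>
      <permissions>{xdmp:default-permissions()}</permissions>
      <collections>{xdmp:default-collections()}</collections>
      <format>binary</format>
    </options>)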

I have also installed content processing for my database. When I upload doc and pdf files, they are converted to XML and XHTML files, but docx, xlsx and txt files are not converted. Can somebody tell me why these files are not getting converted?

Recommended answer

Enable the Office OpenXML Extract pipeline to convert the .docx, .xlsx, and .pptx files.

Files with these extensions are already XML, packaged as a ZIP archive. If you change the extension to .zip and extract the archive, you will see that the file is composed of interrelated XML parts.
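One way to see those parts from inside MarkLogic is to read the package manifest. The following is a minimal sketch, assuming a .docx file has already been loaded as a binary document; the URI /documents/report.docx is hypothetical and not taken from the question.

```xquery
xquery version "1.0-ml";
declare namespace zip = "xdmp:zip";

(: Hypothetical URI: assumes a .docx was loaded as binary under /documents/. :)
let $pkg := fn:doc("/documents/report.docx")/binary()

(: The manifest lists each part stored in the ZIP package by name. :)
for $part in xdmp:zip-manifest($pkg)/zip:part
return fn:string($part)
```

An individual part can then be fetched with xdmp:zip-get, for example xdmp:zip-get($pkg, "word/document.xml") for the main body of a Word document.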

The Office OpenXML Extract pipeline unzips Office 2007/2010 files and stores their parts in a directory that is a sibling of the main file, as the other conversion pipelines do. This lets you store the raw Open XML. There is currently no further conversion to XHTML or DocBook.

There is no conversion for .txt that I'm aware of. Those are plain text files and will be inserted as text in MarkLogic. You could convert them to XML by wrapping the text in a parent element and changing the file extension to .xml.
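The wrapping approach could look roughly like the sketch below. It is an illustration under assumptions, not the answerer's own code: the file name notes.txt, the target URI, and the wrapper element name text are all hypothetical.

```xquery
xquery version "1.0-ml";

(: Hypothetical path and URI, for illustration only. :)
let $path := "C:\uploads\notes.txt"

(: Read the file from disk as a text node rather than as binary. :)
let $text := xdmp:document-get($path,
    <options xmlns="xdmp:document-get">
      <format>text</format>
    </options>)

(: Wrap the text in a parent element and store it under an .xml URI. :)
return
  xdmp:document-insert("/documents/notes.xml",
    <text>{fn:string($text)}</text>,
    xdmp:default-permissions(),
    xdmp:default-collections())
```

Because the content is placed inside an element constructor, characters such as < and & in the text are escaped when the document is serialized.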

Hope this helps.
