Loading data with mlcp - namespace issue

Problem description

I'm trying to load RSS data from WordPress into a MarkLogic database. The data has the following form:

<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0"
xmlns:excerpt="http://wordpress.org/export/1.2/excerpt/"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:wp="http://wordpress.org/export/1.2/">

<item>
  <wp:post_id>1</wp:post_id>
  <wp:post_title>title 1</wp:post_title>
  <dc:creator>bob</dc:creator>
</item>
<item>
  <wp:post_id>2</title>
  <wp:post_title>title 1</wp:post_title>
  <dc:creator>john</dc:creator>
</item>
</rss>

However, when I run the mlcp command, I get the following warnings and the data is not inserted into the database:

WARN mapreduce.ContentWriter: XDMP-DOCNONSBIND: No namespace binding for prefix wp
WARN mapreduce.ContentWriter: XDMP-DOCNONSBIND: No namespace binding for prefix dc

The mlcp command I used is:

./mlcp.sh import -host localhost -port 8088 -username admin -password admin -input_file_path  data.xml -mode local -input_file_type aggregates -aggregate_record_element item -aggregate_uri_id post_id -output_uri_prefix /resources/ -output_uri_suffix .xml

Any idea how I can fix this?

Thanks!

Recommended answer

Your test case has one malformed line: <wp:post_id>2</title>. When I fix that and use mlcp-Hadoop2-1.2-3 with 7.0-4, I see one warning per item element:

15/01/12 14:16:14 WARN mapreduce.ContentWriter: XDMP-DOCNONSBIND: No namespace binding for prefix wp at /resources/1.xml line 2
15/01/12 14:16:14 WARN mapreduce.ContentWriter: XDMP-DOCNONSBIND: No namespace binding for prefix wp at /resources/2.xml line 2

This looks like an mlcp bug to me. Your namespace declarations are above the level of the item element, and they aren't being sent up to the server.
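
The effect can be reproduced outside MarkLogic. The sketch below uses Python's standard-library XML parser as a stand-in, not mlcp or the server itself: an item element separated from the rss root, and so from its declarations, fails to parse, while the same element parses once the wp binding is declared on it. The helper name `parses` and the two constants are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# An <item> as it looks once separated from the <rss> root: the wp prefix
# is used, but nothing inside the fragment declares it.
BARE_ITEM = "<item><wp:post_id>1</wp:post_id></item>"

# The same record with the binding declared on the item element itself.
BOUND_ITEM = (
    '<item xmlns:wp="http://wordpress.org/export/1.2/">'
    "<wp:post_id>1</wp:post_id></item>"
)


def parses(fragment: str) -> bool:
    """Return True if the fragment parses as a standalone XML document."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        # The parser reports an unbound prefix for BARE_ITEM.
        return False


print(parses(BARE_ITEM))   # False
print(parses(BOUND_ITEM))  # True
```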

As a workaround, you could edit the XML. Or you could try http://marklogic.github.io/recordloader/ with something like this:

$ recordloader.sh -DCONNECTION_STRING=xcc://admin:admin@localhost:8088 \
    -DRECORD_NAME=item -DID_NAME="#AUTO" data.xml

See http://marklogic.github.io/recordloader/ for other options.
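
For the "edit the XML" workaround, one possible reading is to repeat the root element's namespace declarations on every item element, so that each record carries its own bindings. The following is a minimal sketch of that idea, assuming the declarations all sit on the first start tag and that the item elements do not already declare the same prefixes (a repeated attribute would make the file malformed). The function name `copy_root_namespaces` is hypothetical, and whether mlcp then loads the rewritten file without warnings has not been verified here.

```python
import re


def copy_root_namespaces(xml_text: str, record_name: str = "item") -> str:
    """Repeat the root element's xmlns:prefix declarations on each record."""
    # First start tag that is not an XML declaration, comment, or DOCTYPE.
    root_tag = re.search(r"<(?![?!])[^>]*>", xml_text)
    if root_tag is None:
        raise ValueError("no root element found")
    # Collect the prefixed declarations, e.g. xmlns:wp="http://...".
    declarations = re.findall(
        r"""xmlns:[\w.-]+\s*=\s*(?:"[^"]*"|'[^']*')""", root_tag.group(0)
    )
    if not declarations:
        return xml_text
    extra = " " + " ".join(declarations)
    # Match "<item" only when followed by whitespace, "/" or ">", so that
    # longer names such as <items> are left alone.
    start_of_record = re.compile(r"<" + re.escape(record_name) + r"(?=[\s/>])")
    return start_of_record.sub(lambda m: m.group(0) + extra, xml_text)


# Cut-down version of the feed from the question, with the malformed line fixed.
SAMPLE = """<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:wp="http://wordpress.org/export/1.2/">
<item>
  <wp:post_id>1</wp:post_id>
  <dc:creator>bob</dc:creator>
</item>
</rss>"""

print(copy_root_namespaces(SAMPLE))
```

To apply it to a real export, read data.xml into a string, pass it through the function, write the result to a new file, and point -input_file_path at that file instead. This is a text-level rewrite rather than a full XML transformation, so it is worth checking the output with an XML parser before loading.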
