Not able to write XML with ISO encoding to MarkLogic via the Java API


Problem description

We are trying to insert an XML document with ISO encoding into MarkLogic through the Java API, but we get the error below. The XML contains special characters, for example the registered trademark sign: <h4> ® </h4>

Bad Request. Server Message: XDMP-DOCUTF8SEQ: Invalid UTF-8 escape sequence at  line 14145 -- document is not UTF-8 encoded. 

Code:

DatabaseClient client = DatabaseClientFactory.newClient(IP, PORT,
        DATABASE_NAME, USERNAME, PWD, Authentication.DIGEST);

// acquire the content
InputStream xmlDocStream = XMLController.class.getClassLoader()
        .getResourceAsStream("path to xml file");

// create a manager for XML documents
XMLDocumentManager xmlDocMgr = client.newXMLDocumentManager();

// create a handle on the content
InputStreamHandle xmlhandle = new InputStreamHandle(xmlDocStream);

// write the document content
xmlDocMgr.write("/" + filename, xmlhandle);


Recommended answer

Sravan:

The solution is to specify the current ISO encoding when you read the resource by wrapping the input stream in an InputStreamReader:

http://docs.oracle.com/javase/8/docs/api/java/io/InputStreamReader.html#InputStreamReader-java.io.InputStream-java.lang.String-

The Java API converts to UTF-8 when it knows that the content has a different encoding but otherwise assumes that the content is already UTF-8. For more detail about conversion of encoding, see:

http://docs.marklogic.com/guide/java/document-operations#id_11208
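To illustrate the idea, here is a minimal standard-library sketch of what wrapping the stream in an InputStreamReader achieves. It assumes the "ISO encoding" is ISO-8859-1, which the question does not state explicitly; the class and method names (IsoToUtf8Sketch, openReader, toUtf8) are hypothetical. In ISO-8859-1 the ® sign is the single byte 0xAE, which is not a valid UTF-8 sequence on its own; decoding with the real charset and re-encoding yields the two bytes 0xC2 0xAE.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class IsoToUtf8Sketch {

    // Decode the raw bytes with their actual charset so they become characters.
    // ISO-8859-1 is an assumption here; pass whatever the XML declaration says.
    static Reader openReader(InputStream in, Charset sourceCharset) {
        return new InputStreamReader(in, sourceCharset);
    }

    // Re-encode the decoded characters as UTF-8 bytes.
    static byte[] toUtf8(Reader reader) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                writer.write(buf, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // "<h4>®</h4>" as it would be stored in an ISO-8859-1 file
        byte[] isoBytes = "<h4>\u00AE</h4>".getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8Bytes = toUtf8(openReader(
                new ByteArrayInputStream(isoBytes), StandardCharsets.ISO_8859_1));
        System.out.println(isoBytes.length + " ISO-8859-1 bytes -> "
                + utf8Bytes.length + " UTF-8 bytes");
    }
}
```

In the code from the question, the same Reader would presumably be passed to a handle that accepts a Reader (for example ReaderHandle from com.marklogic.client.io, if your client version provides it) in place of InputStreamHandle, so that the Java API knows the source encoding and can convert it.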

Hope that helps,
