SQL Server - defining an XML type column with UTF-8 encoding

Question

The default encoding for an XML type field in SQL Server is UTF-16. I have no trouble inserting UTF-16 encoded XML streams into that field.

But if I try to insert a UTF-8 encoded XML stream into the field, the insert fails with the error response
unable to switch encoding.

QUESTION: Is there a way to define a SQL Server column/field as having UTF-8 encoding?

The insertion operations are performed using Spring JDBCTemplate.

The XML stream is produced by a JAXB Marshaller set to either UTF-8 or UTF-16 encoding.

private String marshall(myDAO myTao, JAXBEncoding jaxbEncoding)
throws JAXBException{
    JAXBContext jc = JAXBContext.newInstance(ObjectFactory.class);
    Marshaller m = jc.createMarshaller();
    m.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
    if (jaxbEncoding!=null)
        m.setProperty(Marshaller.JAXB_ENCODING, jaxbEncoding.toString());
    StringWriter strw = new StringWriter();
    m.marshal(myTao, strw);
    return strw.toString();
}

where ...

public enum JAXBEncoding {
    UTF8("UTF-8"),
    UTF16("UTF-16")
    ;

    private String value;
    private JAXBEncoding(String value){
        this.value = value;
    }

    public String toString(){
        return this.value;
    }
}


Recommended answer


Is there a way to define a SQL Server column/field as having UTF-8 encoding?

No, the only Unicode encoding in SQL Server is UTF-16 Little Endian, which is how the NCHAR, NVARCHAR, NTEXT (deprecated as of SQL Server 2005, so don't use it in new development; besides, it sucks compared to NVARCHAR(MAX) anyway), and XML datatypes are handled. You do not get a choice of Unicode encodings like some other RDBMSs allow. (SQL Server 2019 later added UTF-8 collations for CHAR and VARCHAR, but the XML datatype is still handled as UTF-16.)
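To make the difference between the two encodings concrete, here is a small Java sketch (the class and method names are made up for illustration) that prints the bytes of the same short XML fragment in UTF-16 Little Endian, the form NVARCHAR and XML use, and in UTF-8:

```java
import java.nio.charset.StandardCharsets;

public class EncodingBytes {
    // Formats bytes as space-separated uppercase hex, e.g. "3C 00".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "<a/>";
        // UTF-16LE: two bytes per BMP character, low byte first.
        System.out.println(hex(text.getBytes(StandardCharsets.UTF_16LE)));
        // UTF-8: one byte per ASCII character.
        System.out.println(hex(text.getBytes(StandardCharsets.UTF_8)));
    }
}
```

The first line printed is `3C 00 61 00 2F 00 3E 00` and the second is `3C 61 2F 3E`. A parser told to expect one layout cannot read the other, which is the mismatch behind the "unable to switch encoding" error.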

You can insert UTF-8 encoded XML into SQL Server, provided you follow these three rules:


  1. The incoming string has to be of datatype VARCHAR, not NVARCHAR (as NVARCHAR is always UTF-16 Little Endian, hence the error about not being able to switch the encoding).
  2. The XML has an XML declaration that explicitly states that the encoding of the XML is indeed UTF-8: <?xml version="1.0" encoding="UTF-8" ?>.
  3. The byte sequence needs to be the actual UTF-8 bytes.
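On the Java side, rules 2 and 3 could be sketched roughly as follows. This is only an illustration: `Utf8XmlPayload`, `toUtf8Payload`, and `declaresUtf8` are hypothetical names, not part of JAXB or Spring. Rule 1 is not covered here, because how a value reaches SQL Server as a non-Unicode (VARCHAR) parameter depends on the JDBC driver and its settings.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class Utf8XmlPayload {
    // Rule 2: a declaration that explicitly names UTF-8.
    static final String DECLARATION = "<?xml version=\"1.0\" encoding=\"UTF-8\" ?>";

    // Rule 3: encode declaration plus body as actual UTF-8 bytes.
    static byte[] toUtf8Payload(String xmlBody) {
        return (DECLARATION + xmlBody).getBytes(StandardCharsets.UTF_8);
    }

    // Sanity check: does the payload start with an XML declaration naming UTF-8?
    static boolean declaresUtf8(byte[] payload) {
        String text = new String(payload, StandardCharsets.UTF_8);
        int end = text.indexOf("?>");
        if (!text.startsWith("<?xml") || end < 0) return false;
        String declaration = text.substring(0, end).toLowerCase(Locale.ROOT);
        return declaration.contains("encoding=\"utf-8\"");
    }

    public static void main(String[] args) {
        // \uD83D\uDE31 is the surrogate pair for U+1F631 (screaming face).
        byte[] payload = toUtf8Payload("<root><test>\uD83D\uDE31</test></root>");
        System.out.println(declaresUtf8(payload));
    }
}
```

Marshalling into a `StringWriter`, as in the question, yields a Java `String`, which has no byte encoding of its own until it is converted to bytes; the declaration and the bytes actually sent have to agree.</root>");
if (!Utf8XmlPayload.declaresUtf8(p)) throw new AssertionError("should declare UTF-8");
if (Utf8XmlPayload.declaresUtf8("<?xml version=\"1.0\" encoding=\"UTF-16\" ?><a/>".getBytes(java.nio.charset.StandardCharsets.UTF_8))) throw new AssertionError("UTF-16 declaration must not pass");
if (Utf8XmlPayload.declaresUtf8("<a/>".getBytes(java.nio.charset.StandardCharsets.UTF_8))) throw new AssertionError("missing declaration must not pass");
if ((p[p.length - 18] & 0xFF) != 0xF0) throw new AssertionError("emoji lead byte");
</test>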

For example, we can import a UTF-8 encoded XML document containing the screaming face emoji (U+1F631, a Supplementary Character whose UTF-8 byte sequence is F0 9F 98 B1):

SET NOCOUNT ON;
DECLARE @XML XML = '<?xml version="1.0" encoding="utf-8"?><root><test>'
                    + CHAR(0xF0) + CHAR(0x9F) + CHAR(0x98) + CHAR(0xB1)
                    + '</test></root>';

SELECT @XML;
PRINT CONVERT(NVARCHAR(MAX), @XML);

Returns (in both "Results" and "Messages" tabs):

<root><test>😱</test></root>
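If you need the same kind of CHAR(...) expression for a different character, a small Java helper can derive it from the code point. This is a minimal sketch; `SqlCharExpression` and `utf8CharExpression` are hypothetical names chosen for this example.

```java
import java.nio.charset.StandardCharsets;

public class SqlCharExpression {
    // Builds "CHAR(0x..) + CHAR(0x..) ..." from the UTF-8 bytes of one code point.
    static String utf8CharExpression(int codePoint) {
        byte[] utf8 = new String(Character.toChars(codePoint))
                .getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        for (byte b : utf8) {
            if (sb.length() > 0) sb.append(" + ");
            sb.append(String.format("CHAR(0x%02X)", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+1F631 is the screaming face emoji used in the T-SQL example.
        System.out.println(utf8CharExpression(0x1F631));
    }
}
```

For U+1F631 this prints `CHAR(0xF0) + CHAR(0x9F) + CHAR(0x98) + CHAR(0xB1)`, matching the bytes concatenated in the T-SQL example.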
