SQL Server - defining an XML type column with UTF-8 encoding
Question
The default encoding for an XML type field defined in SQL Server is UTF-16. I have no trouble inserting UTF-16 encoded XML streams into that field.
But if I try to insert a UTF-8 encoded XML stream into the field, the insert attempt fails with the error "unable to switch encoding".
QUESTION: Is there a way to define a SQL Server column/field as having UTF-8 encoding?
The insert operations are performed using Spring JdbcTemplate.
The XML stream is produced by a JAXB Marshaller set to either UTF-8 or UTF-16 encoding.
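import java.io.StringWriter;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Marshaller;

// Marshals the DAO to an XML string, optionally setting the declared encoding.
private String marshall(myDAO myTao, JAXBEncoding jaxbEncoding)
        throws JAXBException {
    JAXBContext jc = JAXBContext.newInstance(ObjectFactory.class);
    Marshaller m = jc.createMarshaller();
    m.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
    if (jaxbEncoding != null) {
        m.setProperty(Marshaller.JAXB_ENCODING, jaxbEncoding.toString());
    }
    StringWriter strw = new StringWriter();
    m.marshal(myTao, strw);
    return strw.toString();
}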
where...
public enum JAXBEncoding {
    UTF8("UTF-8"),
    UTF16("UTF-16");

    private final String value;

    private JAXBEncoding(String value) {
        this.value = value;
    }

    @Override
    public String toString() {
        return this.value;
    }
}
Answer
Is there a way to define a SQL Server column/field as having UTF-8 encoding?
No, the only Unicode encoding in SQL Server is UTF-16 Little Endian, which is how the NCHAR, NVARCHAR, NTEXT (deprecated as of SQL Server 2005, so don't use it in new development; besides, it sucks compared to NVARCHAR(MAX) anyway), and XML datatypes are handled. You do not get a choice of Unicode encodings like some other RDBMSs allow. (SQL Server 2019 and later add UTF-8 collations for CHAR and VARCHAR, but the XML datatype is still handled as UTF-16.)
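To make the difference between the two encodings concrete, here is a minimal Java sketch (not part of the original answer) that prints the bytes of one Supplementary Character, U+1F631, in UTF-16 Little Endian and in UTF-8. The class name EncodingComparison and the helper hex are hypothetical names chosen for illustration.

```java
import java.nio.charset.StandardCharsets;

public class EncodingComparison {
    // Hypothetical helper: renders bytes as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+1F631 lies outside the Basic Multilingual Plane, so a Java String
        // holds it as a surrogate pair (two UTF-16 code units).
        String emoji = new String(Character.toChars(0x1F631));

        // The same character produces different byte sequences per encoding.
        System.out.println("UTF-16LE: " + hex(emoji.getBytes(StandardCharsets.UTF_16LE)));
        System.out.println("UTF-8:    " + hex(emoji.getBytes(StandardCharsets.UTF_8)));
    }
}
```

Because the byte sequences differ, data that is physically UTF-16 cannot be reinterpreted as UTF-8 just because its XML declaration says so.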
You can insert UTF-8 encoded XML into SQL Server, provided you follow these three rules:
- The incoming string has to be of datatype VARCHAR, not NVARCHAR (as NVARCHAR is always UTF-16 Little Endian, hence the error about not being able to switch the encoding).
- The XML has an XML declaration that explicitly states that the encoding of the XML is indeed UTF-8: <?xml version="1.0" encoding="UTF-8" ?>.
- The byte sequence needs to be the actual UTF-8 bytes.
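On the Java side, the second and third rules could be checked roughly as follows. This is a sketch under assumptions, not the answerer's method: the class Utf8XmlBytes and the method toUtf8Bytes are hypothetical, and the regular expression is only a loose check of the XML declaration. A Java String is UTF-16 internally regardless of the encoding named in its declaration, so the UTF-8 bytes have to be produced explicitly.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

public class Utf8XmlBytes {
    // Loose check that the document starts with an XML declaration naming UTF-8.
    private static final Pattern UTF8_DECL = Pattern.compile(
            "^<\\?xml[^>]*encoding\\s*=\\s*[\"']utf-8[\"'][^>]*\\?>",
            Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: returns the actual UTF-8 bytes of the XML text,
    // refusing input whose declaration does not say UTF-8.
    static byte[] toUtf8Bytes(String xml) {
        if (!UTF8_DECL.matcher(xml).find()) {
            throw new IllegalArgumentException("XML declaration must state encoding UTF-8");
        }
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\" ?><root><test>"
                + new String(Character.toChars(0x1F631))
                + "</test></root>";
        byte[] bytes = toUtf8Bytes(xml);
        // The emoji is 2 chars in the String but 4 bytes in UTF-8.
        System.out.println(xml.length() + " chars, " + bytes.length + " bytes");
    }
}
```

How those bytes are then bound to the statement so that SQL Server does not receive them as NVARCHAR (for example, as a binary parameter converted to XML on the server) is outside this sketch and would need checking against the JDBC driver in use.</root>";
byte[] bytes = Utf8XmlBytes.toUtf8Bytes(xml);
if (bytes.length != xml.length() + 2) throw new AssertionError("unexpected byte length");
boolean threw = false;
try {
    Utf8XmlBytes.toUtf8Bytes("<?xml version=\"1.0\" encoding=\"UTF-16\"?><root/>");
} catch (IllegalArgumentException e) {
    threw = true;
}
if (!threw) throw new AssertionError("UTF-16 declaration should be rejected");
</test>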
For example, we can import a UTF-8 encoded XML document containing the screaming face emoji (U+1F631, a Supplementary Character whose UTF-8 byte sequence is F0 9F 98 B1):
SET NOCOUNT ON;
DECLARE @XML XML = '<?xml version="1.0" encoding="utf-8"?><root><test>'
+ CHAR(0xF0) + CHAR(0x9F) + CHAR(0x98) + CHAR(0xB1)
+ '</test></root>';
SELECT @XML;
PRINT CONVERT(NVARCHAR(MAX), @XML);
Returns (in both "Results" and "Messages" tabs):
<root><test>😱</test></root>