Cannot insert non-Latin symbols in MySQL


Problem description



I'm writing a web app using MySQL 5.1.45, Tomcat 5.5.28 and Hibernate 3.

When I try to save a string that contains non-Latin characters (for example Упячка), an error occurs:

1589 [main] WARN org.hibernate.util.JDBCExceptionReporter - SQL Error: 1366, SQLState: HY000
1589 [main] ERROR org.hibernate.util.JDBCExceptionReporter - Incorrect string value: '\xD0\xA3\xD0\xBF\xD1\x8F...' for column 'name' at row 1

Hibernate connection settings

<property name="connection.driver_class">com.mysql.jdbc.Driver</property>
<property name="connection.url">jdbc:mysql://localhost/E2012?characterEncoding=UTF8&amp;useUnicode=true</property>
<property name="connection.username">***</property>
<property name="connection.password">***</property>
<property name="hibernate.connection.charSet">UTF8</property>

MySQL configuration (my.cnf)

[client]
 default-character-set=utf8

[mysqld]
 default-character-set=utf8

Even issuing a SET NAMES utf8 query doesn't resolve the problem.

Thanks for help!

Solution

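In Unicode, Упячка consists of the code points U+0423 U+043F U+044F U+0447 U+043A U+0430. The \xD0\xA3\xD0\xBF\xD1\x8F... in the error message is the UTF-8 byte sequence of those characters with every byte shown as a separate value, which suggests that somewhere along the way the UTF-8 bytes are being treated as single-byte ISO-8859-1 characters.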

Here's a test snippet which proves this:

String s = new String("Упячка".getBytes("UTF-8"), "ISO-8859-1"); // Encode as UTF-8 bytes, then (incorrectly) decode them as ISO-8859-1.
for (char c : s.toCharArray()) {
    System.out.printf("\\x%X", (int) c);
}

Which prints

\xD0\xA3\xD0\xBF\xD1\x8F\xD1\x87\xD0\xBA\xD0\xB0
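A slightly fuller sketch of the same experiment, assuming Java 8 or later. The class and method names (EncodingDemo, codePoints, utf8Bytes, repair) are made up for illustration and the word is written with Unicode escapes so the result does not depend on the source file encoding. It prints the code points and the UTF-8 bytes separately, and shows that text mis-decoded as ISO-8859-1 can be recovered by reversing the step:

```java
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    // Formats every Unicode code point of the string as U+XXXX.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    // Formats every byte of the UTF-8 encoding as \xNN.
    static String utf8Bytes(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Reverses the mistake: turn the mis-decoded characters back into the
    // original bytes via ISO-8859-1, then decode those bytes as UTF-8.
    static String repair(String misdecoded) {
        byte[] original = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String word = "\u0423\u043F\u044F\u0447\u043A\u0430"; // Упячка

        System.out.println(codePoints(word)); // U+0423 U+043F U+044F U+0447 U+043A U+0430
        System.out.println(utf8Bytes(word));  // \xD0\xA3\xD0\xBF\xD1\x8F\xD1\x87\xD0\xBA\xD0\xB0

        // Simulate the wrong decoding step, then undo it.
        String misdecoded = new String(word.getBytes(StandardCharsets.UTF_8),
                                       StandardCharsets.ISO_8859_1);
        System.out.println(misdecoded.length());              // 12 characters instead of 6
        System.out.println(repair(misdecoded).equals(word));  // true
    }
}
```

The repair step only works because ISO-8859-1 maps every byte value to exactly one character, so no information is lost. It is a diagnostic aid rather than a fix; the encoding still has to be set correctly at the point where the text enters the application.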

So your problem needs to be solved one step earlier. Since you're talking about a Java web application and this string is likely the result of user input, are you sure that you have taken care of the HTTP request and response encodings? First, you need to add the following to the top of the JSP:

<%@ page pageEncoding="UTF-8" %>

This not only renders the page in UTF-8, but also implicitly sets an HTTP Content-Type response header telling the client that the page is rendered using UTF-8, so that the client knows it should display any content and process any forms using the same encoding.

Now the HTTP request part. For GET requests you need to configure the servlet container in question. In Tomcat, for example, this is a matter of setting the URIEncoding attribute of the HTTP connector in /conf/server.xml accordingly. For POST requests this should already be taken care of by the client (the web browser) being smart enough to use the response encoding specified in the JSP. If it isn't, you'll need to bring in a Filter which checks and sets the request encoding.
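To see why the request encoding matters, here is a minimal sketch that uses only the standard java.net classes rather than a real servlet container. The class and method names (RequestEncodingDemo, submit, receive) are hypothetical. It percent-encodes the word the way a browser would when submitting from a UTF-8 page, then decodes the raw value once as UTF-8 and once as ISO-8859-1:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class RequestEncodingDemo {

    // Percent-encodes a value as a browser does for a form on a UTF-8 page.
    static String submit(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Decodes a raw parameter value with the charset the server side assumes.
    static String receive(String raw, String assumedCharset) {
        try {
            return URLDecoder.decode(raw, assumedCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + assumedCharset, e);
        }
    }

    public static void main(String[] args) {
        String word = "\u0423\u043F\u044F\u0447\u043A\u0430"; // Упячка
        String raw = submit(word);

        System.out.println(raw); // %D0%A3%D0%BF%D1%8F%D1%87%D0%BA%D0%B0

        // Decoding with the charset the browser used gives the word back.
        System.out.println(receive(raw, "UTF-8").equals(word)); // true

        // Decoding with ISO-8859-1 turns every byte into its own character.
        String wrong = receive(raw, "ISO-8859-1");
        System.out.println(wrong.equals(word)); // false
        System.out.println(wrong.length());     // 12
    }
}
```

In a real application the decoding is done by the container, so the equivalent of the second argument to receive is what the URIEncoding attribute (for GET) or the request encoding set in a Filter (for POST) controls.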

For more background information you may find this article useful.


Apart from all this, MySQL has another problem with Unicode characters. Its utf8 character set only supports UTF-8 characters of up to 3 bytes, not 4 bytes. In other words, only the Basic Multilingual Plane (BMP, the 65,536 code points up to U+FFFF) is supported; anything outside it is not. PostgreSQL, for example, supports the full range. This may not hurt your web application, but it is certainly something to keep in mind.
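If you want to know in advance whether a given input would run into the 3-byte limit, a check along the following lines could be used. It is only a sketch, assuming Java 8 or later; the class and method names (BmpCheck, fitsInThreeByteUtf8) are invented for this example. A code point above U+FFFF is what Java calls a supplementary code point, and those are exactly the ones that need 4 bytes in UTF-8:

```java
import java.nio.charset.StandardCharsets;

public class BmpCheck {

    // True if every code point lies in the BMP (U+0000..U+FFFF),
    // i.e. none of them needs a 4-byte UTF-8 sequence.
    static boolean fitsInThreeByteUtf8(String s) {
        return s.codePoints().noneMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        String cyrillic = "\u0423\u043F\u044F\u0447\u043A\u0430"; // Упячка, 2 bytes per character
        String emoji = "\uD83D\uDE00";                            // U+1F600, outside the BMP

        System.out.println(fitsInThreeByteUtf8(cyrillic)); // true
        System.out.println(fitsInThreeByteUtf8(emoji));    // false

        // The byte counts confirm it: 6 characters * 2 bytes, and 1 character * 4 bytes.
        System.out.println(cyrillic.getBytes(StandardCharsets.UTF_8).length); // 12
        System.out.println(emoji.getBytes(StandardCharsets.UTF_8).length);    // 4
    }
}
```

The Cyrillic text from the question passes this check, so the 3-byte limit is not what causes the error reported above.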
