JDBC-ODBC Bridge queries to Access fail when they have accented characters


Problem description

I'm sending a query from Java to an Access database through the JDBC-ODBC Bridge, like this:

"SELECT * FROM localities WHERE locName='" + cityName + "'"

When cityName is a plain string with no accented characters, the result set is correct. But when cityName is something like LEÓN or SAHAGÚN, with accented characters in it, I get no results. The query seems to fail in these cases. The same queries work fine when run in MS Access itself. I also tried the MS Data Access SDK, and the queries work perfectly there too.

They only fail when passing through the JDBC-ODBC Bridge. As I understand it, Java uses UTF-8 for strings and so does Access, and both use Unicode. Does anyone know a solution to this problem?

Recommended answer

It sounds like your Java source file is encoded as UTF-8, so when the cityName string contains LEÓN it is encoded as

L  E  Ó     N
-- -- ----- --
4C 45 C3 93 4E

That is not how Access stores the value. Access does store characters as Unicode, but it does not use UTF-8 encoding. It uses a variation of UTF-16LE encoding in which characters with code points U+00FF and below are stored in a single byte, and characters with code points above U+00FF are stored as a Null (0x0) value followed by their UTF-16LE byte pair(s). In this case Ó is U+00D3, which is below U+00FF, so Access stores all four characters of the string as single bytes:

L  E  Ó  N
-- -- -- --
4C 45 D3 4E

The net effect is that the encoding of the string in the Access database is the same as it would be for the ISO 8859-1 character set.
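The two byte sequences above can be reproduced in plain Java without touching a database. The following is a minimal sketch; the class name EncodingDemo and the helper toHex are made up for illustration, and the string uses a \u00D3 escape so the result does not depend on how the source file itself is encoded.

```java
import java.nio.charset.StandardCharsets;

class EncodingDemo {

    // Render a byte array as space-separated uppercase hex pairs.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so bytes above 0x7F are not printed as negative values.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // \u00D3 is the letter Ó; the escape keeps the demo independent of the file encoding.
        String cityName = "LE\u00D3N";
        // UTF-8 needs two bytes for Ó.
        System.out.println("UTF-8:      " + toHex(cityName.getBytes(StandardCharsets.UTF_8)));
        // ISO 8859-1 needs one byte for Ó, matching what Access stores for this string.
        System.out.println("ISO 8859-1: " + toHex(cityName.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

Running main prints the five-byte UTF-8 form (4C 45 C3 93 4E) and the four-byte ISO 8859-1 form (4C 45 D3 4E) shown in the tables above.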

This can be confirmed with the following Java code, which uses the JDBC-ODBC Bridge. It fails to find the desired record when the Java source file is encoded as UTF-8, but it works when the Java source file is encoded as cp1252 in Eclipse:

import java.sql.*;

public class accentTestMain {

    public static void main(String[] args) {
        // Backslashes in the Windows path must be doubled inside a Java string literal
        String connectionString = 
                "jdbc:odbc:Driver={Microsoft Access Driver (*.mdb, *.accdb)};" + 
                "DBQ=C:\\__tmp\\test\\accented.accdb;";
        // try-with-resources closes the connection and statement automatically
        try (Connection con = DriverManager.getConnection(connectionString);
             PreparedStatement stmt = con.prepareStatement(
                     "SELECT * FROM localities WHERE locName=?")) {
            String cityName = "LEÓN";
            stmt.setString(1, cityName);
            try (ResultSet rs = stmt.executeQuery()) {
                if (rs.next()) {
                    System.out.println(String.format("Record found, ID=%d", rs.getInt("ID")));
                }
                else {
                    System.out.println("Record not found.");
                }
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }

}

If you can make do with supporting only the accented characters represented in the cp1252 character set, then you should be able to simply use cp1252 as the encoding setting for your Java source file(s).
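Whether cp1252 is enough depends on the data. One way to check is to ask the charset's encoder whether it can represent a given value. This is only a sketch: the class name Cp1252Check and the method fitsCp1252 are hypothetical, and "windows-1252" is the name the Java platform uses for cp1252.

```java
import java.nio.charset.Charset;

class Cp1252Check {

    // True if every character of s has a representation in cp1252 (windows-1252).
    static boolean fitsCp1252(String s) {
        return Charset.forName("windows-1252").newEncoder().canEncode(s);
    }

    public static void main(String[] args) {
        // LEÓN: Ó (U+00D3) is part of cp1252.
        System.out.println(fitsCp1252("LE\u00D3N"));
        // ŁÓDŹ: Ł (U+0141) and Ź (U+0179) are not part of cp1252.
        System.out.println(fitsCp1252("\u0141\u00D3D\u0179"));
    }
}
```

The first call prints true and the second prints false. If values like the second one can appear in the locName column, the cp1252 workaround will not cover them.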

On the other hand, if you really need full Unicode character support with an Access database, then the JDBC-ODBC Bridge is not going to get the job done for you. This is a long-standing interoperability issue between the JDBC-ODBC Bridge and the Access ODBC driver, and it is not going to be fixed. (More details here.)

In that case you might want to consider using UCanAccess, which is a pure-Java JDBC driver for Access. The corresponding code using UCanAccess with a UTF-8 encoded source file would be

// assumes...
//     import java.sql.*;
Connection conn=DriverManager.getConnection(
        "jdbc:ucanaccess://C:/__tmp/test/accented.accdb");
PreparedStatement ps = conn.prepareStatement(
        "SELECT ID FROM localities WHERE locName=?");
ps.setString(1, "LEÓN");
ResultSet rs = ps.executeQuery();
if (rs.next()) {
    System.out.println(String.format(
            "Record found, ID=%d", 
            rs.getInt("ID")));
}
else {
    System.out.println("Record not found.");
}

For more information on using UCanAccess, see the related question here.

Another solution would be to use Jackcess to manipulate the Access database like so (again, the Java source file is encoded as UTF-8):

import java.io.File;
import java.io.IOException;
import com.healthmarketscience.jackcess.*;

public class accentTestMain {

    public static void main(String[] args) {
        Database db;
        try {
            db = DatabaseBuilder.open(new File("C:\\__tmp\\test\\accented.accdb"));
            try {
                Table tbl = db.getTable("localities");
                Cursor crsr = CursorBuilder.createCursor(tbl.getIndex("locName"));
                if (crsr.findFirstRow(tbl.getColumn("locName"), "LEÓN")) {
                    System.out.println(String.format("Record found, ID=%d", crsr.getCurrentRowValue(tbl.getColumn("ID"))));
                }
                else {
                    System.out.println("Record not found.");
                }
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                db.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

}
