Oracle Text will not work with NVARCHAR2. What else might be unavailable?


Problem description



    We are going to migrate an application so that it supports Unicode, and we have to choose between a Unicode character set for the whole database and Unicode columns stored in N[VAR]CHAR2.

    We know that if we choose NVARCHAR2 we will no longer be able to index column contents with Oracle Text, because Oracle Text can only index columns based on the CHAR type.

    Apart from that, are other major differences likely to arise when making use of Oracle's features?

    Also, is it likely that newer versions of Oracle will add features that support only CHAR columns or only NCHAR columns, but not both?

    Thank you for your answers.

    Note following Justin's answer:

    Thank you for your answer. I will discuss your points, applied to our case:

    Our application is usually alone on the Oracle database and takes care of the data itself. Other software that connects to the database is limited to Toad, Tora or SQL Developer.

    We also use SQL*Loader and SQL*Plus to communicate with the database for basic statements or to upgrade between versions of the product. We have not heard of any specific problem with any of that software regarding NVARCHAR2.

    We are also not aware of database administrators among our customers who would want to use other tools on the database that cannot handle NVARCHAR2 data, and we are not really concerned about whether their tools might break; after all, they are skilled at their job and can find other tools if necessary.

    Your last two points are more insightful for our case. We do not use many built-in packages from Oracle but it still happens. We will explore that problem.

    Could we also expect a performance hit if our application (compiled with Visual C++), which uses wchar_t to store UTF-16, has to perform encoding conversions on all processed data?

    Solution

    If you have anything close to a choice, use a Unicode character set for the entire database. Life in general is just blindingly easier that way.

    • There are plenty of third-party utilities and libraries that simply don't support NCHAR/NVARCHAR2 columns or that don't make working with NCHAR/NVARCHAR2 columns pleasant. It's extremely annoying, for example, when your shiny new reporting tool can't report on your NVARCHAR2 data.
    • For custom applications, working with NCHAR/NVARCHAR2 columns requires jumping through some hoops that working with CHAR/VARCHAR2 Unicode-encoded columns does not. In JDBC code, for example, you'd constantly be calling the OraclePreparedStatement.setFormOfUse method. Other languages and frameworks will have other gotchas; some will be relatively well documented and minor, others will be relatively obscure.
    • Many built-in packages will only accept (or return) a VARCHAR2 rather than an NVARCHAR2. You'll still be able to call them because of implicit conversion, but you may end up with character set conversion issues.
    • In general, being able to avoid character set conversion issues within the database, and relegating those issues to the edge where the database is actually sending data to or receiving data from a client, makes the job of developing an application much easier. It's enough work to debug character set conversion issues that result from network transmission. Figuring out that some data got corrupted when a stored procedure concatenated data from a VARCHAR2 and an NVARCHAR2 and stored the result in a VARCHAR2 before it was sent over the network can be excruciating.
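The corruption scenario in the last bullet can be imitated outside the database. The sketch below is only an analogy in Python, not Oracle code: it assumes a hypothetical non-Unicode database character set equivalent to Windows-1252, and Python's `errors="replace"` stands in for the replacement character Oracle substitutes when a character has no mapping. The function name `store_in_varchar2` is made up for illustration.

```python
# Analogy only: Python codecs stand in for Oracle's implicit
# NVARCHAR2 -> VARCHAR2 conversion into the database character set.
DB_CHARSET = "cp1252"  # assumption: a non-Unicode database character set


def store_in_varchar2(text: str, db_charset: str = DB_CHARSET) -> str:
    """Round-trip text through the database character set.

    Characters that the character set cannot represent are replaced,
    which is how data is silently lost.
    """
    return text.encode(db_charset, errors="replace").decode(db_charset)


varchar2_part = "Customer: "   # fits the database character set
nvarchar2_part = "東京 Müller"  # only partly representable in cp1252

# Concatenate both values and store the result in a "VARCHAR2".
result = store_in_varchar2(varchar2_part + nvarchar2_part)
print(result)

# With a Unicode database character set the same value survives intact.
print(store_in_varchar2(varchar2_part + nvarchar2_part, "utf-8"))
```

The first call loses the two Japanese characters while the accented Latin character survives, which is exactly the kind of partial damage that is hard to trace once the value has travelled further.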

    Oracle designed the NCHAR/NVARCHAR2 data types for cases where you are trying to support legacy applications that don't support Unicode in the same database as new applications that are using Unicode, and for cases where it is beneficial to store some Unicode data with a different encoding (e.g. you have a large amount of Japanese data that you would prefer to store using the UTF-16 encoding in an NVARCHAR2 rather than the UTF-8 encoding). If you are not in one of those two situations, and it doesn't sound like you are, I would avoid NCHAR/NVARCHAR2 at all costs.
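The storage argument for Japanese data comes down to bytes per character. As a rough illustration, the following sketch compares the encoded size of a string in UTF-8 (what AL32UTF8 uses) and UTF-16 (what AL16UTF16 uses). It measures only the encoded bytes, not any per-row or per-column overhead in the database, and the sample strings are arbitrary.

```python
def encoded_sizes(text: str) -> dict:
    """Return the number of bytes needed to encode text in UTF-8 and UTF-16."""
    return {
        "utf8_bytes": len(text.encode("utf-8")),
        "utf16_bytes": len(text.encode("utf-16-be")),  # big-endian, no byte-order mark
    }


japanese = "日本語のテキスト"  # 8 characters, each 3 bytes in UTF-8
english = "plain text"        # 10 characters, each 1 byte in UTF-8

print(encoded_sizes(japanese))
print(encoded_sizes(english))
```

Japanese text shrinks by a third in UTF-16, while ASCII-heavy text doubles in size, so the benefit only applies when the data is dominated by such characters.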

    Responding to your follow-ups

    "Our application is usually alone on the Oracle database and takes care of the data itself. Other software that connects to the database is limited to Toad, Tora or SQL Developer."

    What do you mean "takes care of the data itself"? I'm hoping you're not saying that you've configured your application to bypass Oracle's character set conversion routines and that you do all the character set conversion yourself.

    I'm also assuming that you are using some sort of API/library to access the database, even if that is OCI. Have you looked into what changes you'll need to make to your application to support NCHAR/NVARCHAR2 and whether the API you're using supports NCHAR/NVARCHAR2? The fact that you're getting Unicode data in C++ doesn't actually indicate that you won't need to make (potentially significant) changes to support NCHAR/NVARCHAR2 columns.

    "We also use SQL*Loader and SQL*Plus to communicate with the database for basic statements or to upgrade between versions of the product. We have not heard of any specific problem with any of that software regarding NVARCHAR2."

    Those applications all work with NCHAR/NVARCHAR2. NCHAR/NVARCHAR2 introduce some additional complexities into scripts, particularly if you are trying to encode string constants that are not representable in the database character set. You can certainly work around the issues, though.
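One workaround for string constants in scripts is Oracle's UNISTR function, which accepts backslash-escaped UTF-16 code units so that the script file itself can stay plain ASCII. The helper below is a hypothetical sketch (the name `to_unistr_literal` is not part of any Oracle tool) that turns a Python string into such a literal. It escapes the backslash and the single quote as code units too, to sidestep quoting rules.

```python
import struct


def to_unistr_literal(text: str) -> str:
    """Build an ASCII-only UNISTR('...') literal for use in a SQL script."""
    # Encode as UTF-16 so characters outside the BMP become surrogate pairs,
    # then split the bytes into 16-bit code units.
    data = text.encode("utf-16-be")
    units = struct.unpack(f">{len(data) // 2}H", data)

    parts = []
    for unit in units:
        ch = chr(unit)
        if 0x20 <= unit < 0x7F and ch not in "\\'":
            parts.append(ch)               # printable ASCII passes through
        else:
            parts.append(f"\\{unit:04X}")  # everything else as \XXXX
    return "UNISTR('" + "".join(parts) + "')"


print(to_unistr_literal("café"))
print(to_unistr_literal("日本"))
```

The generated literal could then be pasted into an INSERT or UPDATE statement in a SQL*Plus script regardless of the script file's encoding.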

    "We are also not aware of database administrators among our customers who would want to use other tools on the database that cannot handle NVARCHAR2 data, and we are not really concerned about whether their tools might break; after all, they are skilled at their job and can find other tools if necessary."

    While I'm sure that your customers can find alternate ways of working with your data, if your application doesn't play nicely with their enterprise reporting tool or their enterprise ETL tool or whatever desktop tools they happen to be experienced with, it's very likely that the customer will blame your application rather than their tools. It probably won't be a show stopper, but there is also no benefit to causing customers grief unnecessarily. That may not drive them to use a competitor's product, but it won't make them eager to embrace your product.

    "Could we also expect a performance hit if our application (compiled with Visual C++), which uses wchar_t to store UTF-16, has to perform encoding conversions on all processed data?"

    I'm not sure what "conversions" you're talking about. This may get back to my initial question about whether you're stating that you are bypassing Oracle's NLS layer to do character set conversion on your own.

    My bottom line, though, is that I don't see any advantages to using NCHAR/NVARCHAR2 given what you're describing. There are plenty of potential downsides to using them. Even if you can dismiss 99% of the downsides as irrelevant to your particular needs, you're still facing a situation where at best it's a wash between the two approaches. Given that, I'd much rather go with the approach that maximizes flexibility going forward, and that's converting the entire database to Unicode (AL32UTF8 presumably) and just using that.
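On the conversion worry: a client holding UTF-16 in wchar_t and a database storing AL32UTF8 are using two encodings of the same Unicode repertoire, so converting between them loses nothing. The sketch below illustrates that with Python codecs standing in for the client and database sides. The function names are hypothetical, and the sketch says nothing about the cost of the conversion, which would have to be measured in the real application.

```python
def client_to_db(wchar_bytes: bytes) -> bytes:
    """Convert UTF-16LE bytes (as in a Windows wchar_t buffer) to UTF-8."""
    return wchar_bytes.decode("utf-16-le").encode("utf-8")


def db_to_client(utf8_bytes: bytes) -> bytes:
    """Convert UTF-8 bytes coming back from the database to UTF-16LE."""
    return utf8_bytes.decode("utf-8").encode("utf-16-le")


# Sample buffer mixing Latin, Japanese and a character outside the BMP.
original = "Müller 東京 😀".encode("utf-16-le")

stored = client_to_db(original)
round_tripped = db_to_client(stored)

print(round_tripped == original)
```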
