Oracle Text CONTAINS and technical content

Question

I am searching for the technical word "AN-XYZ99", so I use

SELECT *
FROM foo
WHERE CONTAINS(bar, 'AN{-}XYZ99') > 0

but I also get results like "FO-XYZ99" or "BAR-XYZ99". What can I do to get only the expected result?

I created the lexer preference like this:

BEGIN
    CTX_DDL.CREATE_PREFERENCE('FOO','BASIC_LEXER');
    CTX_DDL.SET_ATTRIBUTE('FOO', 'ALTERNATE_SPELLING', 'GERMAN');
    CTX_DDL.SET_ATTRIBUTE('FOO', 'COMPOSITE', 'GERMAN');
    CTX_DDL.SET_ATTRIBUTE('FOO', 'MIXED_CASE', 'NO');
END;

Sample data from column "bar" (VARCHAR2(4000)):

"unbekannt Stadt Text: AN-XYZ99 << foobar Straße 31.12.2017 Datum Host 20160101 foo"
"unbekannt Stadt Text: FO-XYZ99 << foobar Straße 31.12.2017 Datum Host 20160101 bar"
"unbekannt Stadt Text: BAR-XYZ99 << foobar Straße 31.12.2017 Datum Host 20160101 bla"

With the statement above I expect only the first row as output, but I get the second and third rows as well.

Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

Recommended answer

First, you must define the hyphen as a printjoin in your lexer.
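A minimal sketch of that step, assuming the lexer preference FOO from the question already exists and that the Oracle Text index on foo(bar) is called bar_ctx_idx (a hypothetical name; substitute your own). It sets only the hyphen as a printjoin; add further characters to the attribute value if you need them.

```sql
BEGIN
    -- Keep '-' inside tokens so that AN-XYZ99 is indexed as a single token.
    CTX_DDL.SET_ATTRIBUTE('FOO', 'PRINTJOINS', '-');
END;
/

-- Re-create the index so existing rows are tokenized again with the changed lexer.
-- bar_ctx_idx is a hypothetical index name.
DROP INDEX bar_ctx_idx;

CREATE INDEX bar_ctx_idx ON foo (bar)
    INDEXTYPE IS CTXSYS.CONTEXT
    PARAMETERS ('LEXER FOO');
```

The index has to be rebuilt because a lexer change does not affect tokens that are already stored in the index.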

You can check the lexer attributes with:

select IXV_ATTRIBUTE, IXV_VALUE from CTXSYS.CTX_INDEX_VALUES where IXV_CLASS =  'LEXER';

IXV_ATTRIBUTE                  IXV_VALUE     
-----------------------------------------
PRINTJOINS                     _$%&-         
NUMJOIN                        .              
NUMGROUP                       .              
WHITESPACE                     ,= 

Then, after re-creating the index with this lexer, you can verify that the tokens are as expected (the token table name depends on the index name; check all tables like 'DR$%$I'):

select TOKEN_TEXT from DR$TEXTIDX_IDX$I where TOKEN_TEXT like '%-XYZ99';
TOKEN_TEXT                                                     
----------------------------------------------------------------
AN-XYZ99                                                         
BAR-XYZ99                                                        
FO-XYZ99
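If you are not sure what the token table of your index is called, a query along these lines may help. It is only a sketch and assumes the index is owned by the current user; for another schema, ALL_TABLES with an OWNER filter would be the equivalent.

```sql
-- List the token tables of the Oracle Text indexes owned by the current user.
-- Token tables are named DR$<index name>$I.
SELECT table_name
FROM   user_tables
WHERE  table_name LIKE 'DR$%$I';
```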

Now you can query for the search string.

Apparently you must escape the hyphen, because the unescaped query BAR-XYZ99 finds rows that contain BAR but not XYZ99, although the documentation describes a hyphen without surrounding spaces somewhat differently.

SELECT SCORE(1),txt
FROM textidx
WHERE  CONTAINS(txt, 'BAR-XYZ99',1) > 0; 

  SCORE(1) TXT                                                                                
---------- ------------------------------------------------------------------------------------
         4 unbekannt Stadt Text: FO-XYZ99 << foobar Straße 31.12.2017 Datum Host 20160101 bar

For some reason (I'm on 11.2.0.2.0), escaping with curly braces doesn't work (it returns no match), but using a backslash is fine.

SELECT SCORE(1),txt
FROM textidx
WHERE  CONTAINS(txt, 'BAR\-XYZ99',1) > 0;  

  SCORE(1) TXT                                                                                
---------- ------------------------------------------------------------------------------------
         4 unbekannt Stadt Text: BAR-XYZ99 << foobar Straße 31.12.2017 Datum Host 20160101 bla 
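Applied to the table from the question, the same approach might look like the sketch below. It assumes the index on foo(bar) has been re-created with the hyphen defined as a printjoin; under that assumption the query should match only rows containing the token AN-XYZ99.

```sql
-- Backslash-escape the hyphen so it is treated as part of the search term
-- rather than as a query operator.
SELECT *
FROM   foo
WHERE  CONTAINS(bar, 'AN\-XYZ99') > 0;
```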
