Phone number lookups using incoming numbers of varying lengths, with or without prefixes


Problem description

Systems:
- Call center phone system: incoming calls are accompanied by the source number, aka Caller ID, aka "ANI".
- SQL Server 2005: the "data warehouse", which stores customer phone numbers ([cANI]), customer names, locations, etc. in a table [CustDataByANI].
- Stored procedure: the call center software passes the Caller ID (aka ANI) to an SP as a parameter, which uses it to perform a presently simple SELECT statement on the CustDataByANI table ... WHERE [cANI] = @ANI.

This works wonderfully, provided the query finds an exact match between @ANI and one of the 686K cANI values contained in the lookup table. This happens only about 12% of the time.

Goal: to increase the number of successful "likely/probable" matches

Important note: we are using a global data set and cannot enforce rules regarding the length of either value (the parameter @ANI or the values within [cANI]).

Case 1:
The phone system transmits the source number '9876543210', which is used as the parameter @ANI.
That exact number exists within the [cANI] column of the CustDataByANI table (record # 55555).
The SELECT statement returns the values from a number of other columns affiliated with record 55555.
Super-easy: WHERE [cANI] = @ANI is successful.

Case 2:
@ANI = '19876543210' (same as above but with a leading '1')
No exact match found within CustDataByANI.cANI
Closest match in [cANI] is '9876543210' (still record 55555)
Even a child would recognize that the only difference from Case 1 is the presence of a 1-digit 'prefix' in the parameter @ANI; perhaps it is a long-distance 'tag' or a country code.
Such prefixes could be 1, 2, or even 3 digits in length; we can't predict which. We do not want to consider prefixes longer than 3 digits, but in this case we do want to return the values from record 55555, as in Case 1.

Case 3: the 'reverse' of Case 2
@ANI = '9876543210'
No exact match found within CustDataByANI.cANI
Closest match in [cANI] is '19876543210' (record # 55555 now has a '1' prefix)
Again, we would assume that the two are substantially equivalent. In this case, the [cANI] value contains the longer sequence due to the prefix, which could be 1, 2, or even 3 digits in length; we cannot predict which. We do not want to consider prefixes longer than 3 digits, but in this case we do want to return the values from record 55555, as in Case 1.

Again, due to the possible variability in the length of each value (@ANI and [cANI]) and my near-complete lack of SQL programming experience, I cannot write a SELECT statement for the stored procedure that takes all 3 cases into account. Simple LIKE statements with wildcards seem to fail, and my head is spinning on CASE criteria, CONTAINS, and even REVERSE strategies for 'reading' both the @ANI and the cANI values in a right-to-left fashion.

My dream is to return the best probable match between the two.
My Stupid Procedure is below; any and all assistance is greatly appreciated!

BTW, my source table CustDataByANI does include a RevANI column, which is simply the cANI values in reverse. Initially I had thought that the solution might lie in reversing the @ANI parameter value and finding the greatest match within the [RevANI] column, thereby leaving any wildcards on the right of each. But I'm still stuck and am not sure if that is the best strategy.

USE [GCC]
GO
/****** Object:  StoredProcedure [dbo].[SP_GetCustDataByANI]    Script Date: 10/07/2014 07:47:34 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

ALTER PROCEDURE [dbo].[SP_GetCustDataByANI] 
    @ANI varchar(80)
AS
BEGIN

    SET NOCOUNT ON;

--Remove leading zeros from the varchar @ANI. I chose this method rather than risking
--the undesirable introduction of exponential notation when long characters are converted to
--integers and back...

    IF ((LEN(@ANI) > 1) AND (LEFT(@ANI,1)= '0'))
    BEGIN
        SET @ANI = REPLACE(LTRIM(REPLACE(@ANI,'0',' ')),' ','0')
    END

    SELECT Id
        ,cANI
        ,cServiceClass
        ,cCompanyClass
        ,cContactName
        ,cContactDivision
        ,cContactDepartment
        ,cCompanyName
        ,cOrganizationName
        ,cContactCity
        ,cContactStateTerr
        ,cContactCountry
        ,cCompanyIsDistributor
        ,PrefAgentID
        ,PrefAgentID_SQUAL
        ,PrefRegionID_SQUAL
        ,VIP_CC
        ,VIP_TS
        ,TS_ACAT
    FROM [dbo].[CustDataByANI]
    WHERE ([cANI] = @ANI)

END

Recommended answer

If you want to make the queries faster, you can create an additional column with the phone number in reverse order, create an index on this column, and then query for the number using a LIKE predicate, passing the searched-for phone number in reverse order as well. This will make the queries as fast as possible. For example, for your sample data, you'd store this in the new ReversedPhoneNumber column:

6543211234     store as: 4321123456
16543211234    store as: 43211234561 
0016543211234  store as: 4321123456100
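
As a rough sketch of how such a column might be set up, the following uses a throwaway temp table rather than the real CustDataByANI. The table name #PhoneDemo, the index name, and the varchar(80) type for cANI are assumptions made for illustration; the question mentions an existing RevANI column, which could play the same role. The syntax is kept to what SQL Server 2005 accepts.

```sql
-- Hypothetical demo table; in practice the column would live on CustDataByANI.
CREATE TABLE #PhoneDemo (
    Id   int         NOT NULL PRIMARY KEY,
    cANI varchar(80) NOT NULL,
    -- Computed from cANI and persisted, so it cannot drift out of sync with it.
    ReversedPhoneNumber AS REVERSE(cANI) PERSISTED
);

-- Index the reversed value so a "starts with" LIKE can seek on it.
CREATE INDEX IX_PhoneDemo_Reversed ON #PhoneDemo (ReversedPhoneNumber);

-- The three sample numbers from the answer (single-row inserts for SQL Server 2005).
INSERT INTO #PhoneDemo (Id, cANI) SELECT 1, '6543211234';
INSERT INTO #PhoneDemo (Id, cANI) SELECT 2, '16543211234';
INSERT INTO #PhoneDemo (Id, cANI) SELECT 3, '0016543211234';

-- Show each number next to its reversed form.
SELECT cANI, ReversedPhoneNumber FROM #PhoneDemo ORDER BY Id;

DROP TABLE #PhoneDemo;
```

A persisted computed column is only one option; populating a plain column during the load would work as well.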

When you need to query by 6543211234, simply reverse it and search the reversed column like this:

WHERE ReversedPhoneNumber LIKE '4321123456%'

This would match the number in whichever format it's stored, and very quickly, because it's a simple and fast index seek. (A LIKE that amounts to a "starts with" operation can seek the index to find matches.)
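
One possible way to cover all three cases from the question in a single query is sketched below. It is not the answerer's code: the table variable @Cust, the second sample row, and the choice to rank candidates by length difference are assumptions for illustration. It also assumes that @ANI contains only digits (so it holds no LIKE wildcard characters) and that prefixes of at most 3 digits should be tolerated on either side, as the question states.

```sql
-- Hypothetical stand-in for CustDataByANI with a reversed-number column.
DECLARE @Cust TABLE (
    Id                  int         NOT NULL,
    cANI                varchar(80) NOT NULL,
    ReversedPhoneNumber varchar(80) NOT NULL
);
INSERT INTO @Cust (Id, cANI, ReversedPhoneNumber)
SELECT 55555, '9876543210', REVERSE('9876543210');
INSERT INTO @Cust (Id, cANI, ReversedPhoneNumber)
SELECT 55556, '445550001111', REVERSE('445550001111');

DECLARE @ANI    varchar(80);
DECLARE @RevANI varchar(80);
SET @ANI    = '19876543210';   -- Case 2 input: the stored number plus a '1' prefix
SET @RevANI = REVERSE(@ANI);

SELECT TOP 1 Id, cANI
FROM @Cust
WHERE
    -- Cases 1 and 3: the stored value is @ANI itself,
    -- or @ANI with up to 3 extra leading digits.
    (ReversedPhoneNumber LIKE @RevANI + '%'
        AND LEN(cANI) - LEN(@ANI) <= 3)
    -- Case 2: @ANI is the stored value with 1 to 3 extra leading digits,
    -- so try @ANI with its first 1, 2, or 3 characters removed.
    OR cANI IN (SUBSTRING(@ANI, 2, 80),
                SUBSTRING(@ANI, 3, 80),
                SUBSTRING(@ANI, 4, 80))
ORDER BY ABS(LEN(cANI) - LEN(@ANI));   -- closest length first, so an exact match wins
```

Setting @ANI to '9876543210' against a stored '19876543210' would exercise the first branch instead. Whether the optimizer actually seeks an index for both branches of the OR would need to be checked against the real execution plan, and for very short inputs a minimum-length guard on the stripped values may be worth adding.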

As to the additional rules that you need to apply, you know the data better than we do. Simply think through all the possible cases and run a few tests, and you'll arrive at the rules you need to apply (not to speed things up, but to ensure a correct match).

You can reverse the phone number in the ETL process.

Some details are missing that would allow me to give better advice.

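Note: if you cannot add the column and index to the existing table, simply create a separate table that holds the reversed numbers and is related to the existing table. You can also add a trigger to maintain this table.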
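
A minimal sketch of that side-table approach follows. The names dbo.CustANIReversed, IX_CustANIReversed, and trg_CustDataByANI_Reversed are hypothetical, and the sketch assumes that Id is a unique int key on CustDataByANI and that cANI fits in varchar(80); neither is confirmed by the question.

```sql
-- Hypothetical side table holding one reversed number per customer row.
CREATE TABLE dbo.CustANIReversed (
    Id                  int         NOT NULL PRIMARY KEY,  -- assumed to match CustDataByANI.Id
    ReversedPhoneNumber varchar(80) NOT NULL
);
CREATE INDEX IX_CustANIReversed ON dbo.CustANIReversed (ReversedPhoneNumber);
GO

-- One-time backfill from the existing data.
INSERT INTO dbo.CustANIReversed (Id, ReversedPhoneNumber)
SELECT Id, REVERSE(cANI)
FROM dbo.CustDataByANI
WHERE cANI IS NOT NULL;
GO

-- Keep the side table in step with later changes to the source table.
CREATE TRIGGER dbo.trg_CustDataByANI_Reversed
ON dbo.CustDataByANI
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Drop the old entries for rows that were deleted or updated.
    DELETE r
    FROM dbo.CustANIReversed AS r
    INNER JOIN deleted AS d ON d.Id = r.Id;

    -- Add fresh entries for rows that were inserted or updated.
    INSERT INTO dbo.CustANIReversed (Id, ReversedPhoneNumber)
    SELECT i.Id, REVERSE(i.cANI)
    FROM inserted AS i
    WHERE i.cANI IS NOT NULL;
END
GO
```

Lookups would then join dbo.CustANIReversed back to CustDataByANI on Id to fetch the customer columns.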
