Data matching algorithm

Views: 618 Published: 2020/5/30 21:18:05 Tags: .net algorithm design-patterns

Problem description

I am currently working on a project where a data matching algorithm needs to be implemented. An external system passes in all the data it knows about a customer, and the system I am designing has to return the matching customer. The external system then knows the correct ID of the customer, and it either receives additional data or can update its own data for that specific customer.

The following fields are passed:

Name
Name2
Street
City
ZipCode
BankAccountNumber
BankName
BankCode
Email
Phone
Fax
Web

The data can be of high quality, with a lot of information available, but often the data is poor: only the name and address are available, and they may contain misspellings.

I'm implementing the project in .NET. What I currently do is something like the following:

public bool IsMatch(Customer customer)
{
    // The CanIdentify... methods just check that the info is provided and has a minimum length (e.g. > 1)
    if (CanIdentifyByStreet() && CanIdentifyByBankAccountNumber())
    {
        // some parsing of strings done before (substring, etc.)
        if (Street == customer.Street && AccountNumber == customer.BankAccountNumber)
            return true;
    }
    if (CanIdentifyByStreet() && CanIdentifyByZipCode() && CanIdentifyByName())
    {
        ...
    }

    // None of the combinations matched.
    return false;
}

I am not very happy with the approach above. This is because I would have to write if statements for all reasonable cases (combinations) so I don't miss any chance of matching the entity.
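
To make the combination problem concrete, here is a minimal sketch of how the cases could at least be kept in one place as a list of rules instead of a chain of if statements. The reduced `Customer` type, the `RuleMatcher` class and the `Usable` and `Same` helpers are hypothetical names introduced for illustration, and the comparison inside the second rule is an assumption because that body is elided in the code above.

```csharp
using System;
using System.Linq;

// Hypothetical, reduced customer record; only four of the passed fields are modeled.
public sealed class Customer
{
    public string Name { get; set; }
    public string Street { get; set; }
    public string ZipCode { get; set; }
    public string BankAccountNumber { get; set; }
}

public static class RuleMatcher
{
    // A field is usable when it is provided and longer than one character,
    // mirroring the CanIdentify... checks.
    private static bool Usable(string value) =>
        !string.IsNullOrWhiteSpace(value) && value.Trim().Length > 1;

    // Two values agree when both are usable and equal, ignoring case and outer whitespace.
    private static bool Same(string a, string b) =>
        Usable(a) && Usable(b) &&
        string.Equals(a.Trim(), b.Trim(), StringComparison.OrdinalIgnoreCase);

    // Each rule is one combination of fields treated as sufficient on its own.
    private static readonly Func<Customer, Customer, bool>[] Rules =
    {
        // Street + bank account number (the first case above).
        (x, y) => Same(x.Street, y.Street) && Same(x.BankAccountNumber, y.BankAccountNumber),
        // Street + zip code + name (assumed comparison; the original body is elided).
        (x, y) => Same(x.Street, y.Street) && Same(x.ZipCode, y.ZipCode) && Same(x.Name, y.Name),
    };

    // A match is declared as soon as any single rule is satisfied.
    public static bool IsMatch(Customer incoming, Customer stored) =>
        Rules.Any(rule => rule(incoming, stored));
}
```

Adding a case then means adding one entry to `Rules`, but every reasonable combination still has to be listed by hand, which is the drawback described above.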

So I thought maybe I could create some kind of matching score, where each criterion that matches adds to the score. Like:

public bool IsMatch(Customer customer)
{
    int matchingScore = 0;
    if (CanIdentifyByStreet())
    {
        if (....)
            matchingScore += 10;
    }
    if (CanIdentifyByName())
    {
        if (....)
            matchingScore += 10;
    }
    if (CanIdentifyByBankAccountNumber())
    {
        if (....)
            matchingScore += 10;
    }

    // iDontKnow is the threshold, which I have not determined yet.
    return matchingScore > iDontKnow;
}

This would allow me to take all matching data into consideration, and I would increase the matching score by a weight that depends on the field. If the score is high enough, it's a match.
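
A minimal sketch of what such a weighted score might look like, written out so the idea can be tried. Everything specific here is an assumption made for illustration: the reduced `Customer` type, the `ScoreMatcher` class, the `Same` helper, the individual weights and the threshold of 40 are all hypothetical and would have to be tuned against real data.

```csharp
using System;

// Hypothetical, reduced customer record; only three of the passed fields are modeled.
public sealed class Customer
{
    public string Name { get; set; }
    public string Street { get; set; }
    public string BankAccountNumber { get; set; }
}

public static class ScoreMatcher
{
    // Assumed weights: a bank account number is treated as stronger evidence
    // than a name or a street. These values are placeholders, not recommendations.
    private const int StreetWeight = 10;
    private const int NameWeight = 10;
    private const int BankAccountWeight = 30;

    // Assumed threshold standing in for the unknown "iDontKnow" value.
    private const int Threshold = 40;

    // Two values agree when both are provided, longer than one character,
    // and equal ignoring case and outer whitespace.
    private static bool Same(string a, string b)
    {
        if (string.IsNullOrWhiteSpace(a) || string.IsNullOrWhiteSpace(b)) return false;
        string left = a.Trim();
        string right = b.Trim();
        return left.Length > 1 &&
               string.Equals(left, right, StringComparison.OrdinalIgnoreCase);
    }

    // Sums the weights of all fields that agree between the two records.
    public static int Score(Customer incoming, Customer stored)
    {
        int score = 0;
        if (Same(incoming.Street, stored.Street)) score += StreetWeight;
        if (Same(incoming.Name, stored.Name)) score += NameWeight;
        if (Same(incoming.BankAccountNumber, stored.BankAccountNumber)) score += BankAccountWeight;
        return score;
    }

    // The records are considered a match when the score reaches the threshold.
    public static bool IsMatch(Customer incoming, Customer stored) =>
        Score(incoming, stored) >= Threshold;
}
```

With these placeholder values, agreement on name and street alone scores 20 and is not a match, while adding an agreeing bank account number raises the score to 50. Exposing `Score` separately from `IsMatch` also makes it possible to rank several candidate customers instead of only accepting or rejecting one.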

Now my question is: are there any best practices for this kind of problem, such as matching algorithm patterns? Thanks a lot!
