How to normalize company names

Question

We have user-generated employer names that come in all sorts of variations. For example, people have typed in or imported:

Google
Google, Inc.
Google Inc.
Google inc

To a database search, each of these looks like a different company altogether. We've made some changes to map each employer to a "normalized" name, but with 70,000 entries in total, it is hard to do by hand.

Does anyone have suggestions on how to normalize the existing entries, and also how to make sure all incoming names are normalized as well?

Recommended answer

There are two things you can do to help:

  • When users add a company name, give them an autocomplete box so that they get suggestions if the name already exists. Alternatively, suggest existing entries the way Stack Overflow suggests similar questions when you ask a new one.

  • Use a search tool when querying the database so that the variations of a name can be matched and summarised together. You can find search gems here: https://www.ruby-toolbox.com/categories/rails_search
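The autocomplete idea above can be sketched in plain Ruby without any gem. This is only a minimal illustration: the `normalize_key` and `suggest` helpers, the suffix list, and the sample names are assumptions made for this example and do not come from any library. A real Rails app would more likely delegate the matching to one of the search gems linked above.

```ruby
# Hypothetical helpers for illustration only; not from any gem.

# Legal suffixes to ignore when comparing names (an assumed, incomplete list).
SUFFIXES = %w[inc incorporated llc ltd corp corporation co].freeze

# Reduce a company name to a comparison key: lowercase, punctuation replaced
# by spaces, trailing legal suffixes dropped. Handles ASCII names only.
def normalize_key(name)
  words = name.downcase.gsub(/[^a-z0-9\s]/, " ").split
  words.pop while words.size > 1 && SUFFIXES.include?(words.last)
  words.join(" ")
end

# Return existing names whose key starts with the key of what the user typed.
def suggest(input, existing_names, limit: 5)
  key = normalize_key(input)
  return [] if key.empty?
  existing_names.select { |n| normalize_key(n).start_with?(key) }.first(limit)
end

existing = ["Google", "Goodyear", "Microsoft Corp."]
suggest("goog", existing)          # matches "Google" but not "Goodyear"
suggest("Google, Inc.", existing)  # also matches "Google"
```

Comparing keys rather than raw strings is what lets "Google, Inc." surface the existing "Google" entry before a new variant gets saved.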

I don't think "normalizing" them after the fact will be either easy or accurate.
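Even so, a rough comparison key can at least group likely duplicates among the existing entries, so that a person reviews a short list of clusters rather than all 70,000 rows. The following is only a sketch under that assumption: `rough_key`, `duplicate_candidates`, the suffix list, and the sample data are hypothetical, and the result is a list of candidates for manual review, not a trustworthy automatic merge.

```ruby
# Hypothetical sketch: cluster existing names by a rough key for manual review.

# Assumed, incomplete list of legal suffixes to ignore.
LEGAL_SUFFIXES = %w[inc incorporated llc ltd corp corporation co].freeze

# Lowercase, replace punctuation with spaces, drop trailing legal suffixes.
def rough_key(name)
  words = name.downcase.gsub(/[^a-z0-9\s]/, " ").split
  words.pop while words.size > 1 && LEGAL_SUFFIXES.include?(words.last)
  words.join(" ")
end

# Group names by key and keep only the groups with more than one spelling.
def duplicate_candidates(names)
  names.uniq.group_by { |n| rough_key(n) }.select { |_key, group| group.size > 1 }
end

names = ["Google", "Google, Inc.", "Google Inc.", "Google inc", "Acme Ltd"]
duplicate_candidates(names)
# groups the four Google spellings under the key "google"; "Acme Ltd" is left out
```

A key this crude will both miss real duplicates (typos, abbreviations) and merge distinct companies that share a name, which is why the clusters still need a human decision.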
