MySQL街道地址模糊搜索 [英] MySQL street address fuzzy search

查看:82
本文介绍了MySQL街道地址模糊搜索的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

有谁知道在MySQL数据库上进行模糊街道地址搜索的好脚本(或好策略)?关键问题是:
$ b $ ul

  • 大小写(简单 - 只需使用LCASE)
  • 标点符号使用REPLACE;不确定它是否有更高效的选项)

  • 缩写(这是一个艰难的 - street = st等)。 >

    我希望能够匹配:
    123 Main st,unit B = 123 Main Street Unit b

    解决方案

    尽管它并不完美并且速度很慢,但您仍然希望通过REGEXP()使用正则表达式。



    下面是第一遍正则表达式,以匹配大多数情况(以及您的示例):

     (?isx)#search across多行并忽略大小写
    (#完全匹配
    (#st数字 - 数字词怎么样像一个或两个?
    \ d +

    \ s + #whitespace
    (#街道名称(一个或多个单词)
    [az] +
    (?:
    \s +
    [az] +
    )*

    \ s +#whitespace
    (#街道类型
    al(?:y \。?| ley)#aly,aly。或胡同
    |
    ave(?:\。| nue)? #ave,ave。或avenue
    |
    b(?lvd \。?| oulevard)#blvd,blvd。或林荫大道
    |
    c(?:t \。?| ourt)#ct,ct。或法院
    |
    cir(?:\c\。?| cle)? #cir,circ,circ。或圈出
    |
    cres(?:\。| cent)? #cres,cres。或新月
    |
    dr(?:\。| ive)? #博士,博士。或驱动
    |
    exp(?:y \。?| ressway)#expy,expy。或高速公路
    |
    f(?:wy\。?| reeway)#fwy,fwy。或高速公路
    |
    g(?:rdns\。?| ardens)#grdns,grdns。或花园
    |
    h(?:wy\。?| ighway)#hwy,hwy。或高速公路
    |
    l(?n \。?| ane)#ln,ln。或土地
    |
    m(?:nr \。?| anor)#mnr,mnr。或庄园
    |
    m(?:trwy \。?| otorway)#mtrwy,wtrwy。或高速公路
    |
    pl(?:\。| ace)? #pl,pl。或放置
    |
    r(?:d \。?| oad)#rd,rd。或道路
    |
    st(?:\。| reet)? #st,st。或街道
    |
    t(?:pk\。?| urnpike)#tpk,tpk。或收费
    |
    ter(?:\r?\。?| race)#ter,ter。,terr,terr。或露台
    |
    tr(?:l.\?| ail)#trl,trl。或者追踪
    |
    派克|公园|步行|环形|海湾|关闭|大门|高地
    |
    row | way | oval | dell | rise | vale | byway | lawn

    \ ,? #可选逗号
    \s +#空格
    (#可选号码,单位,apt或floor

    \##number
    |
    unit #unit
    |
    num(?:\。| ber)#num,num。or number
    |
    ap(?:t\。?| artment)# apt,apt。or apartment
    |
    fl(?:\。| oor)?#fl,fl。or floor

    \s +
    \\ \\ d +
    )?

    哪些会返回:



    $ 1 - 完全匹配

    <2> $ 2 - 房屋编号

    $ 3 - 街道名称



    $ 4 - 街道类型

    $ 5 - 单位或合住号码(可选)

    要在mysql中使用它,您需要删除所有注释(从'#'到eol),删除第一行(切换选项),并且将所有内容全部折叠为一行而不用任何空格。


    Does anyone know of a good script (or good strategy) for doing a fuzzy street address search on a MySQL database? The key issues are:

    • capitalization (easy -- just use LCASE)
    • punctuation (could use REPLACE; not sure it there are more efficient options)
    • abbreviations (this is the tough one -- street = st, etc.)

    I want to be able to match: 123 Main st, unit B = 123 Main Street Unit b

    解决方案

    While it's not perfect and can be quite slow, you'll want to use a regular expression via REGEXP().

    Here's a first-pass regular expression to match most cases (as well as your example):

    (?isx)                  # search across multiple lines and ignore case
    (                       # full match
      (                       # st number - what about number words like one or two?
        \d+
      )
      \s+                     # whitespace
      (                       # street name (one or more words)
        [a-z]+
        (?:
          \s+
          [a-z]+
        )*
      )
      \s+                     # whitespace
      (                       # street type
        al(?:y\.?|ley)          # aly, aly. or alley
      |
        ave(?:\.|nue)?          # ave, ave., or avenue
      |
        b(?lvd\.?|oulevard)     # blvd, blvd. or boulevard
      |
        c(?:t\.?|ourt)          # ct, ct. or court
      |
        cir(?:\c\.?|cle)?       # cir, circ, circ. or circle
      |
        cres(?:\.|cent)?        # cres, cres. or crescent
      |
        dr(?:\.|ive)?           # dr, dr. or drive
      |
        exp(?:y\.?|ressway)     # expy, expy. or expressway
      |
        f(?:wy\.?|reeway)       # fwy, fwy. or freeway
      |
        g(?:rdns\.?|ardens)     # grdns, grdns. or gardens
      |
        h(?:wy\.?|ighway)       # hwy, hwy. or highway
      |
        l(?n\.?|ane)            # ln, ln. or land
      |
        m(?:nr\.?|anor)         # mnr, mnr. or manor
      |
        m(?:trwy\.?|otorway)    # mtrwy, wtrwy. or motorway
      |
        pl(?:\.|ace)?           # pl, pl. or place
      |
        r(?:d\.?|oad)           # rd, rd. or road
      |
        st(?:\.|reet)?          # st, st. or street
      |
        t(?:pk\.?|urnpike)      # tpk, tpk. or turnpike
      |
        ter(?:\r?\.?|race)      # ter, ter., terr, terr. or terrace
      |
        tr(?:l.\?|ail)          # trl, trl. or trail
      |
        pike|park|walk|loop|bay|close|gate|highlands
      |
        row|way|oval|dell|rise|vale|byway|lawn
      )
      \,?                     # optional comma
      \s+                     # whitespace
      (                       # optional number, unit, apt or floor
        (
          \#                    # number
        |
          unit                  # unit
        |
          num(?:\.|ber)         # num, num. or number
        |
          ap(?:t\.?|artment)    # apt, apt. or apartment
        |
          fl(?:\.|oor)?         # fl, fl. or floor
        )
        \s+
        \d+
      )?
    )
    

    Which will return:

    $1 - full match

    $2 - house number

    $3 - street name

    $4 - street type

    $5 - unit or apt number (optional)

    To use this in mysql, you'll need to strip out all the comments (from '#' to eol), remove the first line (switch options), and collapse everything to a single line without any whitespace.

    这篇关于MySQL街道地址模糊搜索的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

  • 查看全文
    登录 关闭
    扫码关注1秒登录
    发送“验证码”获取 | 15天全站免登陆