Elasticsearch synonym analyzer not working

Problem description

EDIT: To add to this, the synonyms do seem to work with a basic query_string query.

"query_string" : {
    "default_field" : "location.region.name.raw",
    "query" : "nh"
}

This returns all of the results for New Hampshire, but a "match" query for "nh" returns no results.


I'm trying to add synonyms to my location fields in my Elastic index, so that if I do a location search for "Mass," "Ma," or "Massachusetts" I'll get the same results each time. I added the synonyms filter to my settings and changed the mapping for locations. Here are my settings:

"analysis":{
    "analyzer":{
        "synonyms":{
            "filter":[
                "lowercase",
                "synonym_filter"
            ],
            "tokenizer": "standard"
        }
    },
    "filter":{
        "synonym_filter":{
            "type": "synonym",
            "synonyms":[
                "United States,US,USA,USA=>usa",
                "Alabama,Al,Ala,Ala",
                "Alaska,Ak,Alas,Alas",
                "Arizona,Az,Ariz",
                "Arkansas,Ar,Ark",
                "California,Ca,Calif,Cal",
                "Colorado,Co,Colo,Col",
                "Connecticut,Ct,Conn",
                "Deleware,De,Del",
                "District of Columbia,Dc,Wash Dc,Washington Dc=>Dc",
                "Florida,Fl,Fla,Flor",
                "Georgia,Ga",
                "Hawaii,Hi",
                "Idaho,Id,Ida",
                "Illinois,Il,Ill,Ills",
                "Indiana,In,Ind",
                "Iowa,Ia,Ioa",
                "Kansas,Kans,Kan,Ks",
                "Kentucky,Ky,Ken,Kent",
                "Louisiana,La",
                "Maine,Me",
                "Maryland,Md",
                "Massachusetts,Ma,Mass",
                "Michigan,Mi,Mich",
                "Minnesota,Mn,Minn",
                "Mississippi,Ms,Miss",
                "Missouri,Mo",
                "Montana,Mt,Mont",
                "Nebraska,Ne,Neb,Nebr",
                "Nevada,Nv,Nev",
                "New Hampshire,Nh=>Nh",
                "New Jersey,Nj=>Nj",
                "New Mexico,Nm,N Mex,New M=>Nm",
                "New York,Ny=>Ny",
                "North Carolina,Nc,N Car=>Nc",
                "North Dakota,Nd,N Dak, NoDak=>Nd",
                "Ohio,Oh,O",
                "Oklahoma,Ok,Okla",
                "Oregon,Or,Oreg,Ore",
                "Pennsylvania,Pa,Penn,Penna",
                "Rhode Island,Ri,Ri & PP,R Isl=>Ri",
                "South Carolina,Sc,S Car=>Sc",
                "South Dakota,Sd,S Dak,SoDak=>Sd",
                "Tennessee,Te,Tenn",
                "Texas,Tx,Tex",
                "Utah,Ut",
                "Vermont,Vt",
                "Virginia,Va,Virg",
                "Washington,Wa,Wash,Wn",
                "West Virginia,Wv,W Va, W Virg=>Wv",
                "Wisconsin,Wi,Wis,Wisc",
                "Wyomin,Wi,Wyo"
            ]
        }
    }
}

And the mapping for the location.region field:

"region":{
    "properties":{
        "id":{"type": "long"},
        "name":{
            "type": "string",
            "analyzer": "synonyms",
            "fields":{"raw":{"type": "string", "index": "not_analyzed" }}
        }
    }
}

But the synonyms analyzer doesn't seem to be doing anything. Take this query, for example:

"match" : {
    "location.region.name" : {
        "query" : "Massachusetts",
        "type" : "phrase",
        "analyzer" : "synonyms"
    }
}

This returns hundreds of results, but if I replace "Massachusetts" with "Ma" or "Mass" I get 0 results. Why isn't it working?

Solution

The order of the filters is

"filter":[
    "lowercase",
    "synonym_filter"
]

So Elasticsearch lowercases the tokens first. By the time it runs the second step, synonym_filter, the tokens are all lower case, and they won't match any of the capitalized entries you have defined.
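
To make the ordering problem concrete, here is a toy Python model of the filter chain. It is only a sketch, not how Elasticsearch is actually implemented: it assumes a whitespace split instead of the standard tokenizer, handles single-word synonyms only, and assumes the synonym lookup is an exact, case-sensitive comparison. The function names are made up for illustration.

```python
# Simplified model of the analyzer chain; NOT Elasticsearch's real implementation.
# Assumption: the synonym filter compares tokens to its rules case-sensitively.

def lowercase_filter(tokens):
    """Lowercase every token, like the "lowercase" token filter."""
    return [t.lower() for t in tokens]

def synonym_filter(tokens, groups):
    """Expand a token to its whole synonym group on an exact match."""
    out = []
    for token in tokens:
        match = next((g for g in groups if token in g), None)
        out.extend(match if match else [token])
    return out

def analyze(text, groups):
    """Split on whitespace, then lowercase, then synonyms (order from the question)."""
    return synonym_filter(lowercase_filter(text.split()), groups)

capitalized = [["Massachusetts", "Ma", "Mass"]]
lowercased = [[term.lower() for term in group] for group in capitalized]

print(analyze("Mass", capitalized))  # ['mass']
print(analyze("Mass", lowercased))   # ['massachusetts', 'ma', 'mass']
```

With the capitalized rules the token "mass" passes through unexpanded, which is consistent with "Ma" and "Mass" returning 0 results in the question.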

To solve the problem, I would define the synonyms in lower case, so that they match the tokens the lowercase filter produces.
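
If you don't want to retype the list by hand, something like the following Python sketch could lowercase the rules and rebuild the "analysis" settings. It only uses three rules copied from the question as a sample, and the helper names lowercase_rules and build_analysis are hypothetical, not part of any Elasticsearch client. The index would still need to be recreated (or closed, updated and reindexed) with the new settings, which is not shown here.

```python
import json

# A few rules copied from the question; the full list would be handled the same way.
original_rules = [
    "Massachusetts,Ma,Mass",
    "New Hampshire,Nh=>Nh",
    "District of Columbia,Dc,Wash Dc,Washington Dc=>Dc",
]

def lowercase_rules(rules):
    """Lowercase every rule so it matches tokens emitted by the lowercase filter."""
    return [rule.lower() for rule in rules]

def build_analysis(rules):
    """Build the "analysis" settings from the question with lowercased synonyms."""
    return {
        "analysis": {
            "analyzer": {
                "synonyms": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "synonym_filter"],
                }
            },
            "filter": {
                "synonym_filter": {
                    "type": "synonym",
                    "synonyms": lowercase_rules(rules),
                }
            },
        }
    }

settings = build_analysis(original_rules)
print(json.dumps(settings, indent=4))
```

The "=>" in the explicit mappings is untouched by lowercasing, so a rule such as "New Hampshire,Nh=>Nh" simply becomes "new hampshire,nh=>nh".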
