Google App Engine Search API


Question

When querying a search index in the Python version of the GAE Search API, what is the best practice for running a search so that documents whose title matches the search words are returned first, followed by documents whose body matches them?

For example given:

from google.appengine.api import search

body = """This is the body of the document,
with a set of words"""

my_document = search.Document(
  fields=[
    search.TextField(name='title', value='A Set Of Words'),
    search.TextField(name='body', value=body),
   ])

If it is possible, how might one perform a search on an index of Documents of the above form with results returned in this priority, where the phrase being searched for is in the variable qs:

  1. Documents whose title matches the qs; then
  2. Documents whose body matches the qs words.

It seems like the correct solution is to use a MatchScorer, but I may be off the mark on this as I have not used this search functionality before. It is not clear from the documentation how to use the MatchScorer, but I presume one subclasses it and overloads some function - but as this is not documented, and I have not delved into the code, I cannot say for sure.

Is there something here that I am missing, or is this the correct strategy? Did I miss where this sort of thing is documented?


Just for clarity here is a more elaborate example of the desired outcome:

documents = [
  dict(title="Alpha", body="A"),          # "Alpha"
  dict(title="Beta", body="B Two"),       # "Beta"
  dict(title="Alpha Two", body="A"),      # "Alpha2"
]

for doc in documents:
  # Build a search.Document from each dict; dict values are read by key.
  document = search.Document(
    fields=[
       search.TextField(name="title", value=doc["title"]),
       search.TextField(name="body", value=doc["body"]),
    ]
  )
  index.put(document)  # for some search.Index

# Then when we search, we search the Title and Body.
index.search("Alpha")
# returns [Alpha, Alpha2]

# Results where the search is found in the Title are given higher weight.
index.search("Two")
# returns [Alpha2, Beta]  -- note Alpha2 has 'Two' in the title.

Solution

Custom scoring is one of our top priority feature requests. We're hoping to have a good way to do this sort of thing as soon as possible.

In your particular case, you could of course achieve the desired result by doing two separate queries: the first one with field restriction on "title", and the second restricted on "body".
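The two-query approach could be sketched roughly as follows. This is only an illustration, not an official recipe: `field_query`, `prioritized_search`, `DOCS`, and `fake_run_query` are hypothetical names, and the in-memory stand-in exists purely so the merge logic can be run without App Engine. The sketch assumes the search words contain no characters that would need escaping in a query string.

```python
def field_query(field, qs):
    """Restrict every word of qs to one field, e.g. 'title:Alpha title:Two'."""
    return " ".join("{}:{}".format(field, word) for word in qs.split())


def prioritized_search(run_query, qs, fields=("title", "body")):
    """Run one field-restricted query per field; merge doc ids in field order."""
    seen = set()
    ordered = []
    for field in fields:
        for doc_id in run_query(field_query(field, qs)):
            if doc_id not in seen:  # a doc matching title and body is listed once
                seen.add(doc_id)
                ordered.append(doc_id)
    return ordered


# Hypothetical in-memory stand-in for an index, only so the sketch runs anywhere.
DOCS = {
    "Alpha": {"title": "Alpha", "body": "A"},
    "Beta": {"title": "Beta", "body": "B Two"},
    "Alpha2": {"title": "Alpha Two", "body": "A"},
}


def fake_run_query(query_string):
    """Return ids of docs where every 'field:word' term matches a whole word."""
    terms = [term.split(":", 1) for term in query_string.split()]
    return [
        doc_id
        for doc_id, doc in DOCS.items()
        if all(word.lower() in doc[field].lower().split() for field, word in terms)
    ]
```

Against a real index, `run_query` would presumably be a small wrapper such as `lambda q: [r.doc_id for r in index.search(q)]`, so that title matches come first and body-only matches follow. Within each group the order is whatever the index returns.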
