Google App Engine Search API
When querying a search index in the Python version of the GAE Search API, what is the best practice for running a search in which documents whose title matches the search words are returned first, followed by documents whose body matches them?
For example, given:
from google.appengine.api import search

body = """This is the body of the document,
with a set of words"""
my_document = search.Document(
    fields=[
        search.TextField(name='title', value='A Set Of Words'),
        search.TextField(name='body', value=body),
    ])
If it is possible, how might one perform a search on an index of Documents of the above form, where the phrase being searched for is in the variable qs, with results returned in this priority:
- Documents whose title matches the qs words; then
- Documents whose body matches the qs words.
It seems like the correct solution is to use a MatchScorer, but I may be off the mark on this as I have not used this search functionality before. It is not clear from the documentation how to use the MatchScorer. I presume one subclasses it and overrides some method, but as this is not documented, and I have not delved into the code, I cannot say for sure.
Is there something here that I am missing, or is this the correct strategy? Did I miss where this sort of thing is documented?
Just for clarity, here is a more elaborate example of the desired outcome:
documents = [
    dict(title="Alpha", body="A"),  # "Alpha"
    dict(title="Beta", body="B Two"),  # "Beta"
    dict(title="Alpha Two", body="A"),  # "Alpha2"
]
for doc in documents:
    # Build a search.Document from each dict and add it to the index.
    document = search.Document(
        fields=[
            search.TextField(name="title", value=doc["title"]),
            search.TextField(name="body", value=doc["body"]),
        ]
    )
    index.put(document)  # for some search.Index
# Then when we search, we search the Title and Body.
index.search("Alpha")
# returns [Alpha, Alpha2]
# Results where the search is found in the Title are given higher weight.
index.search("Two")
# returns [Alpha2, Beta] -- note Alpha2 has 'Two' in the title.
Custom scoring is one of our top priority feature requests. We're hoping to have a good way to do this sort of thing as soon as possible.
In your particular case, you could of course achieve the desired result by doing two separate queries: the first one with field restriction on "title", and the second restricted on "body".
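The two-query approach described above could be sketched roughly as follows. This is only a sketch under stated assumptions: `search_title_then_body` is a hypothetical helper, `index` is assumed to be a `search.Index` (any object with a `search(query_string)` method whose results expose `doc_id` would do), and the `field:(terms)` query-string form is an assumption that should be checked against the query-string documentation.

```python
def search_title_then_body(index, qs):
    """Return documents matching qs in the title first, then in the body.

    Hypothetical helper: `index` is assumed to behave like search.Index.
    """
    seen_ids = set()
    ordered = []
    for field in ("title", "body"):
        # Assumed query-string form restricting the match to one field.
        query_string = "%s:(%s)" % (field, qs)
        for doc in index.search(query_string):
            # Skip documents already returned by the title query.
            if doc.doc_id not in seen_ids:
                seen_ids.add(doc.doc_id)
                ordered.append(doc)
    return ordered
```

Title matches come first only because the title query runs first; within each group the order is whatever the index returns. Each query is also subject to its own result limit, so paging through a large result set would need extra handling.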