How to match a superset key to a subset using group_level (sub selects in couchdb?)


Problem description


How to do sub selects in couchdb, or, how to match a superset key to a subset using group_level

I have a pretty complex question that hopefully has a not too complex answer for someone who's not just learning map/reduce and couchdb for the first time.

I am working on a system that serves a JSON manifest to a client so it can configure itself with content that updates daily. On first run the clients register themselves with a few descriptive tags (say: screen size, OS, location), and the server returns a group_id. The client uses that id to request its manifest every day. On the backend we arbitrarily group clients together that share certain tags to cut down on the number of unique manifests we need to store/serve.

Our sales/admin person has a webapp where he can set up audiences to target specific content at specific groups. An audience can overlap multiple groups. The trick is, when the client reports in to get a fresh manifest, we need to figure out which audience is the best fit for that client's group. The best matching audience will be the first audience whose tags are a subset of the submitted group's tags, e.g.:

audience1: tagA, tagB, tagC, tagD
audience2: tagA, tagC

group1: tagA, tagB, tagC

This group should match audience2, not audience1.

If we were using an audience's tags to find the best group match (in other words, if group.tags were a subset of audience.tags) I could build a really efficient index like so:

[tagA, tagB, tagC], group1._id
[tagA, tagC, tagB], group1._id
[tagB, tagA, tagC], group1._id
[tagB, tagC, tagA], group1._id
[tagC, tagA, tagB], group1._id
[tagC, tagB, tagA], group1._id

and use group_level=2 with key=[tagA, tagC] to match audience2 against the second line in the index. The problem is, I can't figure out how to do this going the other direction: matching group.tags against an index of audience.tags, where the tags we know at query time (group.tags) are a superset of the tags we are trying to match against (audience.tags).

I've got a firm grasp on simple m/r views, but I keep hitting dead ends on this one. Every solution I come to involves doing some sort of sub select in my view function, which doesn't work in couchdb views... any ideas on how I can attack a problem like this?

Hopefully this description makes some sense.
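For illustration, a permutation index like the one listed above might be produced by a map function along these lines. This is only a rough sketch: the `doc.type` and `doc.tags` field names and the `mapGroup` and `permutations` names are assumptions, and `emit` is stubbed so the snippet runs under plain Node. Inside a design document the body of `mapGroup` would live in the view's anonymous `function (doc) { ... }`. Note that the number of rows grows factorially with the number of tags.

```javascript
// Stand-in for CouchDB's built-in emit(), so the sketch runs under plain Node.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Return every ordering of the given array.
function permutations(items) {
  if (items.length <= 1) return [items.slice()];
  const result = [];
  items.forEach(function (item, i) {
    const rest = items.slice(0, i).concat(items.slice(i + 1));
    permutations(rest).forEach(function (tail) {
      result.push([item].concat(tail));
    });
  });
  return result;
}

// Map function: one row per ordering of a group's tags.
// `doc.type` and `doc.tags` are assumed field names.
function mapGroup(doc) {
  if (doc.type !== "group" || !Array.isArray(doc.tags)) return;
  permutations(doc.tags).forEach(function (key) {
    emit(key, doc._id);
  });
}

mapGroup({ _id: "group1", type: "group", tags: ["tagA", "tagB", "tagC"] });
```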

Solution

The easiest solution I can think of is to:

  • sort the tags of each audience and emit the sorted array as the key of the view.
  • query the view using multiple keys, i.e. do a POST with {"keys": ["key1", "key2", ...]}.
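The first step could look roughly like the map function below. It is a minimal sketch, assuming audience documents carry `type: "audience"` and a `tags` array (both field names are hypothetical); `emit` is stubbed here only so the snippet can run outside CouchDB, and `mapAudience` stands in for the view's anonymous map function.

```javascript
// Stand-in for CouchDB's built-in emit(), so the sketch runs under plain Node.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: emit each audience once, keyed by its sorted tag array.
// `doc.type` and `doc.tags` are assumed field names.
function mapAudience(doc) {
  if (doc.type !== "audience" || !Array.isArray(doc.tags)) return;
  // Copy before sorting so the document itself is not mutated.
  emit(doc.tags.slice().sort(), null);
}

mapAudience({ _id: "audience1", type: "audience", tags: ["tagD", "tagB", "tagA", "tagC"] });
mapAudience({ _id: "audience2", type: "audience", tags: ["tagC", "tagA"] });
```

Because every audience is stored under exactly one canonical (sorted) key, the index stays one row per audience rather than one row per permutation.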

The keys are all the possible keys you are looking for, in reverse order of importance (rows are returned in the order of the keys specified.) Again, the tags in the keys are sorted.

In your example the keys can be:

[tagA, tagB, tagC]
[tagB, tagC]
[tagA, tagC]
[tagA, tagB]
[tagC]
[tagB]
[tagA]

The first result is what you want, so you can use limit=1.
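One way the client side might build that key list is sketched below. `candidateKeys` is a hypothetical helper, not part of CouchDB: it sorts the group's tags, enumerates every non-empty subset, and orders the subsets from largest to smallest. The resulting `body` is what would be sent in the multi-key POST.

```javascript
// Build every non-empty subset of a group's tags, largest subsets first.
function candidateKeys(groupTags) {
  const tags = groupTags.slice().sort();
  const keys = [];
  // Each bit of `mask` decides whether the tag at that position is included.
  for (let mask = (1 << tags.length) - 1; mask >= 1; mask--) {
    keys.push(tags.filter(function (_, i) {
      return (mask & (1 << i)) !== 0;
    }));
  }
  // Array.prototype.sort is stable since ES2019, so ties keep mask order.
  return keys.sort(function (a, b) {
    return b.length - a.length;
  });
}

const keys = candidateKeys(["tagC", "tagA", "tagB"]);
// Request body for the multi-key POST to the view.
const body = JSON.stringify({ keys: keys });
```

A group with n tags yields 2^n - 1 keys, so this approach is only practical while groups carry a small number of tags.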
