Is there anything wrong with creating Couch DB views with null values?

Problem description

I've been doing a fair amount of work with Couch DB in my spare time recently and really enjoy using it. I find it to be much more flexible than using a relational database, but it's not without its disadvantages.

One big disadvantage is the lack of dynamic queries / view generation... So you have to do a fair amount of work in planning and justifying your views, as you can't put that logic into your application code as you might do with SQL.

For example, I wrote a login scheme based on a JSON document template that looked a little bit like this:

{
   "_id": "blah",
   "type": "user",
   "name": "Bob",
   "email": "bob@theaquarium.com",
   "password": "blah"
}

To prevent the creation of duplicate accounts, I wrote a very basic view to generate a list of user names to look up as keys:

emit(doc.name, null) 

This seemed reasonably efficient to me. I think it's way better than dragging out an entire list of documents (or even just a reduced number of fields for each document). So I did exactly the same thing to generate a list of email addresses:

emit(doc.email, null)
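As a rough illustration of what these two views produce, the sketch below runs both map functions against a few in-memory documents. The `runView` helper and its local `emit` callback are hypothetical stand-ins for what CouchDB does server-side, the sample data is made up, and the `doc.type === "user"` guard is an assumption that is not part of the original views.

```javascript
// Sample documents shaped like the template above (made-up data).
const docs = [
  { _id: "u1", type: "user", name: "Bob", email: "bob@theaquarium.com" },
  { _id: "u2", type: "user", name: "Alice", email: "alice@example.com" },
  { _id: "x1", type: "note", text: "not a user" },
];

// Map functions as they might appear in a design document. They take `emit`
// as a parameter here only so the sketch can run outside CouchDB.
const byName = (doc, emit) => {
  if (doc.type === "user") emit(doc.name, null);
};
const byEmail = (doc, emit) => {
  if (doc.type === "user") emit(doc.email, null);
};

// Hypothetical helper: collects emitted rows and sorts them by key.
// Plain string comparison is used here, not CouchDB's own collation.
function runView(mapFn, allDocs) {
  const rows = [];
  for (const doc of allDocs) {
    mapFn(doc, (key, value) => rows.push({ id: doc._id, key, value }));
  }
  return rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

const nameRows = runView(byName, docs);
const emailRows = runView(byEmail, docs);
console.log(nameRows);
console.log(emailRows);
```

Each row carries only the key and the id of the document that emitted it, which is why a null value keeps the index small.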

Can you see where I'm going with this question?

In a relational database (with SQL) one would simply make two queries against the same table. Would this technique (of equating a view to the product of an SQL query) be in some way analogous?

Then there's the performance / efficiency issue... Should those two views really be just one? Or is the use of a Couch DB view with keys and no associated value an effective practice? Considering the example above, both of those views would have uses outside of a login scheme... If I ever need to generate a list of user names, I can retrieve them without an additional overhead.

What do you think?

Solution

First, you certainly can put the view logic into your application code - all you need is an appropriate build or deploy system that extracts the views from the application and adds them to a design document. What is missing is the ability to generate new queries on the fly.
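One possible shape for such a build step is sketched below, assuming the map functions live in application code as ordinary JavaScript functions. The `buildDesignDoc` helper, the `_design/users` id, and the view names are all made up for illustration; uploading the resulting document to the database is left out.

```javascript
// Map functions kept in application code. `emit` is supplied by CouchDB's
// view server when the function runs there; it is never called locally here.
const views = {
  by_name: function (doc) { emit(doc.name, null); },
  by_email: function (doc) { emit(doc.email, null); },
};

// Hypothetical build step: convert each function to the source-string form
// that a design document stores under views.<name>.map.
function buildDesignDoc(id, viewFns) {
  const designDoc = { _id: id, language: "javascript", views: {} };
  for (const [name, fn] of Object.entries(viewFns)) {
    designDoc.views[name] = { map: fn.toString() };
  }
  return designDoc;
}

const designDoc = buildDesignDoc("_design/users", views);
console.log(JSON.stringify(designDoc, null, 2));
```

The views stay under version control with the rest of the application, but they are still fixed at deploy time rather than generated per request.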

Your emit(doc.field, null) approach certainly isn't surprising or unusual. In fact, it is the usual pattern for "find document by field" queries, where the document is extracted using include_docs=true. There is also no need to merge the two views into one. The only performance-related decision is whether the two views should be placed in the same design document, because all views in a design document are updated when any of them is accessed.
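A "find document by field" lookup of this kind might be built as in the sketch below. It only constructs the request URL and sends nothing; the host, the `accounts` database, the `users` design document, and the `by_email` view name are assumptions.

```javascript
// Builds the URL for a view lookup by exact key. View keys are JSON values,
// so a string key is JSON-encoded (quoted) before being URL-encoded.
function viewLookupUrl(baseUrl, db, designDocName, view, key) {
  const encodedKey = encodeURIComponent(JSON.stringify(key));
  return (
    `${baseUrl}/${db}/_design/${designDocName}/_view/${view}` +
    `?key=${encodedKey}&include_docs=true`
  );
}

const lookupUrl = viewLookupUrl(
  "http://localhost:5984", // assumed local CouchDB address
  "accounts",              // assumed database name
  "users",                 // assumed design document name
  "by_email",              // assumed view name
  "bob@theaquarium.com"
);
console.log(lookupUrl);
```

With include_docs=true, each returned row carries the full document alongside the key, so the view itself does not need to emit a value.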

Of course, your approach does not actually guarantee that the e-mails are unique, even if your application tries really hard. Imagine the following circumstances with two client applications A and B:

A: queries view, determines that `test@email.com` does not exist.
B: queries view, determines that `test@email.com` does not exist.
A: creates account with `test@email.com`
B: creates account with `test@email.com`

This is a rare occurrence, but nonetheless possible. A better approach is to keep documents that use the email address as the key, because access to single documents is transactional (it's impossible to create two documents with the same key). Typical example:

{
  _id: "test@email.com",
  type: "email",
  user: "000000001"
}

{
  _id: "000000001",
  type: "user", 
  email: "test@email.com",
  firstname: "Test", 
  ...
}
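The reservation idea can be sketched with an in-memory stand-in for the database. `FakeDb` and `signUp` are hypothetical names, not part of any CouchDB client; the only behavior borrowed from CouchDB is that creating a document whose _id already exists is rejected with a 409 Conflict.

```javascript
// In-memory stand-in for a database that rejects a create when the _id is
// already taken, as CouchDB does with a 409 Conflict. Not a real client.
class FakeDb {
  constructor() {
    this.docs = new Map();
  }
  create(doc) {
    if (this.docs.has(doc._id)) {
      const err = new Error("conflict");
      err.status = 409;
      throw err;
    }
    this.docs.set(doc._id, doc);
  }
}

// Hypothetical signup: reserve the e-mail first, then create the user.
function signUp(db, userId, email) {
  try {
    db.create({ _id: email, type: "email", user: userId });
  } catch (err) {
    if (err.status === 409) return false; // another client holds this e-mail
    throw err;
  }
  db.create({ _id: userId, type: "user", email });
  return true;
}

const db = new FakeDb();
const resultA = signUp(db, "000000001", "test@email.com"); // client A
const resultB = signUp(db, "000000002", "test@email.com"); // client B
console.log(resultA, resultB);
```

Both clients target the same _id, so whichever write arrives second is refused regardless of what either client saw in a view beforehand.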

EDIT: a reservation pattern only works if two clients attempting to create an account for a given e-mail will reliably try to access the same document. If you randomly generate a new identifier, then client A will create and reserve document XXXX while client B will create and reserve document YYYY, and you will end up with two different documents that have the same e-mail.

Again, the only way to perform a transactional "check if it exists, create if it does not" operation is to have all clients alter a single document.
