Running advanced MongoDB queries in R with rmongodb


Question

MySQL is driving me nuts, so I'm trying to get acquainted with my first "NoSQL" DBMS, which happens to be MongoDB. I'm connecting to it via rmongodb.

The more I play around with rmongodb, the more questions and problems come up with respect to running advanced queries.

First I present some example data, then I go into detail about the different types of queries that I can't seem to specify correctly.

The example is taken from the MongoDB website and has been simplified a bit.

pkg <- "rmongodb"
if (!require(pkg, character.only=TRUE)) {
    install.packages(pkg)
    require(pkg, character.only=TRUE)   
}

# Connect to DB
db <- "test"
ns <- "posts"
mongo <- mongo.create(db=db)

# Insert document to collection 'test.users'
b <- mongo.bson.from.list(list(
    "_id"="alex", 
    name=list(first="Alex", last="Benisson"),
    karma=1.0,
    age=30,
    test=c("a", "b")
))
mongo.insert(mongo, "test.users", b)

# Insert document to collection 'test.posts'
b <- mongo.bson.from.list(list(
        "_id"="abcd",
        when=mongo.timestamp.create(strptime("2011-09-19 02:00:00",
            "%Y-%m-%d %H:%M:%S"), increment=1),
        author="alex",
        title="Some title",
        text="Some text.",
        tags=c("tag.1", "tag.2"),
        votes=5,
        voters=c("jane", "joe", "spencer", "phyllis", "li"),
        comments=list(
            list(
                who="jane", 
                when=mongo.timestamp.create(strptime("2011-09-19 04:00:00",
                    "%Y-%m-%d %H:%M:%S"), increment=1),
                comment="Some comment."
            ),
            list(
                who="meghan", 
                when=mongo.timestamp.create(strptime("2011-09-20 13:00:00",
                    "%Y-%m-%d %H:%M:%S"), increment=1),
                comment="Some comment."
            )
        )
    )
)
b
mongo.insert(mongo, "test.posts", b)

Two questions related to inserting JSON/BSON objects:

  1. Document 'test.posts', field voters: is it correct to use c() in this case?
  2. Document 'test.posts', field comments: what's the right way to specify this, c() or list()?

Top Level Queries: they work a treat

Top level queries work just fine:

# Get all posts by 'alex' (only titles)
res <- mongo.find(mongo, "test.posts", query=list(author="alex"), 
    fields=list(title=1L))
out <- NULL
while (mongo.cursor.next(res))
    out <- c(out, list(mongo.bson.to.list(mongo.cursor.value(res))))

> out
[[1]]
                       _id                      title 
                     "abcd"               "Some title" 

Question 1: Basic Sub Level Queries

How can I run simple "sub level queries" (as opposed to top level queries) that need to reach into arbitrarily deep sublevels of a JSON/BSON style MongoDB object? These sub level queries make use of MongoDB's dot notation, and I can't seem to figure out how to map that to a valid rmongodb query.

In plain MongoDB syntax, something like

> db.posts.find( { "comments.who" : "meghan" } )

would work. But I can't figure out how to do that with rmongodb functions.

This is what I have tried so far:

# Get all comments by 'meghan' from 'test.posts'

#--------------------
# Approach 1)
#--------------------
res <- mongo.find(mongo, "test.posts", query=list(comments=list(who="meghan")))
out <- NULL
while (mongo.cursor.next(res))
    out <- c(out, list(mongo.bson.to.list(mongo.cursor.value(res))))

> out
NULL
# Does not work

#--------------------
# Approach 2) 
#--------------------
buf <- mongo.bson.buffer.create()
mongo.bson.buffer.start.object(buf, "comments")
mongo.bson.buffer.append(buf, "who", "meghan")
mongo.bson.buffer.finish.object(buf)
query <- mongo.bson.from.buffer(buf)
res <- mongo.find(mongo, "test.posts", query=query)
out <- NULL
while (mongo.cursor.next(res))
    out <- c(out, list(mongo.bson.to.list(mongo.cursor.value(res))))

> out
NULL
# Does not work

Question 2: Queries Using $ Operators

These work:

Query 1

buf <- mongo.bson.buffer.create()
mongo.bson.buffer.start.object(buf, "age")
mongo.bson.buffer.append(buf, "$lte", 30)
mongo.bson.buffer.finish.object(buf)
criteria <- mongo.bson.from.buffer(buf)
criteria

> mongo.find.one(mongo, "test.users", query=criteria)
    _id : 2      alex
    name : 3     
        first : 2    Alex
        last : 2     Benisson

    karma : 1    1.000000
    age : 1      30.000000
    test : 4     
        0 : 2    a
        1 : 2    b

Query 2

buf <- mongo.bson.buffer.create()
mongo.bson.buffer.start.object(buf, "test")
mongo.bson.buffer.append(buf, "$in", c("a", "z"))
mongo.bson.buffer.finish.object(buf)
criteria <- mongo.bson.from.buffer(buf)
criteria
mongo.find.one(mongo, "test.users", query=criteria)

However, notice that an atomic (length-one) set results in a return value of NULL:

mongo.bson.buffer.append(buf, "$in", "a")
# Instead of 'mongo.bson.buffer.append(buf, "$in", c("a", "z"))'
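
A likely explanation, offered as a sketch rather than a confirmed diagnosis: rmongodb appends a character vector of length greater than one as a BSON array, but a length-one vector as a plain string, and `$in` expects an array. Building the array explicitly avoids the dependence on vector length. The helper name `build_in_criteria` is hypothetical and not part of rmongodb; calling it requires the rmongodb package to be installed.

```r
# Hypothetical helper (not part of rmongodb): builds { <field>: { "$in": [ ... ] } }
# with an explicit BSON array, so a single value still ends up inside an array.
build_in_criteria <- function(field, values) {
    buf <- rmongodb::mongo.bson.buffer.create()
    rmongodb::mongo.bson.buffer.start.object(buf, field)
    rmongodb::mongo.bson.buffer.start.array(buf, "$in")
    for (i in seq_along(values)) {
        # BSON array elements are keyed "0", "1", ...
        rmongodb::mongo.bson.buffer.append(buf, as.character(i - 1), values[[i]])
    }
    rmongodb::mongo.bson.buffer.finish.object(buf)  # closes the "$in" array
    rmongodb::mongo.bson.buffer.finish.object(buf)  # closes the field object
    rmongodb::mongo.bson.from.buffer(buf)
}

# Usage (needs the open connection 'mongo' from the example data above):
# criteria <- build_in_criteria("test", "a")
# mongo.find.one(mongo, "test.users", query=criteria)
```

Printing the resulting criteria object before running the query shows whether `$in` really holds an array.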

Trying the same with sub level queries, I'm lost again:

buf <- mongo.bson.buffer.create()
mongo.bson.buffer.start.object(buf, "name")
mongo.bson.buffer.start.object(buf, "first")
mongo.bson.buffer.append(buf, "$in", c("Alex", "Horst"))
mongo.bson.buffer.finish.object(buf)
mongo.bson.buffer.finish.object(buf)
criteria <- mongo.bson.from.buffer(buf)
> criteria
    name : 3     
        first : 3    
            $in : 4      
                0 : 2    Alex
                1 : 2    Horst

> mongo.find.one(mongo, "test.users", query=criteria)
NULL

Recommended answer

Either c() or list() can be OK. It depends on whether the components are named and, for a list, whether they all have the same type. The best thing to do is to look at the generated BSON and see whether you are getting what you want. For the best control over the generated object, use mongo.bson.buffer and the functions that operate on it. In fact, this is why the sub-queries are failing: 'comments' is being created as a subobject rather than an array. mongo.bson.from.list() is handy, but it doesn't give you the same control, and it sometimes guesses wrong about what to generate from complicated structures.
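
As a minimal sketch of what "use mongo.bson.buffer" could look like for the 'comments' field, the following builds the field explicitly as an array of sub-documents. The helper name `build_post_with_comments` and its reduced set of fields are assumptions made for illustration; only the `rmongodb::` calls are real package functions, and the package must be installed to call the helper.

```r
# Hypothetical helper (not part of rmongodb): builds a post document whose
# 'comments' field is an explicit BSON array of sub-documents.
# 'comments' is expected to be a list of lists with elements 'who' and 'comment'.
build_post_with_comments <- function(id, author, comments) {
    buf <- rmongodb::mongo.bson.buffer.create()
    rmongodb::mongo.bson.buffer.append(buf, "_id", id)
    rmongodb::mongo.bson.buffer.append(buf, "author", author)
    rmongodb::mongo.bson.buffer.start.array(buf, "comments")
    for (i in seq_along(comments)) {
        # Each array element is a sub-document keyed "0", "1", ...
        rmongodb::mongo.bson.buffer.start.object(buf, as.character(i - 1))
        rmongodb::mongo.bson.buffer.append(buf, "who", comments[[i]]$who)
        rmongodb::mongo.bson.buffer.append(buf, "comment", comments[[i]]$comment)
        rmongodb::mongo.bson.buffer.finish.object(buf)
    }
    rmongodb::mongo.bson.buffer.finish.object(buf)  # closes the 'comments' array
    rmongodb::mongo.bson.from.buffer(buf)
}

# Usage (needs the open connection 'mongo' from the question):
# b <- build_post_with_comments("abcd", "alex", list(
#     list(who="jane", comment="Some comment."),
#     list(who="meghan", comment="Some comment.")
# ))
# b                                  # inspect the generated BSON first
# mongo.insert(mongo, "test.posts", b)
```

Printing `b` before inserting is the check recommended above: it shows whether 'comments' came out as an array or as a subobject.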

The query on the other set of data can be corrected like so, though:

buf <- mongo.bson.buffer.create()
mongo.bson.buffer.start.object(buf, "name.first")
mongo.bson.buffer.append(buf, "$in", c("Alex", "Horst"))
mongo.bson.buffer.finish.object(buf)
criteria <- mongo.bson.from.buffer(buf)

Note that you definitely need to use a buffer here, since R will choke on the dotted name.
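
The same dotted-name approach should carry over to the first question's query on 'comments.who'. This is a sketch under the assumption that 'comments' is stored as an array of sub-documents; `find_by_commenter` is a hypothetical helper name, not an rmongodb function, and calling it requires the package and an open connection.

```r
# Hypothetical helper (not part of rmongodb): returns all documents in 'ns'
# that have a comment by 'who', using MongoDB dot notation as the field name.
find_by_commenter <- function(mongo, ns, who) {
    buf <- rmongodb::mongo.bson.buffer.create()
    rmongodb::mongo.bson.buffer.append(buf, "comments.who", who)
    query <- rmongodb::mongo.bson.from.buffer(buf)

    cursor <- rmongodb::mongo.find(mongo, ns, query=query)
    out <- list()
    while (rmongodb::mongo.cursor.next(cursor)) {
        doc <- rmongodb::mongo.bson.to.list(rmongodb::mongo.cursor.value(cursor))
        out[[length(out) + 1]] <- doc
    }
    rmongodb::mongo.cursor.destroy(cursor)  # release the server-side cursor
    out
}

# Usage (needs the open connection 'mongo' from the question):
# posts <- find_by_commenter(mongo, "test.posts", "meghan")
```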

I hope this straightens out your problem. Let me know if you have any further questions.
