Using stored JavaScript functions in the Aggregation pipeline, MapReduce or runCommand


Problem description

Is there a way to use a user-defined function saved as db.system.js.save(...) in pipeline or mapreduce?

Solution

Any function you save to system.js is available to "JavaScript" processing statements such as the $where operator and mapReduce, and can be referenced by the _id value it was assigned.

db.system.js.save({ 
   "_id": "squareThis", 
   "value": function(a) { return a*a } 
})

With some data inserted into the "sample" collection:

{ "_id" : ObjectId("55aafd2bacbed38e06f9eccf"), "a" : 1 }
{ "_id" : ObjectId("55aafea6acbed38e06f9ecd0"), "a" : 2 }
{ "_id" : ObjectId("55aafeabacbed38e06f9ecd1"), "a" : 3 }

Then:

db.sample.mapReduce(
    function() {
       emit(null, squareThis(this.a));
    },
    function(key,values) {
        return Array.sum(values);
    },
    { "out": { "inline": 1 } }
 );

Gives:

   "results" : [
            {
                    "_id" : null,
                    "value" : 14
            }
    ],
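
To see where the 14 comes from, the same map, emit and reduce steps can be traced in plain JavaScript with no server involved. This is only a sketch under stated assumptions: `runMapReduce` and the local `emit` are hypothetical stand-ins for what the server provides, the shell helper `Array.sum` is replaced with a plain `reduce`, and the documents are cut down to their `a` field.

```javascript
// Stand-in for the function stored in system.js under _id "squareThis".
function squareThis(a) { return a * a; }

// Values emitted during a run, grouped by key (serialised so null works as a key).
const groups = new Map();

// Local stand-in for the emit() that the server provides to map functions.
function emit(key, value) {
  const k = JSON.stringify(key);
  if (!groups.has(k)) groups.set(k, { key: key, values: [] });
  groups.get(k).values.push(value);
}

// Hypothetical, simplified model of mapReduce over an in-memory array.
function runMapReduce(docs, map, reduce) {
  groups.clear();
  docs.forEach(function (doc) { map.call(doc); });   // `this` is the document
  return Array.from(groups.values()).map(function (g) {
    // As in MongoDB, reduce is only called for keys with more than one value.
    const value = g.values.length > 1 ? reduce(g.key, g.values) : g.values[0];
    return { _id: g.key, value: value };
  });
}

const sample = [{ a: 1 }, { a: 2 }, { a: 3 }];

const results = runMapReduce(
  sample,
  function () { emit(null, squareThis(this.a)); },
  function (key, values) {
    // Array.sum is a mongo shell helper; plain reduce does the same job here.
    return values.reduce(function (acc, v) { return acc + v; }, 0);
  }
);

console.log(JSON.stringify(results));
```

The map function is written exactly as in the shell example, which is the point: inside the JavaScript engine a stored function is just another name in scope.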

Or with $where:

db.sample.find(function() { return squareThis(this.a) == 9 })
{ "_id" : ObjectId("55aafeabacbed38e06f9ecd1"), "a" : 3 }

But in "neither" case can you use globals such as the database db reference or other functions. The documentation for both $where and mapReduce describes the limits of what you can do here. So if you thought you were going to do something like "look up data in another collection", then you can forget it because it is "Not Allowed".

Every MongoDB command action is actually a call to a "runCommand" action "under the hood" anyway. But unless that command is actually "calling a JavaScript processing engine", this is irrelevant. Only a few commands do so: mapReduce, group and eval, and of course find operations with $where.


The aggregation framework does not use JavaScript in any way at all. You might be misreading, as others have, a statement like this one, which does not do what you think it does:

db.sample.aggregate([
    { "$match": {
        "a": { "$in": db.sample.distinct("a") }
    }}
])

So that is "not running inside" the aggregation pipeline; rather, the "result" of the .distinct() call is "evaluated" on the client before the pipeline is sent to the server. It is much the same as using an external variable:

var items = [1,2,3];
db.sample.aggregate([
    { "$match": {
        "a": { "$in": items }
    }}
])

Both are essentially sent to the server in the same form:

db.sample.aggregate([
    { "$match": {
        "a": { "$in": [1,2,3] }
    }}
])
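
One way to convince yourself of this without a server is to build each pipeline as a plain object and compare what would be serialised. In the sketch below, `fakeDistinct` is a hypothetical stand-in for `db.sample.distinct("a")`; the only point is that the call returns before the pipeline array exists, so the pipeline never contains a function.

```javascript
// Hypothetical stand-in for db.sample.distinct("a"): it runs on the client
// and simply returns an array.
let calls = 0;
function fakeDistinct() {
  calls += 1;
  return [1, 2, 3];
}

// The argument expression is evaluated first, then the pipeline array is built.
const viaCall = [{ "$match": { "a": { "$in": fakeDistinct() } } }];

const items = [1, 2, 3];
const viaVariable = [{ "$match": { "a": { "$in": items } } }];

const viaLiteral = [{ "$match": { "a": { "$in": [1, 2, 3] } } }];

// All three serialise to the same document; no trace of the call remains.
console.log(JSON.stringify(viaCall));
console.log(JSON.stringify(viaCall) === JSON.stringify(viaVariable));
console.log(JSON.stringify(viaCall) === JSON.stringify(viaLiteral));
console.log(calls);
```

The stand-in runs once, while the pipeline is being constructed, and what is left is ordinary data.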

So it is "not possible" to "call" any JavaScript function in the aggregation pipeline, nor is there really any point in "passing in" results from something saved in system.js. The "code" would need to be "loaded to the client", and only a JavaScript engine can actually do anything with it.

With the aggregation framework, all of the available "operators" are natively coded functions, as opposed to the "free form" JavaScript interpretation provided for mapReduce. So instead of writing "JavaScript", you use the operators themselves:

db.sample.aggregate([
    { "$group": {
        "_id": null,
        "squared": { "$sum": {
           "$multiply": [ "$a", "$a" ]
        }}
    }}
])

{ "_id" : null, "squared" : 14 }
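
A rough way to picture the difference: the pipeline is plain data that the server interprets with built-in operator implementations, so there is nowhere for a user function to run. The toy evaluator below handles only field paths, numbers, `$multiply` and a single `$sum` accumulator over one group. `evalExpr` and `groupSum` are hypothetical names, and this is an illustration of the idea, not a description of how MongoDB is implemented internally.

```javascript
// Toy expression evaluator: supports "$field" paths, numbers and $multiply.
function evalExpr(expr, doc) {
  if (typeof expr === "string" && expr.charAt(0) === "$") {
    return doc[expr.slice(1)];            // "$a" -> doc.a
  }
  if (typeof expr === "number") return expr;
  if (expr && Array.isArray(expr["$multiply"])) {
    return expr["$multiply"].reduce(function (acc, e) {
      return acc * evalExpr(e, doc);
    }, 1);
  }
  throw new Error("unsupported expression: " + JSON.stringify(expr));
}

// Toy $group with a constant _id and one $sum accumulator.
function groupSum(docs, groupSpec, field) {
  const total = docs.reduce(function (acc, doc) {
    return acc + evalExpr(groupSpec[field]["$sum"], doc);
  }, 0);
  const out = { _id: groupSpec._id };
  out[field] = total;
  return out;
}

const docs = [{ a: 1 }, { a: 2 }, { a: 3 }];
const spec = { "_id": null, "squared": { "$sum": { "$multiply": ["$a", "$a"] } } };

const grouped = groupSum(docs, spec, "squared");
console.log(JSON.stringify(grouped));
```

The `$group` specification is passed around as an ordinary object, and anything outside the evaluator's fixed operator set is rejected.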

So there are limitations on what you can do with functions saved in system.js, and the chances are that what you want to do is either:

  • Not allowed, such as accessing data from another collection
  • Not really required, as the logic is generally self-contained anyway
  • Probably better implemented in client logic or in some other form

Just about the only practical use I can really think of is that you have a number of "mapReduce" operations that cannot be done any other way and you have various "shared" functions that you would rather just store on the server than maintain within every mapReduce function call.

But then again, 90% of the time the reason for using mapReduce over the aggregation framework is that the "document structure" of the collections has been poorly chosen and JavaScript is "required" to traverse the document for search and analysis.

So you can use it under the allowed constraints, but in most cases you probably should not be using this at all and should instead fix the other issues that led you to believe you needed this feature in the first place.
