Mongoose find one and push to array of documents

Problem description

I'm new to MongoDB and Mongoose and I'm trying to use it to save stock ticks for daytrading analysis. So I imagined this Schema:

var symbolSchema = Schema({
    name: String,
    code: String
});

var quoteSchema = Schema({
    date: { type: Date, default: Date.now },
    open: Number,
    high: Number,
    low: Number,
    close: Number,
    volume: Number
});

var intradayQuotesSchema = Schema({
    id_symbol: { type: Schema.Types.ObjectId, ref: "symbol" },
    day: Date,
    quotes: [quoteSchema]
});

From my link I receive information like this every minute:

Date | Symbol | Open | High | Low | Close | Volume
2015-03-09 13:23:00 | AAPL | 127,14 | 127,17 | 127,12 | 127,15 | 19734

I have to:

  1. Find the ObjectId of the symbol (AAPL).
  2. Discover if the intradayQuote document of this symbol already exists (symbol and date combination)
  3. Discover if the minute OHLCV data of this symbol exists on the quotes array (because it could be repeated)
  4. Update or create the document and update or create the quotes inside the array

I'm able to accomplish this task without verifying whether the quote already exists, but this method can create repeated entries inside the quotes array:

symbol.find({ "code": mySymbol }, function(err, stock) {
    intradayQuote.findOneAndUpdate(
        { id_symbol: stock[0]._id, day: myDay },
        { $push: { quotes: myQuotes } },
        { upsert: true },
        myCallback
    );
});

I have already tried:

  • $addToSet instead of $push, but unfortunately this doesn't seem to work with an array of documents
  • { id_symbol:stock[0]._id, day: myDay, 'quotes["date"]': myDate } as the conditions of findOneAndUpdate; but unfortunately, if mongo doesn't find a match, it creates a new document for the minute instead of appending to the quotes array.

Is there a way to get this working without using one more query (I'm already using 2)? Should I rethink my Schema to facilitate this job? Any help will be appreciated. Thanks!

Recommended answer

Basically put, an $addToSet operator cannot work for you because your data is not a true "set", which by definition is a collection of "completely distinct" objects. Two quotes for the same minute count as different elements as soon as any one of their fields differs.
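To make that concrete, here is a small in-memory sketch of the comparison involved. It is not MongoDB code: addToSet below is a hypothetical stand-in that compares whole documents using Node's util.isDeepStrictEqual (MongoDB additionally treats field order as significant), and the tick values are illustrative only.

```javascript
var util = require("util");

// Hypothetical stand-in for $addToSet: append only when no existing
// element is equal to the candidate in every field.
function addToSet(array, candidate) {
    var exists = array.some(function (item) {
        return util.isDeepStrictEqual(item, candidate);
    });
    if (!exists) {
        array.push(candidate);
    }
    return array;
}

var minute = new Date("2015-03-09T13:23:00.000Z");
var quotes = [];

addToSet(quotes, { date: minute, close: 127.15, volume: 19734 });
// Exactly the same tick again: recognised as a duplicate and skipped.
addToSet(quotes, { date: minute, close: 127.15, volume: 19734 });
// Same minute but a revised volume: a different document, so it is appended.
addToSet(quotes, { date: minute, close: 127.15, volume: 19800 });

console.log(quotes.length);
```

Mongoose also gives each array subdocument its own _id by default, which is one likely reason even identical ticks were treated as distinct in the question.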

The other piece of logical sense here is that you would be working on the data as it arrives, either as a single object or as a feed. I'll presume it's a feed of many items in some form, and that you can use some sort of stream processor to arrive at this structure per document received:

{
    "date": new Date("2015-03-09T13:23:00.000Z"),
    "symbol": "AAPL",
    "open": 127.14,
    "high": 127.17,
    "low": 127.12,
    "close": 127.15,
    "volume": 19734
}

That means converting to a standard decimal format as well as a UTC date, since any locale settings really should be the domain of your application once the data is retrieved from the datastore.
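A minimal sketch of that conversion for one line of the feed shown in the question. parseTick is a hypothetical helper name, and the sketch assumes the feed timestamp is already in UTC and that prices use a decimal comma:

```javascript
// Hypothetical helper: parses one pipe-delimited feed line into a plain object.
// Assumes the timestamp is already UTC and that decimals use a comma.
function parseTick(line) {
    var parts = line.split("|").map(function (s) { return s.trim(); });
    if (parts.length !== 7) {
        throw new Error("Expected 7 fields, got " + parts.length);
    }
    // "127,14" -> 127.14
    var toNumber = function (s) { return parseFloat(s.replace(",", ".")); };
    return {
        // "2015-03-09 13:23:00" -> ISO 8601 with an explicit UTC marker
        date: new Date(parts[0].replace(" ", "T") + "Z"),
        symbol: parts[1],
        open: toNumber(parts[2]),
        high: toNumber(parts[3]),
        low: toNumber(parts[4]),
        close: toNumber(parts[5]),
        volume: parseInt(parts[6], 10)
    };
}

var tick = parseTick("2015-03-09 13:23:00|AAPL|127,14|127,17|127,12|127,15|19734");
console.log(tick.date.toISOString(), tick.symbol, tick.close, tick.volume);
```

If the feed reports local exchange time instead, the offset would need to be applied before storing the date.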

I would also at least flatten out your "intradayQuotesSchema" a little by removing the reference to the other collection and just putting the data in there. You would still need a lookup on insertion, but the overhead of the additional populate on read would seem to be more costly than the storage overhead:

var intradayQuotesSchema = Schema({
    symbol: {
        name: String,
        code: String
    },
    day: Date,
    quotes: [quoteSchema]
});

It depends on your usage patterns, but it's likely to be more effective that way.

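The rest really comes down to three chained updates: create the "day" document if it is missing, overwrite the quote if that minute is already stored, and otherwise append it: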

stream.on("data", function(data) {

    // "code" avoids shadowing the "symbol" model used below
    var code = data.symbol,
        // truncate the timestamp to midnight UTC of the same day
        myDay = new Date(
            data.date.valueOf() -
                ( data.date.valueOf() % ( 1000 * 60 * 60 * 24 ) ));
    delete data.symbol;

    symbol.findOne({ "code": code }, function(err, stock) {

        // 1. create the "day" document with this quote if it does not exist
        intraDayQuote.findOneAndUpdate(
            { "symbol.code": code, "day": myDay },
            { "$setOnInsert": {
               "symbol.name": stock.name,
               "quotes": [data]
            }},
            { "upsert": true },
            function(err, doc) {
                // 2. overwrite the quote if this minute is already stored
                intraDayQuote.findOneAndUpdate(
                    {
                        "symbol.code": code,
                        "day": myDay,
                        "quotes.date": data.date
                    },
                    { "$set": { "quotes.$": data } },
                    function(err, doc) {
                        // 3. otherwise append it
                        intraDayQuote.findOneAndUpdate(
                            {
                                "symbol.code": code,
                                "day": myDay,
                                "quotes.date": { "$ne": data.date }
                            },
                            { "$push": { "quotes": data } },
                            function(err, doc) {
                                // all three updates have been applied
                            }
                       );
                    }
                );
            }
        );
    });
});

If you don't actually need the modified document in the response then you would get some benefit by implementing the Bulk Operations API here and sending all updates in this package within a single database request:

stream.on("data", function(data) {

    // "code" avoids shadowing the "symbol" model used below
    var code = data.symbol,
        // truncate the timestamp to midnight UTC of the same day
        myDay = new Date(
            data.date.valueOf() -
                ( data.date.valueOf() % ( 1000 * 60 * 60 * 24 ) ));
    delete data.symbol;

    symbol.findOne({ "code": code }, function(err, stock) {
        var bulk = intraDayQuote.collection.initializeOrderedBulkOp();

        // 1. create the "day" document with this quote if it does not exist
        bulk.find({ "symbol.code": code, "day": myDay })
            .upsert().updateOne({
                "$setOnInsert": {
                    "symbol.name": stock.name,
                    "quotes": [data]
                }
            });

        // 2. overwrite the quote if this minute is already stored
        bulk.find({
            "symbol.code": code,
            "day": myDay,
            "quotes.date": data.date
        }).updateOne({
            "$set": { "quotes.$": data }
        });

        // 3. otherwise append it
        bulk.find({
            "symbol.code": code,
            "day": myDay,
            "quotes.date": { "$ne": data.date }
        }).updateOne({
            "$push": { "quotes": data }
        });

        bulk.execute(function(err, result) {
            // maybe do something with the response
        });
    });
});

The point is that only one of the statements there will actually modify data, and since this is all sent in the same request there is less back and forth between the application and server.
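For reference, the same three statements can also be expressed as a plain array of operations in the format accepted by Model.bulkWrite() in newer Mongoose versions. The sketch below only builds that array; buildTickOps is a hypothetical helper, and the symbol name and tick values are illustrative:

```javascript
// Hypothetical helper: builds the three ordered updates for one tick.
// `code`, `name`, `day` and `tick` are assumed to come from the stream handler.
function buildTickOps(code, name, day, tick) {
    var dayFilter = { "symbol.code": code, "day": day };
    return [
        // 1. Create the "day" document, seeded with this tick, if it is missing.
        { updateOne: {
            filter: dayFilter,
            update: { "$setOnInsert": { "symbol.name": name, "quotes": [tick] } },
            upsert: true
        }},
        // 2. Overwrite the stored quote when this minute already exists.
        { updateOne: {
            filter: Object.assign({ "quotes.date": tick.date }, dayFilter),
            update: { "$set": { "quotes.$": tick } }
        }},
        // 3. Append the quote when this minute is not stored yet.
        { updateOne: {
            filter: Object.assign({ "quotes.date": { "$ne": tick.date } }, dayFilter),
            update: { "$push": { "quotes": tick } }
        }}
    ];
}

var tick = { date: new Date("2015-03-09T13:23:00.000Z"), close: 127.15, volume: 19734 };
var day = new Date("2015-03-09T00:00:00.000Z");
var ops = buildTickOps("AAPL", "Apple Inc.", day, tick);

console.log(ops.length);
// With a Mongoose model this array could then be sent in one request:
// intraDayQuote.bulkWrite(ops, { ordered: true })
```

Keeping the operations ordered matters here, because the second and third filters rely on the state left by the upsert before them.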

The alternate case is that it might just be simpler here to have the actual data referenced in another collection. This then just becomes a simple matter of processing upserts:

var intradayQuotesSchema = Schema({
    symbol: {
        name: String,
        code: String
    },
    day: Date,
    quotes: [{ type: Schema.Types.ObjectId, ref: "quote" }]
});


// and in the stream processor

stream.on("data", function(data) {

    // "code" avoids shadowing the "symbol" model used below
    var code = data.symbol,
        myDay = new Date(
            data.date.valueOf() -
                ( data.date.valueOf() % ( 1000 * 60 * 60 * 24 ) ));
    delete data.symbol;

    symbol.findOne({ "code": code }, function(err, stock) {
        // upsert the quote and get the stored document back for its _id
        quote.findOneAndUpdate(
            { "date": data.date },
            { "$setOnInsert": data },
            { "upsert": true, "new": true },
            function(err, doc) {
                // the array holds ObjectId references, which form a true set,
                // so $addToSet is safe even when the quote already existed
                intraDayQuote.update(
                    { "symbol.code": code, "day": myDay },
                    {
                        "$setOnInsert": {
                            "symbol.name": stock.name
                        },
                        "$addToSet": { "quotes": doc._id }
                    },
                    { "upsert": true },
                    function(err, num, raw) {

                    }
                );
            }
        );
    });
});

It really comes down to how important it is to you to have the quote data nested within the "day" document. The main distinction is whether you want to query those documents based on some of the "quote" fields, or are willing to live with the overhead of using .populate() to pull in the "quotes" from the other collection.

Of course, if you use references and the quote data is important to your query filtering, you can always query that collection for the matching _id values first and then use an $in query on the "day" documents to match only the days that contain those "quote" documents.
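A minimal sketch of the second step of that lookup. dayFilterFor is a hypothetical helper, and the matched array stands in for the result of the first query on the quote collection:

```javascript
// Hypothetical helper: turns matched "quote" documents into a filter
// for the "day" collection.
function dayFilterFor(matchedQuotes) {
    var ids = matchedQuotes.map(function (q) { return q._id; });
    return { "quotes": { "$in": ids } };
}

// Stand-in for the result of step one, for example:
//   quote.find({ volume: { $gt: 10000 } }, "_id")
// Real _id values would be ObjectIds; strings are used only for illustration.
var matched = [ { _id: "quote-1" }, { _id: "quote-2" } ];

var filter = dayFilterFor(matched);
console.log(JSON.stringify(filter));
// Step two would then be: intraDayQuote.find(filter)
```

This costs two round trips, which is the price of keeping the quotes in their own collection.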

Which path you take is a big decision, and what matters most is how your application uses the data. Hopefully this guides you on the general concepts behind what you want to achieve.

P.S. Unless you are "sure" that your source data is always a date rounded to an exact "minute", you probably want to apply the same kind of date rounding math used to get the discrete "day" to the quote date as well.
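A minimal sketch of that rounding for both units. floorDate is a hypothetical helper name. The divisor has to be a single precomputed or parenthesised value, because % and * share the same precedence and would otherwise be evaluated left to right:

```javascript
var MINUTE = 1000 * 60;
var DAY = MINUTE * 60 * 24;

// Hypothetical helper: rounds a Date down to a multiple of `unitMs`
// milliseconds since the Unix epoch, so a "day" here means a UTC day.
function floorDate(date, unitMs) {
    var ms = date.valueOf();
    return new Date(ms - (ms % unitMs));
}

var received = new Date("2015-03-09T13:23:41.250Z");

console.log(floorDate(received, MINUTE).toISOString()); // 2015-03-09T13:23:00.000Z
console.log(floorDate(received, DAY).toISOString());    // 2015-03-09T00:00:00.000Z
```

Using the floored minute as the quote date keeps the "quotes.date" match reliable when the feed occasionally delivers seconds or milliseconds.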
