MongoDB: How to rename a field using regex

Question

I have a field in my documents, that is named after its timestamp, like so:

{
    _id: ObjectId("53f2b954b55e91756c81d3a5"),
    domain: "example.com",
    "2014-08-07 01:25:08": {
        A: [
            "123.123.123.123"
        ],
        NS: [
            "ns1.example.com.",
            "ns2.example.com."
        ]
    }
}

This is very impractical for queries, since every document has a different timestamp. Therefore, I want to rename this field, for all documents, to a fixed name. However, I need to be able to match the field names using regex, because they are all different.

I tried doing this, but this is an illegal query.

db['my_collection'].update({}, {$rename:{ /2014.*/ :"201408"}}, false, true);

Does someone have a solution for this problem?

Solution based on Neil Lunn's answer:

conn = new Mongo();
db = conn.getDB("my_db");

var bulk = db['my_coll'].initializeOrderedBulkOp();
var counter = 0;

db['my_coll'].find().forEach(function(doc) {

    for (var k in doc) {
        // Match on the field name, which is the timestamp
        if (k.match(/^2014.*/)) {
            print("replacing " + k);
            var unset = {};
            unset[k] = 1;
            bulk.find({ "_id": doc._id }).updateOne({ "$unset": unset, "$set": { WK1: doc[k] } });
            counter++;

            // Flush only right after queuing, so an empty batch is never executed
            if (counter % 1000 == 0) {
                bulk.execute();
                bulk = db['my_coll'].initializeOrderedBulkOp();
            }
        }
    }

});

if ( counter % 1000 != 0 )
    bulk.execute();

Recommended answer

This is not a mapReduce operation, unless you want a new collection that consists only of the _id and value fields produced by mapReduce output, much like:

{
    "_id": ObjectId("53f2b954b55e91756c81d3a5"),
    "value": {
        "domain": "example.com",
        ...
    }
}

Which at best is a kind of "server side" reworking of your collection, but of course not in the structure you want.

While there are ways to execute all of the code in the server, please don't try to do so unless you are really in a spot. These ways generally don't play well with sharding anyway, which is usually where people "really are in a spot" for the sheer size of records.

When you want to change things and do it in bulk, you generally have to "loop" the collection results and process the updates while having access to the current document information. That is, in the case where your "update" is "based on" information already contained in fields or structure of the document.

There is therefore no "regex replace" operation available, and there certainly is not one for renaming a field. So let's loop with bulk operations for the "safest" form of doing this without running the code all on the server.

var bulk = db.collection.initializeOrderedBulkOp();
var counter = 0;

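db.collection.find().forEach(function(doc) {

    for ( var k in doc ) {
        // Test the field name (not its value) against the pattern
        if ( /^2014/.test(k) ) {
            // Both operator sub-documents must exist before keys are assigned
            var update = { "$unset": {}, "$set": {} };
            update["$unset"][k] = 1;
            update["$set"][ k.replace(/(\d+)-(\d+)-(\d+).+/,"$1$2$3") ] = doc[k];
            bulk.find({ "_id": doc._id }).updateOne(update);
            counter++;

            // Flush every 1000 queued updates; never execute an empty batch
            if ( counter % 1000 == 0 ) {
                bulk.execute();
                bulk = db.collection.initializeOrderedBulkOp();
            }
        }
    }

});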

if ( counter % 1000 != 0 )
    bulk.execute();

So the main things there are the $unset operator to remove the existing field and the $set operator to create the new field in the document. You need the document content to examine and use both the "field name" and the "value", hence the loop; there is no other way.
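
The update document for a single record can also be built by a small helper. This is only a minimal sketch in plain JavaScript: buildRenameUpdate and its parameters are hypothetical names, not part of the MongoDB API, and it assumes each document has at most one matching field, since every match maps to the same target name.

```javascript
// Hypothetical helper (not a MongoDB API): builds the $unset/$set update
// document for the first field whose NAME matches `pattern`.
function buildRenameUpdate(doc, pattern, newName) {
    for (var k in doc) {
        if (pattern.test(k)) {
            var update = { "$unset": {}, "$set": {} };
            update["$unset"][k] = 1;          // drop the timestamp-named field
            update["$set"][newName] = doc[k]; // keep its value under the fixed name
            return update;
        }
    }
    return null; // nothing to rename in this document
}

// Sample document shaped like the one in the question
var doc = {
    domain: "example.com",
    "2014-08-07 01:25:08": { A: ["123.123.123.123"] }
};
var update = buildRenameUpdate(doc, /^2014/, "201408");
```

Inside the loop, a non-null result could be passed to bulk.find({ "_id": doc._id }).updateOne(update).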

If you don't have MongoDB 2.6 or greater on the server then the looping concept still remains without the immediate performance benefit. You can look into things like .eval() in order to process on the server, but as the documentation suggests, it really is not recommended. Use with caution if you must.
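
Without the Bulk API, the same loop can issue one update per matching document. The sketch below assumes a collection object that exposes find().forEach() and update(query, update) as in the mongo shell. renameMatchingFields is a hypothetical name, the fixed target name is supplied by the caller, and it assumes one timestamp field per document.

```javascript
// Hypothetical helper: one update round trip per matching document.
// `coll` is assumed to behave like a mongo shell collection.
function renameMatchingFields(coll, pattern, newName) {
    var renamed = 0;
    coll.find().forEach(function(doc) {
        for (var k in doc) {
            if (pattern.test(k)) {
                var unset = {};
                unset[k] = 1;
                var set = {};
                set[newName] = doc[k];
                coll.update({ "_id": doc._id }, { "$unset": unset, "$set": set });
                renamed++;
                break; // assumption: a single timestamp field per document
            }
        }
    });
    return renamed; // number of documents for which an update was issued
}
```

In the shell this might be called as renameMatchingFields(db['my_collection'], /^2014/, "201408"). Each document costs a separate round trip, which is the performance cost mentioned above.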
