如何在整个集合的 int 中转换包含字符的字符串? [英] How to convert a string with characters in the int for the entire collection?

查看:65
本文介绍了如何在整个集合的 int 中转换包含字符的字符串?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个类似外观的集合:

I have a collection of a similar look:

_id:5d0fe0dcfd8ea94eb4633222
Category:"Stripveiling (Nederlands)"
Category url:"https://www.catawiki.nl/a/11-stripveiling-nederlands"
Lot title:"Erwin Sels (Ersel) - Originele pagina"
Seller name:"Stripwereld"
Seller country:"Nederland"
Bids count:21
Winning bid:"€ 135"
Bid amount:"Closed"
Lot image:"https://assets.catawiki.nl/assets/2011/11/17/7/4/c/74c53540-f390-012e-..."

我需要将中标"字段更改为 int.即,移除货币符号并将整个集合的字符串转换为 int.

I need to change the "Winning bid" field to a int. That is, remove the currency sign and convert from string to int for the entire collection.

在文档中没有任何地方我找不到怎么做,我真的必须用 Python 获取每个值,删除货币符号并使用方法更新来完成吗?我有将近 8,000,000 条记录,会很长.

Nowhere in the documentation I could not find how to do it, do I really have to take every value with Python, remove the currency symbol and use the method update to do it? I have almost 8,000,000 records, it will be long.

我如何使用收集方法来做到这一点?或者使用 Python 执行此操作的最快选项是什么?

How can I do this with the collection method? Or what is the quickest option to do this with Python?

推荐答案

如果要转换整个集合,可以使用聚合管道来完成.

If you want to convert the entire collection, you can do it with Aggregation pipeline.

您需要使用 $substr$toInt($toDouble$convert<将货币转换为字符串/code> 任何适合您的情况)在 $project 阶段和 $out 作为聚合的最后阶段.$out 将聚合管道的结果写入给定的集合名称.

You need to convert the currency to string using $substr and $toInt( or $toDouble, or $convert whatever suits your case) in the $project stage and $out as your last stage of aggregation. $out writes the result of the aggregtion pipeline to the given collection name.

但是在使用 $out 时要小心.根据官方 mongodb 文档:

But be careful while using $out. According to official mongodb documentation :

创建新收藏

$out 操作在当前数据库中创建一个新集合(如果一个集合不存在).这在聚合完成之前,集合是不可见的.如果聚合失败,MongoDB 不会创建集合.

The $out operation creates a new collection in the current database if one does not already exist. The collection is not visible until the aggregation completes. If the aggregation fails, MongoDB does not create the collection.

替换现有集合

如果 $out 操作指定的集合已经存在,则在完成后聚合,$out 阶段原子地替换现有的集合与新的结果集合.具体来说,$out操作:

If the collection specified by the $out operation already exists, then upon completion of the aggregation, the $out stage atomically replaces the existing collection with the new results collection. Specifically, the $out operation:

  1. 创建一个临时集合.
  2. 复制现有索引集合到临时集合.
  3. 将文档插入临时集合.
  4. 调用 db.collection.renameCollectiondropTarget: true 将临时集合重命名为目标收藏.

$out 操作不会改变存在于以前的收藏.如果聚合失败,$out 操作不更改预先存在的集合.

The $out operation does not change any indexes that existed on the previous collection. If the aggregation fails, the $out operation makes no changes to the pre-existing collection.

试试这个:

db.collection_name.aggregate([
    {
        $project: {
            category : "$category",
            category_name : "$category_name",
            lot_title : "$lot_title",
            seller_name : "$seller_name",
            seller_country : "$seller_country",
            bid_count : "$bid_count",
            winning_bid : { $toInt : {$substr : ["$winning_bid",2,-1]}},
            bid_amount : "$bid_amount",
            lot_image : "$lot_image"
        }
    },{
        $out : "collection_name"
    }
])

您可能需要使用 allowDiskUse : true 作为聚合管道的选项,因为您有大量文档,并且可能超过 16MB mongodb 限制.

you might need to use allowDiskUse : true as an option to aggregation pipeline, as you have a lots of documents, and it may surpass 16MB mongodb limit.

不要忘记用实际集合名称替换 collection_name ,并在集合中包含您需要的 $project 阶段中的所有必填字段.并且请首先使用不同的 temporary_collection 或仅通过删除 $out 阶段并检查 aggregation 管道的结果来仔细检查该值.

Don't forget to replace collection_name with actual collection name , and include all the required field in the $project stage which you need in the collection. And please double check the value first either with a different temporary_collection or just by removing the $out stage and checking the result of aggregation pipeline.

有关详细信息,请阅读官方 mongodb 文档 $out$toInt, $toDouble, $convert, $substrallowDiskUse.

For detailed information read official mongodb documentation $out, $toInt, $toDouble, $convert, $substr and allowDiskUse.

这篇关于如何在整个集合的 int 中转换包含字符的字符串?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆