How to purge old content in Firebase Realtime Database


Question

I am using Firebase Realtime Database, and over time a lot of stale data has accumulated in it. I have written a script to delete the stale content.

My node structure looks something like this:

store
  - {store_name}
    - products
      - {product_name}
        - data
          - {date} e.g. 01_Sep_2017
            - some_event

Data scale

#Stores: ~110K
#Products: ~25

Context

I want to clean up all data that is about 30 months old. I tried the following approach:

For each store, traverse all the products and, for each date, delete the node.

I ran ~30 threads/script instances, each responsible for deleting one particular date of data in that month. The whole script takes ~12 hours to delete one month of data with the above structure.

I placed a cap on the number of pending calls in each script. The logs show that each script reaches the cap very quickly, and that delete calls are fired much faster than they complete, so Firebase becomes the bottleneck.

I am clearly running the purge script on the client side; to perform better, the script should run close to the data to save network round-trip time.

Q1. How can I delete old Firebase nodes efficiently?

Q2. Is there a way to set a TTL on each node so that it is cleaned up automatically?

Q3. I have confirmed for multiple nodes that the data has been deleted, but the Firebase console does not show a decrease in data. I also took a backup of the data, and it still shows some data that was not there when I checked the nodes manually. I want to know the reason behind this inconsistency.

Does Firebase do soft deletes? That is, when we take backups, is the data actually still there but not visible via the Firebase SDK or the Firebase console, because they process soft deletes while backups do not?

Q4. For the whole time my script is running, the bandwidth graph rises continuously. With the script below I only fire delete calls and do not read any data, yet I still see bandwidth consistent with database reads. Have a look at this screenshot.

Is this because of the callbacks for deleted nodes?

var stores = [];
var storeIndex = 0;
var products = [];
var productIndex = -1;

const month = 'Oct';
const year = 2017;

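if (process.argv.length < 4) {
  console.log("Usage: node purge.js $beginDate $endDate e.g. node purge.js 1 2 | Exiting..");
  process.exit();
}

// Parse the day-of-month arguments as numbers so comparisons and increments are numeric
var beginDate = parseInt(process.argv[2], 10);
var endDate = parseInt(process.argv[3], 10);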

var numPendingCalls = 0;

const maxPendingCalls = 500;

/**
 * Url Pattern: /store/{domain}/products/{product_name}/data/{date}
 * date Pattern: 01_Jan_2017
 */
function deleteNode() {
  var storeName = stores[storeIndex],
    productName = products[productIndex],
    date = (beginDate < 10 ? '0' + beginDate : beginDate) + '_' + month + '_' + year;

  numPendingCalls++;

  db.ref('store')
    .child(storeName)
    .child('products')
    .child(productName)
    .child('data')
    .child(date)
    .remove(function() {
      numPendingCalls--;
    });
}

function deleteData() {
  productIndex++;

  // When all products for a particular store are complete, start for the new store for given date
  if (productIndex === products.length) {
    if (storeIndex % 1000 === 0) {
      console.log('Script: ' + beginDate, 'PendingCalls: ' + numPendingCalls, 'StoreIndex: ' + storeIndex, 'Store: ' + stores[storeIndex], 'Time: ' + (new Date()).toString());
    }

    productIndex = 0;
    storeIndex++;
  }

  // When all stores have been completed, start deleting for next date
  if (storeIndex === stores.length) {
    console.log('Script: ' + beginDate, 'Successfully deleted data for date: ' + beginDate + '_' + month + '_' + year + '. Time: ' + (new Date()).toString());
    beginDate++;
    storeIndex = 0;
  }

  // When you have reached endDate, all data has been deleted call the original callback
  if (beginDate > endDate) {
    console.log('Script: ' + beginDate, 'Deletion script finished successfully at: ' + (new Date()).toString());
    process.exit();
    return;
  }

  deleteNode();
}

function init() {
  console.log('Script: ' + beginDate, 'Deletion script started at: ' + (new Date()).toString());

  getStoreNames(function() {
    getProductNames(function() {
      setInterval(function() {
        if (numPendingCalls < maxPendingCalls) {
          deleteData();
        }
      }, 0);
    });
  });
}

PS: This is not the exact structure I have, but it is very similar (I have changed the node names and tried to keep the example realistic).

Recommended answer

  1. Whether the deletes can be done more efficiently depends on how you do them now. Since you didn't share the minimal code that reproduces your current behavior, it's hard to say how to improve it.
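One option that may reduce the number of round trips, offered here only as a sketch and not as something the answer itself prescribes, is a multi-location update: passing `null` for a path in `update()` deletes the node at that path, so all products of one store can be cleared for a given date in a single call. The helper name `buildDeleteUpdate` and the sample arguments are hypothetical; the path layout is taken from the question, and an initialized `db` is assumed.

```javascript
// Sketch only: builds one multi-location update that removes a single date
// node under every product of one store. Path layout follows the question:
//   store/{store_name}/products/{product_name}/data/{date}
// buildDeleteUpdate is a hypothetical helper, not part of the Firebase SDK.
function buildDeleteUpdate(storeName, productNames, dateKey) {
  const updates = {};
  for (const productName of productNames) {
    // Writing null to a path in update() deletes the node at that path.
    updates[`store/${storeName}/products/${productName}/data/${dateKey}`] = null;
  }
  return updates;
}

// Assumed usage with an already initialized `db` (not shown here):
//   db.ref().update(buildDeleteUpdate('some_store', products, '01_Oct_2017'));
```

With ~25 products per store this turns roughly 25 `remove()` calls into one `update()` per store and date. Whether that actually shortens the total run time would need to be measured, and batches should be kept modest in size.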

  2. There is no support for a time-to-live property on nodes in the Realtime Database. Typically, developers do the clean-up in an administrative program/script that runs periodically. The more frequently you run the cleanup script, the less work it has to do, and thus the faster it will be.

See also:

  • Delete firebase data older than 2 hours
  • How to delete firebase data after "n" days
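A periodic cleanup script has to decide which date nodes count as stale. The following is a minimal sketch of that check, assuming the `DD_Mon_YYYY` key format from the question (for example `01_Sep_2017`) and a cutoff expressed in months. The names `parseDateKey` and `isStale` are hypothetical, and the sketch does not talk to Firebase at all.

```javascript
const MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'];

// Parses a key such as "01_Sep_2017" into a UTC Date.
// Returns null if the key does not match the DD_Mon_YYYY pattern.
function parseDateKey(dateKey) {
  const match = /^(\d{2})_([A-Za-z]{3})_(\d{4})$/.exec(dateKey);
  if (!match) return null;
  const monthIndex = MONTHS.indexOf(match[2]);
  if (monthIndex === -1) return null;
  return new Date(Date.UTC(Number(match[3]), monthIndex, Number(match[1])));
}

// Returns true when the key's date lies more than `months` months before `now`.
function isStale(dateKey, months, now = new Date()) {
  const date = parseDateKey(dateKey);
  if (date === null) return false; // leave unrecognized keys untouched
  // Date.UTC normalizes negative month values into earlier years.
  const cutoff = new Date(Date.UTC(
    now.getUTCFullYear(), now.getUTCMonth() - months, now.getUTCDate()));
  return date < cutoff;
}
```

A scheduled job could call `isStale(key, 30)` for each date key under a product and delete only the keys for which it returns true. Keys that do not parse are deliberately kept, so a naming mistake never causes data loss.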

  3. Firebase actually deletes the data from disk when you tell it to. There is no way to retrieve it through the API, since it is really gone. But if you have a backup from a previous day, the data will of course still be there.
