Deleting very large collections in Firestore


Problem Description

I need to delete very large collections in Firestore.

Initially I used client-side batch deletes, but then the documentation changed and started to discourage that with comments like:

Deleting collections from an iOS client is not recommended.

Deleting collections from a Web client is not recommended.

Deleting collections from an Android client is not recommended.

https://firebase.google.com/docs/firestore/manage-data/delete-data?authuser=0

I switched to a cloud function as recommended in the docs. The cloud function gets triggered when a document is deleted and then deletes all documents in a subcollection, as proposed in the "Node.js" section of the link above.

The problem I am running into now is that the cloud function seems to be able to manage around 300 deletes per second. With the maximum runtime of a cloud function being 9 minutes, I can manage up to 162000 deletes this way. But the collection I want to delete currently holds 237560 documents, which makes the cloud function time out about halfway through.
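For reference, a minimal sketch of the kind of onDelete-triggered function described here, using firebase-functions (v1 API) and firebase-admin in TypeScript; the collection names "parents" and "items" are placeholders, not from the original post:

```ts
// Sketch only: onDelete trigger that batch-deletes a subcollection.
// "parents" and "items" are placeholder names.
import * as functions from "firebase-functions";
import * as admin from "firebase-admin";

admin.initializeApp();
const db = admin.firestore();

export const cleanUpSubcollection = functions
  .runWith({ timeoutSeconds: 540 }) // the 9-minute maximum mentioned above
  .firestore.document("parents/{parentId}")
  .onDelete(async (_snap, context) => {
    const itemsRef = db.collection(`parents/${context.params.parentId}/items`);

    // Delete in batches of up to 500 (the Firestore write-batch limit)
    // until the subcollection is empty. At ~300 deletes/second this is
    // exactly the loop that times out once the collection grows too large.
    while (true) {
      const snapshot = await itemsRef.limit(500).get();
      if (snapshot.empty) {
        break;
      }
      const batch = db.batch();
      snapshot.docs.forEach((doc) => batch.delete(doc.ref));
      await batch.commit();
    }
  });
```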

I cannot trigger the cloud function again with an onDelete trigger on the parent document, as this one has already been deleted (which triggered the initial call of the function).

So my question is: What is the recommended way to delete large collections in Firestore? According to the docs it's not client side but server side, but the recommended solution does not scale for large collections.

Thanks!

Recommended Answer

When you have more work than can be performed in a single Cloud Function execution, you will need to either find a way to shard that work across multiple invocations, or continue the work in subsequent invocations after the first. This is not trivial, and you have to put some thought and work into constructing the best solution for your particular situation.

For a sharding solution, you will have to figure out how to split up the document deletes ahead of time and have your master function kick off subordinate functions (probably via pubsub), passing each one the arguments it needs to figure out which shard to delete. For example, you might kick off one function whose sole purpose is to delete documents that begin with 'a', another for 'b', and so on, each querying for its documents and then deleting them.
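A minimal sketch of that fan-out, assuming the shards are keyed by the first character of the document ID, a flat collection named "items", and a Pub/Sub topic named "delete-shard" (all of these are assumptions, not details from the answer):

```ts
// Sketch only: master function fans out one Pub/Sub message per shard;
// a worker function deletes the documents whose IDs fall in that shard.
import * as functions from "firebase-functions";
import * as admin from "firebase-admin";
import { PubSub } from "@google-cloud/pubsub";

admin.initializeApp();
const db = admin.firestore();
const pubsub = new PubSub();

// Fan out: one message per leading character of the document ID.
export const shardedDeleteMaster = functions.https.onRequest(async (_req, res) => {
  const prefixes = "abcdefghijklmnopqrstuvwxyz0123456789".split("");
  await Promise.all(
    prefixes.map((prefix) =>
      pubsub.topic("delete-shard").publishMessage({ json: { prefix } })
    )
  );
  res.status(200).send(`Kicked off ${prefixes.length} shard deletes`);
});

// Worker: delete every document whose ID starts with the given prefix.
export const shardedDeleteWorker = functions
  .runWith({ timeoutSeconds: 540 })
  .pubsub.topic("delete-shard")
  .onPublish(async (message) => {
    const { prefix } = message.json as { prefix: string };
    const shardQuery = db
      .collection("items")
      .orderBy(admin.firestore.FieldPath.documentId())
      .startAt(prefix)
      .endAt(prefix + "\uf8ff"); // high code point bounds the prefix range

    while (true) {
      const snapshot = await shardQuery.limit(500).get();
      if (snapshot.empty) {
        break;
      }
      const batch = db.batch();
      snapshot.docs.forEach((doc) => batch.delete(doc.ref));
      await batch.commit();
    }
  });
```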

For a continuation solution, you might just start deleting documents from the beginning, go for as long as you can before timing out, remember where you left off, then kick off a subordinate function to pick up where the prior invocation stopped.
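A minimal sketch of the continuation idea, assuming a Pub/Sub topic named "continue-delete" (an assumption). Because deleted documents no longer match the query, simply re-querying from the start of the collection effectively resumes where the previous invocation left off:

```ts
// Sketch only: delete batches until close to the deadline, then re-publish
// the same message so a fresh invocation continues the work.
import * as functions from "firebase-functions";
import * as admin from "firebase-admin";
import { PubSub } from "@google-cloud/pubsub";

admin.initializeApp();
const db = admin.firestore();
const pubsub = new PubSub();

const SAFETY_MARGIN_MS = 60_000; // stop ~1 minute before the 9-minute limit

export const continuationDelete = functions
  .runWith({ timeoutSeconds: 540 })
  .pubsub.topic("continue-delete")
  .onPublish(async (message) => {
    const { collectionPath } = message.json as { collectionPath: string };
    const deadline = Date.now() + 540_000 - SAFETY_MARGIN_MS;
    const collRef = db.collection(collectionPath);

    while (Date.now() < deadline) {
      const snapshot = await collRef.limit(500).get();
      if (snapshot.empty) {
        return; // nothing left: the whole collection is gone
      }
      const batch = db.batch();
      snapshot.docs.forEach((doc) => batch.delete(doc.ref));
      await batch.commit();
    }

    // Out of time but documents remain: hand off to the next invocation.
    await pubsub
      .topic("continue-delete")
      .publishMessage({ json: { collectionPath } });
  });
```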

You should be able to use one of these strategies to limit the amount of work done per function invocation, but the implementation details are entirely up to you to work out.

If, for some reason, neither of these strategies is viable, you will have to manage your own server (perhaps via App Engine) and send it a message (via pubsub) to perform a single unit of long-running work in response to a Cloud Function.
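A rough sketch of that split, assuming the message reaches an App Engine service through a Pub/Sub push subscription; the endpoint path, topic, and payload fields are made up for illustration, and in practice you would also need to mind the Pub/Sub acknowledgement deadline and the request deadline of your App Engine scaling class:

```ts
// Sketch only: App Engine worker receiving a Pub/Sub push request and
// performing the long-running delete outside the Cloud Functions time limit.
import express from "express";
import * as admin from "firebase-admin";

admin.initializeApp();
const db = admin.firestore();
const app = express();
app.use(express.json());

// Pub/Sub push subscriptions POST a base64-encoded message body.
app.post("/pubsub/delete-collection", async (req, res) => {
  const payload = JSON.parse(
    Buffer.from(req.body.message.data, "base64").toString()
  ) as { collectionPath: string };

  const collRef = db.collection(payload.collectionPath);
  // No Cloud Functions deadline here, so loop until the collection is empty.
  while (true) {
    const snapshot = await collRef.limit(500).get();
    if (snapshot.empty) {
      break;
    }
    const batch = db.batch();
    snapshot.docs.forEach((doc) => batch.delete(doc.ref));
    await batch.commit();
  }

  res.status(204).send(); // acknowledge the Pub/Sub message
});

app.listen(Number(process.env.PORT) || 8080);
```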
