Firebase: queries on large datasets


Question

I'm using Firebase to store user profiles. I tried to keep each profile as small as possible (following the good practices advised in the documentation about structuring data), but with more than 220K user profiles, downloading them all as JSON still comes to 150MB. And of course it will keep growing, as I intend to have a lot more users :)

I can't do queries on those user profiles anymore because each time I do that, I reach 100% Database I/O capacity and thus some other requests, performed by users currently using the app, end up with errors.

I understand that when using queries, Firebase needs to consider all the data in the list and thus read it all from disk. And 150MB of data seems to be too much.

So is there an actual limit before reaching 100% Database I/O capacity? And what exactly is the usefulness of Firebase queries in that case? If I only had a small amount of data, I wouldn't really need queries, since I could easily download everything. But now that I have a lot of data, I can't use queries anymore, just when I need them the most...

Answer

The core problem here isn't the query or the size of the data, it's simply the time required to warm the data into memory (i.e. load it from disk) when it's not being frequently queried. It's likely to be only a development issue, as in production this query would likely be a more frequently used asset.

But if the goal is to improve performance on initial load, the only reasonable answer here is to query on less data. 150MB is significant. Try copying a 150MB file between computers over a wireless network and you'll have some idea what it's like to send it over the internet, or to load it into memory from a file server.

A lot here depends on the use case, which you haven't included.

Assuming you have fairly standard search criteria (e.g. you search on email addresses), you can use indices to store email addresses separately to reduce the data set for your query.

/search_by_email/$user_id/<email address>

Now, rather than the full profile for each record (roughly 700 bytes on average, given 150MB across 220K profiles), you have only the bytes needed to store the email address per record, which is a much smaller payload to warm into memory.
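As a rough illustration of the index idea, here is a minimal TypeScript sketch that derives a `search_by_email`-style node from full profiles and searches only that node. It is an in-memory stand-in, not the Firebase SDK: the `UserProfile` shape and the helpers `buildEmailIndex`, `findUserIdsByEmail` and `jsonSize` are hypothetical names, and it assumes the email address is stored as the value under each user id.

```typescript
// Hypothetical profile shape; only `email` matters for the index.
interface UserProfile {
  email: string;
  displayName: string;
  bio: string;
}

type Profiles = { [userId: string]: UserProfile };
// Assumed layout of the index node: user id -> email address.
type EmailIndex = { [userId: string]: string };

// Derive the small index node from the full profiles.
function buildEmailIndex(profiles: Profiles): EmailIndex {
  const index: EmailIndex = {};
  for (const userId of Object.keys(profiles)) {
    // Normalize so lookups are case-insensitive.
    index[userId] = profiles[userId].email.trim().toLowerCase();
  }
  return index;
}

// Search the index only; the full profiles are never touched.
// (In-memory stand-in for a query against the index node.)
function findUserIdsByEmail(index: EmailIndex, email: string): string[] {
  const wanted = email.trim().toLowerCase();
  return Object.keys(index).filter((userId) => index[userId] === wanted);
}

// Rough payload size: number of characters in the JSON encoding.
function jsonSize(value: unknown): number {
  return JSON.stringify(value).length;
}

// Made-up sample data for illustration.
const profiles: Profiles = {
  u1: { email: "Ada@Example.com", displayName: "Ada", bio: "Likes engines and long walks." },
  u2: { email: "grace@example.com", displayName: "Grace", bio: "Writes compilers for fun." },
};

const emailIndex = buildEmailIndex(profiles);
console.log(findUserIdsByEmail(emailIndex, "ada@example.com"));
console.log("full profiles:", jsonSize(profiles), "index only:", jsonSize(emailIndex));
```

In a real database the index would have to be written alongside each profile so the two stay in sync, and the query would run against the index node instead of the full profile list.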

Assuming you're looking for robust search capabilities, the best answer is to use a real search engine. For example, enable private backups and export to BigQuery, or go with ElasticSearch (see Flashlight for an example).
