Querying large collections in Cosmos DB

Question

We currently have a very large collection in our document DB. We want to be able to filter the collection based on some fields in the documents in the collection.

When I perform this query via the portal it takes a really long time because there is so much data. When I perform this query via a function app, it cuts out after five minutes due to a time-out.

What is the best way to perform this search? Is it possible to perform this search via Application Insights or something of that sort? I am aware that the query itself can take a long time, but it shouldn't be blocking. Querying via the portal blocks all other actions.

Thanks in advance. Regards

Recommended answer

Firstly, you should know that Document DB imposes limits on the response page size. This link summarizes some of those limits: Azure DocumentDb Storage Limits - what exactly do they mean?

Secondly, if you want to query a large amount of data from Document DB, you have to consider query performance. Please refer to this article: Tuning query performance with Azure Cosmos DB.

Looking at the Document DB REST API, you can see several important parameters that have a significant impact on query operations: x-ms-max-item-count and x-ms-continuation.
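As a rough sketch of how these two headers fit together, the hypothetical helper below builds only the pagination headers for a query request. The header names `x-ms-max-item-count` and `x-ms-continuation` come from the REST API; the function name and its parameters are assumptions made for illustration, and a real request also needs authorization and the other required headers, which are not shown here.

```python
from typing import Dict, Optional


def build_paging_headers(max_item_count: int,
                         continuation: Optional[str] = None) -> Dict[str, str]:
    """Hypothetical helper: return the pagination headers for one query request."""
    # Upper bound on the number of documents returned in a single page.
    headers = {"x-ms-max-item-count": str(max_item_count)}
    # Only send a continuation token when resuming after a previous page.
    if continuation:
        headers["x-ms-continuation"] = continuation
    return headers
```

The first request is sent without a token; each later request passes back the `x-ms-continuation` value from the previous response.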

The Azure portal doesn't automatically optimize your SQL for you, so you need to handle this in the SDK or the REST API.

You can set the Max Item Count value and paginate your data using continuation tokens. The Document DB SDK supports reading paginated data seamlessly. You can refer to the Python snippet below:

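# Assumes `client`, `collection_link` and `query` are already defined elsewhere.
options = {'maxItemCount': 10}
q = client.QueryDocuments(collection_link, query, options)
# _fetch_function is a private helper; here it returns a (documents, response headers) pair
results_1 = q._fetch_function(options)
# the continuation token is a string representing a JSON object
token = results_1[1]['x-ms-continuation']
# pass the token back to fetch the next page
results_2 = q._fetch_function({'maxItemCount': 10, 'continuation': token})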
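To read every page rather than just the first two, the same idea can be wrapped in a loop. The following is a minimal sketch, not the SDK's own API: `iter_pages` and `make_fake_fetch` are hypothetical names, and the fetch callable is assumed to return a `(documents, response_headers)` pair like the snippet above. The in-memory fake stands in for a real query so the loop can be run without a database.

```python
from typing import Any, Callable, Dict, Iterator, List, Tuple

# Assumed shape of a page fetcher: options in, (documents, response headers) out.
FetchPage = Callable[[Dict[str, Any]], Tuple[List[Any], Dict[str, str]]]


def iter_pages(fetch_page: FetchPage, max_item_count: int = 10) -> Iterator[List[Any]]:
    """Yield one page of documents at a time until no continuation token is returned."""
    options: Dict[str, Any] = {"maxItemCount": max_item_count}
    while True:
        documents, headers = fetch_page(options)
        yield documents
        token = headers.get("x-ms-continuation")
        if not token:
            break  # no token means this was the last page
        options = {"maxItemCount": max_item_count, "continuation": token}


def make_fake_fetch(items: List[Any]) -> FetchPage:
    """In-memory stand-in for a real query; the token is just the next start index."""
    def fetch(options: Dict[str, Any]) -> Tuple[List[Any], Dict[str, str]]:
        start = int(options.get("continuation", 0))
        end = start + options["maxItemCount"]
        headers = {"x-ms-continuation": str(end)} if end < len(items) else {}
        return items[start:end], headers
    return fetch


pages = list(iter_pages(make_fake_fetch(list(range(25))), max_item_count=10))
```

Because each page is yielded as soon as it arrives, a caller can process results incrementally, or stop early and keep the last token to resume later, instead of waiting for one long-running query to finish.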

Hope it helps you.
