How do I get a continuation token for a bulk INSERT on Azure Cosmos DB?

Question

I want to upload a CSV file that represents 10k documents to be added to my Cosmos DB collection in a manner that's fast and atomic. I have a stored procedure like the following pseudo-code:

function createDocsFromCSV(csv_text) {
    function parse(txt) { /* ... parsing code here ... */ }

    var collection = getContext().getCollection();
    var response = getContext().getResponse();

    var docs_to_create = parse(csv_text);
    for(var ii=0; ii<docs_to_create.length; ii++) {
        var accepted = collection.createDocument(collection.getSelfLink(),
                                                    docs_to_create[ii],
                                                    function(err, doc_created) {
                                                        if(err) throw new Error('Error' + err.message);
                                                    });
        if(!accepted) {
            throw new Error('Timed out creating document ' + ii);
        }
    }
}

When I run it, the stored procedure creates about 1200 documents before timing out (and therefore rolling back and not creating any documents).

Previously I had success updating (instead of creating) thousands of documents in a stored procedure using continuation tokens and this answer as guidance: https://stackoverflow.com/a/34761098/277504. But after searching documentation (e.g. https://azure.github.io/azure-documentdb-js-server/Collection.html) I don't see a way to get continuation tokens from creating documents like I do for querying documents.

Is there a way to take advantage of stored procedures for bulk document creation?

Recommended answer

It's important to note that stored procedures have bounded execution: all operations must complete within the server-specified request timeout. If an operation does not complete within that time limit, the transaction is automatically rolled back.

To simplify handling of time limits, all CRUD (Create, Read, Update, and Delete) operations return a Boolean value that indicates whether the operation will complete. This Boolean value can be used as a signal to wrap up execution and to implement a continuation-based model for resuming execution (this is illustrated in our code samples below). For more details, please refer to the doc.

The bulk-insert stored procedure below implements the continuation model by returning the number of documents successfully created.

Pseudo-code:

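function createDocsFromCSV(csv_text, count) {
    function parse(txt) { /* ... parsing code here ... */ }

    var collection = getContext().getCollection();
    var response = getContext().getResponse();

    var docs_to_create = parse(csv_text);
    // count is the number of documents created by earlier runs (0 on the first run).
    count = count || 0;
    response.setBody(count);
    if(count < docs_to_create.length) tryCreate();

    // Create one document at a time so count is always the index of the next document.
    function tryCreate() {
        var accepted = collection.createDocument(collection.getSelfLink(),
                                                    docs_to_create[count],
                                                    function(err, doc_created) {
                                                        if(err) throw new Error('Error' + err.message);
                                                        count++;
                                                        response.setBody(count);
                                                        if(count < docs_to_create.length) tryCreate();
                                                    });
        // Not accepted: the time limit is close, so stop and report where to resume.
        if(!accepted) response.setBody(count);
    }
}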

Then check the returned document count on the client side and re-run the stored procedure, passing that count as the count parameter, to create the remaining documents. Repeat until the count reaches the number of documents parsed from csv_text. Each run commits as its own transaction, so the import is atomic per run rather than as a whole.
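The client-side loop described above could be sketched as follows. This is a minimal sketch, not a definitive implementation: `bulkCreate`, `executeSproc`, and `totalDocs` are hypothetical names that do not appear in the original answer. `executeSproc` is assumed to be a function you supply that runs the stored procedure with the given parameters and resolves to the count it returned, and `totalDocs` is assumed to be known on the client (for example by counting the CSV rows before uploading).

```javascript
// Hypothetical client-side driver for the continuation model.
// executeSproc(params) is an assumed wrapper around your SDK's "execute stored
// procedure" call; it must resolve to the count the stored procedure returned.
async function bulkCreate(executeSproc, csvText, totalDocs) {
    let count = 0;
    while (count < totalDocs) {
        // Resume from the number of documents created so far.
        const created = await executeSproc([csvText, count]);
        // Stop instead of looping forever if a run makes no progress.
        if (typeof created !== 'number' || created <= count) {
            throw new Error('No progress after ' + count + ' documents');
        }
        count = created;
    }
    return count;
}
```

With the `@azure/cosmos` SDK, `executeSproc` might wrap `container.scripts.storedProcedure('createDocsFromCSV').execute(partitionKey, params)` and return the `resource` field of the response; treat that wiring as an assumption to check against the SDK version you use.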

Hope it helps.
