Querying the same document in parallel in the same API in MongoDB

Problem description

I have an API written in TypeScript, and I tried to run parallel queries against the same document using Promise.allSettled. However, it performs worse, and I suspect the queries run sequentially. Is there a way to perform parallel queries on the same document over the same connection in MongoDB? Here is the code:

console.time("normal");
let normal = await ContentRepo.geBySkillIdWithSourceFiltered(
    [chosenSkillsArr[0].sid!],
    readContentIds,
    body.isVideoIncluded,
    true,
    true
);
console.timeEnd("normal");

console.time("parallel");
const parallel = await Promise.allSettled(
    chosenSkillsArr.map(async (skill: IScrapeSkillDocument) => {
        const result = await ContentRepo.geBySkillIdWithSourceFiltered(
            [skill.sid!],
            readContentIds,
            body.isVideoIncluded,
            true,
            true
        );
    })
);
console.timeEnd("parallel");

Here is the function I call:

async geBySkillIdWithSourceFiltered(
    skillIds: string[],
    contentIds: string[],
    isVideoIncluded?: boolean,
    isCuratorIdFilter?: boolean,
    activeSourceFilter?: boolean
): Promise<IContentWithSource[]> {
    try {
        console.time(`single-${skillIds}`);
        var contents = await ContentM.find({
            $and: [
                { "skills.skillId": { $in: skillIds } },
                { recordStatus: true },
                isCuratorIdFilter ? { curatorId: 0 } : {},
                isVideoIncluded ? {} : { type: contentTypeNumber.read },
                { _id: { $nin: contentIds } },
            ],
        }).exec();
        var items: IContentWithSource[] = [];
        var sourceIds = new Set<string>();
        contents.forEach((content) => {
            if (!this.isEmpty(content.sourceId)) {
                sourceIds.add(content.sourceId!);
            }
        });
        var sources: any = {};
        var sourcesArr = await new SourceRepo().getByIds(
            Array.from(sourceIds)
        );
        sourcesArr.forEach((source) => {
            sources[source._id] = source;
        });

        if (activeSourceFilter) {
            contents
                .map((i) => i.toJSON() as IContentWithSource)
                .map((k) => {
                    if (sources[k.sourceId!].isActive) {
                        k.source = sources[k.sourceId!];
                        items.push(k);
                    }
                });
        } else {
            contents
                .map((i) => i.toJSON() as IContentWithSource)
                .map((k) => {
                    k.source = sources[k.sourceId!];
                    items.push(k);
                });
        }
        console.timeEnd(`single-${skillIds}`);

        return items;
    } catch (err) {
        throw err;
    }
}

The results are:

single-KS120B874P2P6BK1MQ0T: 1872.735ms
normal: 1873.934ms
single-KS120B874P2P6BK1MQ0T: 3369.925ms
single-KS440QS66YCBN23Y8K25: 3721.214ms
single-KS1226Y6DNDT05G7FJ4J: 3799.050ms
parallel: 3800.586ms

Recommended answer

It seems you are running more code in the parallel version:

// The normal version
let normal = await ContentRepo.geBySkillIdWithSourceFiltered(
    [chosenSkillsArr[0].sid!],
    readContentIds,
    body.isVideoIncluded,
    true,
    true
);


// The code inside the parallel version:
chosenSkillsArr.map(async (skill: IScrapeSkillDocument) => {
    const result = await ContentRepo.geBySkillIdWithSourceFiltered(
        [skill.sid!],
        readContentIds,
        body.isVideoIncluded,
        true,
        true
    );
})

Compare [chosenSkillsArr[0].sid!] (a single call) with chosenSkillsArr.map() (one call per skill).

In the parallel version, the function call (ContentRepo.geBySkillIdWithSourceFiltered) sits inside a loop, so it runs once for every element of chosenSkillsArr instead of once in total. That is why it is slower.
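For a like-for-like comparison, both timings should cover the same number of calls. The following is a minimal sketch of that idea, assuming a hypothetical `fakeQuery` stand-in (a timer, not a real MongoDB call) in place of ContentRepo.geBySkillIdWithSourceFiltered; `compareTimings` is likewise an illustrative name.

```typescript
// Hypothetical stand-in for a repository call: resolves after `ms` milliseconds.
// It is NOT a real MongoDB query; it only simulates I/O latency.
function fakeQuery(skillId: string, ms: number): Promise<string> {
    return new Promise((resolve) => setTimeout(() => resolve(skillId), ms));
}

// Runs the same calls one after another, then all at once, and times both.
async function compareTimings(
    skillIds: string[],
    ms: number
): Promise<{ sequentialMs: number; concurrentMs: number }> {
    let start = Date.now();
    for (const id of skillIds) {
        await fakeQuery(id, ms); // each call waits for the previous one
    }
    const sequentialMs = Date.now() - start;

    start = Date.now();
    // map() starts every call immediately; allSettled only waits for them.
    await Promise.allSettled(skillIds.map((id) => fakeQuery(id, ms)));
    const concurrentMs = Date.now() - start;

    return { sequentialMs, concurrentMs };
}

compareTimings(["a", "b", "c"], 50).then((t) => console.log(t));
```

With timers, the concurrent run finishes in roughly the time of one call. With a real database, the gain depends on how the driver and the server handle simultaneous requests, so this sketch only shows how to measure, not what the result will be.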

Like Promise.all, Promise.allSettled waits for multiple promises. It does not care in what order they settle or whether the computations run in parallel. Neither function guarantees concurrency, nor the opposite. Their job is only to ensure that all the promises passed to them are handled.

So you cannot guarantee the parallelism of promise execution yourself.
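One way to see this is that the work behind a promise starts when the promise is created, not when it is handed to Promise.allSettled. A small sketch, using hypothetical `task` and `showEagerStart` helpers built on timers:

```typescript
// Hypothetical task: logs when it starts and finishes, resolving after `ms` milliseconds.
function task(name: string, ms: number, events: string[]): Promise<string> {
    events.push(`${name} started`);
    return new Promise((resolve) =>
        setTimeout(() => {
            events.push(`${name} finished`);
            resolve(name);
        }, ms)
    );
}

async function showEagerStart(): Promise<string[]> {
    const events: string[] = [];
    // Both timers are already running once this array has been built.
    const pending = [task("slow", 40, events), task("fast", 10, events)];
    events.push("allSettled called");
    await Promise.allSettled(pending);
    return events;
}

showEagerStart().then((events) => console.log(events));
```

Both tasks log "started" before Promise.allSettled is even called, and they finish in order of their own duration. Promise.allSettled only collects the outcomes.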

Here is a really interesting article explaining parallelism and Promise.all, and how the browser API differs from the Node.js API installed on your computer in terms of parallelism.

Here is an extract from the article's conclusion:

JavaScript runtime is single-threaded. We do not have access to thread in JavaScript. Even if you have multi-core CPU you still can't run tasks in parallel using JavaScript. But, the browser/NodeJS uses C/C++ (!) where they have access to thread. So, they can achieve parallelism.
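The practical consequence is that synchronous, CPU-bound JavaScript does not overlap, even when wrapped in async functions and passed to Promise.allSettled. A minimal sketch, assuming hypothetical `busyWait`, `cpuTask`, and `runCpuTasks` helpers:

```typescript
// Synchronous busy-wait: occupies the JavaScript thread for about `ms` milliseconds.
function busyWait(ms: number): void {
    const end = Date.now() + ms;
    while (Date.now() < end) {
        // spin without yielding to the event loop
    }
}

// An async function with no await inside still runs its body synchronously.
async function cpuTask(name: string, log: string[]): Promise<void> {
    log.push(`${name} start`);
    busyWait(20);
    log.push(`${name} end`);
}

async function runCpuTasks(): Promise<string[]> {
    const log: string[] = [];
    await Promise.allSettled([cpuTask("a", log), cpuTask("b", log)]);
    return log;
}

runCpuTasks().then((log) => console.log(log));
```

Task "a" runs to completion before task "b" begins. Only work that the runtime hands off outside the JavaScript thread, such as timers or network I/O, can make progress while other JavaScript code runs.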

Side note:

There is one subtle difference:

  1. Promise.all: resolves only when all promises passed to it resolve; otherwise it rejects with the error of the first rejected promise.

  2. Promise.allSettled: always resolves, with an array describing each fulfilled or rejected promise.
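The difference can be sketched with one rejected promise among two fulfilled ones. The `compareSettling` function and its inputs are illustrative only:

```typescript
async function compareSettling(): Promise<{ allError: string; statuses: string[] }> {
    // Factory so that each combinator receives its own fresh promises.
    const makeInputs = (): Promise<string>[] => [
        Promise.resolve("ok"),
        Promise.reject(new Error("boom")),
        Promise.resolve("also ok"),
    ];

    // Promise.all rejects as soon as one input rejects.
    let allError = "";
    try {
        await Promise.all(makeInputs());
    } catch (err) {
        allError = (err as Error).message;
    }

    // Promise.allSettled fulfills with one result object per input.
    const settled = await Promise.allSettled(makeInputs());
    const statuses = settled.map((r) => r.status);

    return { allError, statuses };
}

compareSettling().then((r) => console.log(r));
```

Here Promise.all surfaces the "boom" error, while Promise.allSettled reports the statuses fulfilled, rejected, fulfilled, so the caller can inspect every outcome.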
