Faster Azure blob name search with Python?

Question

I have a list of file names that I need to search for on Azure. Right now, as a noob, I am looping over every blob name and comparing strings, but I think there has to be an easier and faster way to get this done. The current solution is making my HTTP response very slow.

from azure.storage.blob import BlockBlobService

def ifblob_exists(self, filename):
    try:
        container_name = 'xxx'
        AZURE_KEY = 'xxx'
        SAS_KEY = 'xxx'
        ACCOUNT_NAME = 'xxx'
        block_blob_service = BlockBlobService(account_name=ACCOUNT_NAME, account_key=None, sas_token=SAS_KEY, socket_timeout=10000)

        # Walk the full container listing and compare every name.
        generator = block_blob_service.list_blobs(container_name)
        for blob in generator:
            if filename == blob.name:
                print("\t Blob exists :" + " " + blob.name)
                return True

        # Report a miss only after the whole listing has been checked.
        print('Blob does not exist ' + filename)
        return False
    except Exception as e:
        print(e)

Recommended answer

Listing all blobs is a very costly operation inside the Azure Storage infrastructure because it translates into a full scan.
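If a listing cannot be avoided, one pass over it with a set lookup at least avoids repeating the scan for every file name. The following is a minimal sketch under stated assumptions: the listing is treated as any iterable of name strings (for instance `blob.name` for each item returned by `list_blobs`), and `find_listed_names` is a hypothetical helper, not part of the Azure SDK.

```python
from typing import Iterable, Set


def find_listed_names(wanted: Iterable[str], listed_names: Iterable[str]) -> Set[str]:
    """Return the subset of `wanted` that appears in `listed_names`.

    `listed_names` is consumed lazily and only once, so a paged listing
    is not read further than necessary.
    """
    remaining = set(wanted)
    found: Set[str] = set()
    if not remaining:
        return found  # nothing to look for; do not touch the listing

    for name in listed_names:
        if name in remaining:
            remaining.remove(name)
            found.add(name)
            if not remaining:
                break  # every wanted name was seen; stop reading the listing
    return found


if __name__ == "__main__":
    # Stand-in for a container listing (hypothetical names).
    listing = iter(["eula.1028.txt", "eula.1031.txt", "readme.md"])
    print(find_listed_names(["eula.1031.txt", "blob1"], listing))
```

This still pays for the listing itself, so it only makes sense when the number of names to check is large relative to the size of the container.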

The example below efficiently checks whether each blob (a file name, in your case) exists in a given container:

from azure.storage.blob import BlockBlobService
from datetime import datetime

def check_if_blob_exists(container_name: str, blob_names: list):
    start_time = datetime.now()

    if not container_name or container_name.isspace():
        raise ValueError("Container name cannot be none, empty or whitespace.")

    if not blob_names:
        raise ValueError("Block blob names cannot be none or empty.")

    # Create the client once and reuse it for every lookup.
    block_blob_service = BlockBlobService(account_name="{Storage Account Name}", account_key="{Storage Account Key}")

    for blob_name in blob_names:
        # exists() checks a single blob by name, so no container listing is needed.
        if block_blob_service.exists(container_name, blob_name):
            print("\nBlob '{0}' found!".format(blob_name))
        else:
            print("\nBlob '{0}' NOT found!".format(blob_name))

    end_time = datetime.now()

    print("\n***** Elapsed Time => {0} *****".format(end_time - start_time))

if __name__ == "__main__":
    blob_names = []

    # Exists
    blob_names.append("eula.1028.txt")
    blob_names.append("eula.1031.txt")
    blob_names.append("eula.1033.txt")
    blob_names.append("eula.1036.txt")
    blob_names.append("eula.1040.txt")

    # Don't exist
    blob_names.append("blob1")
    blob_names.append("blob2")
    blob_names.append("blob3")
    blob_names.append("blob4")

    check_if_blob_exists("containername", blob_names)
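Each `exists` call is a separate network round trip, so with many names the lookups may finish sooner if they run concurrently. The following is a sketch only, not part of the original answer: `check_blobs_concurrently` is a hypothetical helper that takes the existence check as a callable (for example `block_blob_service.exists`), and it assumes that callable can safely be invoked from several threads at once, which should be verified for the client in use.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def check_blobs_concurrently(
    exists: Callable[[str, str], bool],
    container_name: str,
    blob_names: Iterable[str],
    max_workers: int = 8,
) -> Dict[str, bool]:
    """Map each blob name to the result of exists(container_name, blob_name)."""
    names = list(blob_names)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `names`.
        results = pool.map(lambda name: exists(container_name, name), names)
        return dict(zip(names, results))


if __name__ == "__main__":
    # Stand-in for a real client; swap in block_blob_service.exists.
    fake_store = {"eula.1028.txt", "eula.1031.txt"}

    def fake_exists(container: str, name: str) -> bool:
        return name in fake_store

    print(check_blobs_concurrently(fake_exists, "containername", ["eula.1028.txt", "blob1"]))
```

The `max_workers` default of 8 is arbitrary; a sensible value depends on the network and on any throttling applied to the storage account.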

Below is a screenshot of a quick test run from my laptop in West US (~150 Mbps download, ~3.22 Mbps upload, per Google Speed Test), checking whether 9 blobs exist in an LRS storage account, also in West US.
