Getting the latest file modified from Azure Blob


Problem description

Say I am generating a couple of json files each day in my blob storage. What I want to do is get the most recently modified file across all of my directories. So I'd have something like this in my blob:

2016/01/02/test.json
2016/01/02/test2.json
2016/02/03/test.json

I want to get 2016/02/03/test.json. One way is to take the full path of each file and run a regex check to find the latest directory created, but that doesn't work if I have more than one json file in each dir. Is there anything like File.GetLastWriteTime to get the latest modified file? This is the code I am using to get all the files, btw:

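using System.Collections.Generic;
using System.Linq;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Auth;
using Microsoft.WindowsAzure.Storage.Blob;

public static CloudBlobContainer GetBlobContainer(string accountName, string accountKey, string containerName)
{
    // storage account over HTTPS (second argument: useHttps)
    CloudStorageAccount storageAccount = new CloudStorageAccount(new StorageCredentials(accountName, accountKey), true);
    // blob client
    CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
    // container
    CloudBlobContainer blobContainer = blobClient.GetContainerReference(containerName);
    return blobContainer;
}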

public static IEnumerable<IListBlobItem> GetBlobItems(CloudBlobContainer container)
{
    IEnumerable<IListBlobItem> items = container.ListBlobs(useFlatBlobListing: true);
    return items;
}

public static List<string> GetAllBlobFiles(IEnumerable<IListBlobItem> blobs)
{
    var listOfFileNames = new List<string>();

    foreach (var blob in blobs)
    {
        var blobFileName = blob.Uri.Segments.Last();
        listOfFileNames.Add(blobFileName);
    }
    return listOfFileNames;
}

Recommended answer

Each IListBlobItem is going to be a CloudBlockBlob, a CloudPageBlob, or a CloudBlobDirectory.

After casting to a block or page blob, or to their shared base class CloudBlob (preferably by using the as keyword and checking for null), you can read the modified date via blockBlob.Properties.LastModified.
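As a rough sketch of the cast-and-compare approach described above, the helper below walks the listing once and keeps the blob with the newest LastModified. It assumes the same legacy Microsoft.WindowsAzure.Storage SDK used in the question; the names BlobHelpers and GetLatestModifiedBlob are made up for illustration and are not part of the SDK.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.WindowsAzure.Storage.Blob;

public static class BlobHelpers
{
    // Hypothetical helper: returns the most recently modified blob,
    // or null when the listing contains no blobs at all.
    public static CloudBlob GetLatestModifiedBlob(IEnumerable<IListBlobItem> items)
    {
        CloudBlob latest = null;
        DateTimeOffset latestModified = default(DateTimeOffset);

        foreach (IListBlobItem item in items)
        {
            // Directories are not blobs, so the cast yields null for them.
            var blob = item as CloudBlob;
            if (blob == null)
            {
                continue;
            }

            // LastModified is nullable; treat a missing value as the oldest possible.
            DateTimeOffset modified = blob.Properties.LastModified.GetValueOrDefault();
            if (latest == null || modified > latestModified)
            {
                latest = blob;
                latestModified = modified;
            }
        }

        return latest;
    }
}
```

With the question's own helpers, a call might look like BlobHelpers.GetLatestModifiedBlob(GetBlobItems(container)). The returned blob's Name property should then hold the full path inside the container (for example 2016/02/03/test.json), rather than only the last URI segment.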

Note that your implementation will do an O(n) scan over all blobs in the container, which can take a while if there are hundreds of thousands of files. There's currently no way of doing a more efficient query of blob storage, though (unless you abuse the file naming and encode the date in such a way that newer dates come first alphabetically). Realistically, if you need better query performance, I'd recommend keeping a database table handy that represents all the file listings as rows, with an indexed DateModified column to search by and a column with the blob path for easy access to the file.
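To illustrate the naming trick mentioned in the parenthesis, one possible encoding is to prefix each blob name with an "inverted" timestamp, so that newer uploads sort first. This is only a sketch: BlobNaming and NewestFirstName are hypothetical names, and the scheme records the time chosen at upload, not any later modification of the blob.

```csharp
using System;
using System.Globalization;

public static class BlobNaming
{
    // Hypothetical helper: builds a blob name whose prefix sorts newest-first.
    public static string NewestFirstName(DateTime utcTimestamp, string fileName)
    {
        // Later timestamps give smaller numbers.
        long inverted = DateTime.MaxValue.Ticks - utcTimestamp.Ticks;

        // D19 zero-pads to a fixed width, so ordinal string order matches numeric order.
        return inverted.ToString("D19", CultureInfo.InvariantCulture) + "/" + fileName;
    }
}
```

Because blob listings come back in alphabetical order by name, the first item returned under such a scheme would be the newest one, and the full O(n) scan could be avoided. The trade-off is that the names are no longer human-readable dates like 2016/02/03.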
