How to improve performance of downloading a large Azure blob file over a stream?
Problem description
I have a JSON blob file of around 212 MB.
Locally, while debugging, it takes around 15 minutes to download.
When I deploy the code to Azure App Service, it runs for 10 minutes and then fails with the following error (locally it fails intermittently with the same error):
Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature
Code Attempt 1:
// Create a SAS token for referencing the file, valid for 15 minutes
SharedAccessBlobPolicy sasConstraints = new SharedAccessBlobPolicy
{
SharedAccessExpiryTime = DateTime.UtcNow.AddMinutes(15),
Permissions = SharedAccessBlobPermissions.Read
};
var blob = cloudBlobContainer.GetBlockBlobReference(blobFilePath);
string sasContainerToken = blob.GetSharedAccessSignature(sasConstraints);
var cloudBlockBlob = new CloudBlockBlob(new Uri(blob.Uri + sasContainerToken));
using (var stream = new MemoryStream())
{
await cloudBlockBlob.DownloadToStreamAsync(stream);
//resetting stream's position to 0
stream.Position = 0;
var serializer = new JsonSerializer();
using (var sr = new StreamReader(stream))
{
using (var jsonTextReader = new JsonTextReader(sr))
{
jsonTextReader.SupportMultipleContent = true;
result = new List<T>();
while (jsonTextReader.Read())
{
result.Add(serializer.Deserialize<T>(jsonTextReader));
}
}
}
}
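Attempt 1 buffers the whole blob into a MemoryStream before any parsing starts. A smaller variation, sketched below on the assumption that the same Microsoft.Azure.Storage.Blob SDK and Newtonsoft.Json types are in scope, is to deserialize straight off the blob's read stream via OpenReadAsync, so parsing overlaps with the download instead of waiting for it to finish:

```csharp
// Sketch: stream the JSON directly from blob storage instead of
// buffering the full 212 MB into memory first.
// Assumes the same cloudBlockBlob, JsonSerializer and List<T> as above.
var result = new List<T>();
var serializer = new JsonSerializer();
using (var blobStream = await cloudBlockBlob.OpenReadAsync())
using (var sr = new StreamReader(blobStream))
using (var jsonTextReader = new JsonTextReader(sr))
{
    jsonTextReader.SupportMultipleContent = true;
    while (jsonTextReader.Read())
    {
        result.Add(serializer.Deserialize<T>(jsonTextReader));
    }
}
```

This also halves peak memory use, since the payload is no longer held both in the MemoryStream and in the deserialized list.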
Code Attempt 2: I tried using DownloadRangeToStreamAsync to download the blob in chunks, but nothing changed:
int bufferLength = 1 * 1024 * 1024;//1 MB chunk
long blobRemainingLength = blob.Properties.Length;
long offset = 0;
do
{
    long chunkLength = Math.Min(bufferLength, blobRemainingLength);
    using (var ms = new MemoryStream())
    {
        // download the range starting at the current offset
        await blob.DownloadRangeToStreamAsync(ms, offset, chunkLength);
        ms.Position = 0;
        lock (outPutStream)
        {
            outPutStream.Position = offset;
            var bytes = ms.ToArray();
            outPutStream.Write(bytes, 0, bytes.Length);
        }
    }
    // advance the bookkeeping only after the range has been written
    offset += chunkLength;
    blobRemainingLength -= chunkLength;
}
while (blobRemainingLength > 0);
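One subtlety in a loop like this is the order of the offset bookkeeping: if offset is advanced before DownloadRangeToStreamAsync runs, the first chunk is skipped and the final request reads past the end of the blob. The range arithmetic can be isolated into a small pure helper (the name ChunkPlanner is mine, not part of any SDK), which also makes it easy to hand the ranges to parallel download tasks later:

```csharp
using System;
using System.Collections.Generic;

static class ChunkPlanner
{
    // Returns the (offset, length) pairs that cover totalLength bytes
    // in chunks of at most chunkSize bytes, in order.
    public static List<(long Offset, long Length)> GetRanges(long totalLength, long chunkSize)
    {
        var ranges = new List<(long Offset, long Length)>();
        long offset = 0;
        while (offset < totalLength)
        {
            long length = Math.Min(chunkSize, totalLength - offset);
            ranges.Add((offset, length));
            offset += length;
        }
        return ranges;
    }
}
```

For example, a 5-byte blob split into 2-byte chunks yields the ranges (0,2), (2,2), (4,1).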
I don't think 212 MB is a large JSON file. Can you please suggest a solution?
I suggest you give the Azure Storage Data Movement Library a try.
I tested with a larger file of 220 MB; it took about 5 minutes to download it into memory.
The sample code:
SharedAccessBlobPolicy sasConstraints = new SharedAccessBlobPolicy
{
SharedAccessExpiryTime = DateTime.UtcNow.AddMinutes(15),
Permissions = SharedAccessBlobPermissions.Read
};
CloudBlockBlob blob = blobContainer.GetBlockBlobReference("t100.txt");
string sasContainerToken = blob.GetSharedAccessSignature(sasConstraints);
var cloudBlockBlob = new CloudBlockBlob(new Uri(blob.Uri + sasContainerToken));
var stream = new MemoryStream();
//set this value as per your need
TransferManager.Configurations.ParallelOperations = 5;
Console.WriteLine("begin to download...");
//use Stopwatch to calculate the time
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
DownloadOptions options = new DownloadOptions();
options.DisableContentMD5Validation = true;
//these lines just report download progress; you can remove them in your code
SingleTransferContext context = new SingleTransferContext();
context.ProgressHandler = new Progress<TransferStatus>((progress) =>
{
Console.WriteLine("Bytes downloaded: {0}", progress.BytesTransferred);
});
var task = TransferManager.DownloadAsync(cloudBlockBlob, stream, options, context);
task.Wait();
stopwatch.Stop();
Console.WriteLine("the length of the stream is: " + stream.Length);
Console.WriteLine("the time taken (ms): " + stopwatch.ElapsedMilliseconds);
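If you can move to the newer Azure.Storage.Blobs (v12) SDK, parallel range downloads are built into the client itself. A minimal sketch, where connectionString, the container name, and the blob name are placeholders and the StorageTransferOptions values are just a starting point to tune:

```csharp
// Sketch using the Azure.Storage.Blobs v12 SDK.
using System.IO;
using Azure.Storage;
using Azure.Storage.Blobs;

var blobClient = new BlobClient(connectionString, "my-container", "my-blob.json");
var transferOptions = new StorageTransferOptions
{
    MaximumConcurrency = 5,                 // parallel range requests
    MaximumTransferSize = 4 * 1024 * 1024   // 4 MB per range
};
using (var stream = new MemoryStream())
{
    await blobClient.DownloadToAsync(stream, transferOptions: transferOptions);
}
```

DownloadToAsync splits the blob into ranges and fetches them concurrently, much like the Data Movement Library does here.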
The test result: (screenshot omitted; it showed the 220 MB download completing in about 5 minutes)