Open an Azure StorageStreamDownloader without saving it as a file

Views: 76 | Published: 2020/9/16 2:13:44 | Tags: python, azure


Problem description

I need to download a PDF from a blob container in Azure as a download stream (StorageStreamDownloader) and open it in both PDFPlumber and PDFminer. I have everything working when the PDFs are loaded from files, but I can't manage to receive a download stream (StorageStreamDownloader) and open it successfully. This is how I was opening the PDFs:

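import pdfplumber
from pdfminer.pdfparser import PDFParser
from pdfminer.pdfdocument import PDFDocument

pdf = pdfplumber.open(pdfpath)  # for pdfplumber
fp = open('Pdf/' + fileGlob, 'rb')  # for pdfminer
parser = PDFParser(fp)
document = PDFDocument(parser)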

However, I need to be able to work from a download stream instead. This is the code snippet that downloads the PDF as a file:

blob_client = container.get_blob_client(remote_file)
with open(local_file_path, "wb") as local_file:
    download_stream = blob_client.download_blob()
    # readall() returns the whole blob as bytes
    local_file.write(download_stream.readall())
    # no explicit close() needed: the with block closes the file

I tried several options, including a temp file, with no luck. Any ideas?

Recommended answer

download_blob() returns a StorageStreamDownloader object. That class has a download_to_stream() method, which writes the blob's contents into a stream you supply, such as an in-memory BytesIO buffer.

from io import BytesIO

import PyPDF2
from azure.storage.blob import BlobServiceClient

filename = "test.pdf"
container_name = "test"

blob_service_client = BlobServiceClient.from_connection_string("connection string")
container_client = blob_service_client.get_container_client(container_name)
blob_client = container_client.get_blob_client(filename)
streamdownloader = blob_client.download_blob()

# Write the blob into an in-memory buffer instead of a file on disk
stream = BytesIO()
streamdownloader.download_to_stream(stream)
stream.seek(0)  # rewind so readers that start at the current position see the data

# PdfFileReader and numPages are the PyPDF2 1.x/2.x names
# (PyPDF2 3.x uses PdfReader and len(reader.pages) instead)
fileReader = PyPDF2.PdfFileReader(stream)
print(fileReader.numPages)

This is my result: it prints the number of pages in the PDF.
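The answer demonstrates the buffer with PyPDF2, while the question asks about pdfplumber and pdfminer. Both pdfplumber.open() and pdfminer's PDFParser accept a binary file-like object, so the same in-memory approach should carry over. The following is a minimal sketch under that assumption, not a tested recipe for any particular SDK version. The helper names (blob_to_buffer, open_with_pdfplumber, open_with_pdfminer) are hypothetical and not part of any library, and the third-party imports are deferred into the functions that need them.

```python
from io import BytesIO


def blob_to_buffer(downloader):
    """Copy a downloader's content into an in-memory buffer.

    `downloader` is assumed to be a StorageStreamDownloader, or any
    object exposing readall() that returns bytes. Hypothetical helper.
    """
    buffer = BytesIO(downloader.readall())
    buffer.seek(0)  # make the starting position explicit
    return buffer


def open_with_pdfplumber(buffer):
    """Open the buffer with pdfplumber (hypothetical helper)."""
    import pdfplumber  # deferred: only needed when this helper is called

    buffer.seek(0)  # rewind in case an earlier reader moved the position
    return pdfplumber.open(buffer)


def open_with_pdfminer(buffer):
    """Build a pdfminer PDFDocument from the buffer (hypothetical helper)."""
    from pdfminer.pdfparser import PDFParser
    from pdfminer.pdfdocument import PDFDocument

    buffer.seek(0)  # rewind in case an earlier reader moved the position
    return PDFDocument(PDFParser(buffer))
```

With the blob_client from the answer, usage would look like buffer = blob_to_buffer(blob_client.download_blob()), followed by open_with_pdfplumber(buffer) and open_with_pdfminer(buffer). Rewinding before each reader matters because the two libraries share one buffer, and whichever runs first may leave the position somewhere other than the start.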
