Open an Azure StorageStreamDownloader without saving it as a file
Problem description
I need to download a PDF from a blob container in Azure as a download stream (StorageStreamDownloader) and open it in both pdfplumber and pdfminer. I have everything working when the PDFs are loaded from files, but I can't manage to receive a download stream (StorageStreamDownloader) and open it successfully. I was opening the PDFs like this:
pdf = pdfplumber.open(pdfpath)  # for pdfplumber
fp = open('Pdf/' + fileGlob, 'rb')  # for pdfminer
parser = PDFParser(fp)
document = PDFDocument(parser)
However, I need to be able to work from a downloaded stream instead. This is the code snippet that downloads the PDF as a file:
blob_client = container.get_blob_client(remote_file)
# The with block closes the file on exit, so no explicit close() is needed
with open(local_file_path, "wb") as local_file:
    download_stream = blob_client.download_blob()
    local_file.write(download_stream.readall())
I tried several options, even using a temp file, with no luck. Any ideas?
Recommended answer
download_blob() downloads the blob into a StorageStreamDownloader object. That class has a download_to_stream method, which writes the blob's content into a stream that you supply, so you can work with the blob without saving it to disk.
from io import BytesIO

import PyPDF2
from azure.storage.blob import BlobServiceClient

filename = "test.pdf"
container_name = "test"

blob_service_client = BlobServiceClient.from_connection_string("connection string")
container_client = blob_service_client.get_container_client(container_name)
blob_client = container_client.get_blob_client(filename)

# download_blob() returns a StorageStreamDownloader; copy its content into memory
streamdownloader = blob_client.download_blob()
stream = BytesIO()
streamdownloader.download_to_stream(stream)

# PdfFileReader and numPages are the PyPDF2 1.x/2.x names; PyPDF2 3.x uses
# PdfReader(stream) and len(reader.pages) instead
fileReader = PyPDF2.PdfFileReader(stream)
print(fileReader.numPages)
This is my result: it prints the number of pages in the PDF.
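The question asks about pdfplumber and pdfminer rather than PyPDF2, so here is a minimal sketch of the same in-memory approach for those two libraries. It assumes that pdfplumber and pdfminer.six are installed and that both accept a seekable binary file-like object in place of a path. The helper names (to_seekable_stream, open_with_pdfplumber, open_with_pdfminer) are hypothetical and exist only for illustration. The sketch uses readall(), as in the question's own snippet, instead of download_to_stream.

```python
from io import BytesIO


def to_seekable_stream(downloader):
    """Copy a downloader's full content into a rewound in-memory stream.

    `downloader` is assumed to behave like the object returned by
    blob_client.download_blob(), i.e. to offer readall() returning bytes.
    """
    stream = BytesIO(downloader.readall())
    stream.seek(0)  # already at 0 for a fresh BytesIO; kept explicit for clarity
    return stream


def open_with_pdfplumber(stream):
    # Assumption: pdfplumber is installed and open() takes a file-like object.
    import pdfplumber

    return pdfplumber.open(stream)


def open_with_pdfminer(stream):
    # Assumption: pdfminer.six is installed.
    from pdfminer.pdfdocument import PDFDocument
    from pdfminer.pdfparser import PDFParser

    parser = PDFParser(stream)
    return PDFDocument(parser)
```

With a real blob client this would be used as stream = to_seekable_stream(blob_client.download_blob()), followed by open_with_pdfplumber(stream) or open_with_pdfminer(stream). If both libraries need to read the same PDF, it is probably safest to give each its own BytesIO, or to call stream.seek(0) between uses, because each parser moves the read position.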