下载嵌入式PDF文件 [英] Download Embedded PDF File
问题描述
问题:如何下载嵌入Excel的PDF文件?
Question: How do I download a PDF file which is embedded in Excel?
这个问题已经问了很多遍了,但是我在任何地方都没有看到一个有效的答案.
This question has been asked so many times but I have not seen a single working answer anywhere.
因此,这是一种尝试自我回答问题的尝试.该代码有效,并且不依赖于不可靠的.Verb Verb:=xlPrimary
方法.
So here is an attempt to self answer the question. This code works and is not dependent on the unreliable .Verb Verb:=xlPrimary
method.
推荐答案
注意:这仅适用于pdf文件.如果混合使用嵌入式文件,则将无法正常工作.
Note: This will only work for pdf files. If there is a mix of embedded files then this will not work.
基本准备工作
-
假设我们的Excel文件
C:\Users\routs\Desktop\Sample.xlsx
嵌入了2个Pdf文件,如下所示.
Let's say our Excel File
C:\Users\routs\Desktop\Sample.xlsx
has 2 Pdf Files embedded as shown below.
出于测试目的,我们将在桌面C:\Users\routs\Desktop\Temp
上创建一个临时文件夹.
For testing purpose, we will create a temp folder on our desktop C:\Users\routs\Desktop\Temp
.
逻辑:
- Excel文件本质上只是一个.zip文件
-
Excel将
oleObjects
保存在\xl\embeddings\
文件夹中.如果将Excel文件重命名为zip并用Winzip格式打开,则可以看到以下内容
- The Excel file is essentially just a .zip file
Excel saves the
oleObjects
in the\xl\embeddings\
folder. If you rename the Excel file to zip and open it in say Winzip, you can see the following
如果解压缩bin文件并将其重命名为pdf,则可以在Microsoft Edge
中打开pdf,但不能在任何其他pdf查看器中打开.为了使其与任何其他pdf查看器兼容,我们将必须进行一些Binary
读取和编辑.
If you extract the bin files and rename it to pdf then you will be able to open the pdf in Microsoft Edge
but not in any other pdf viewer. To make it compatible with any other pdf viewer, we will have to do some Binary
reading and editing.
如果在任何十六进制编辑器中打开bin文件,则将看到以下内容.我使用了在线十六进制编辑器 https://hexed.it/
If you open the bin file in any Hex Editor, you will see the below. I used the online hex editor https://hexed.it/
我们必须删除%PDF
我们将尝试找到%PDF
...的8位无符号值,或更具体地说,找到%
,P
,D
和F
We will try and find the 8 bit unsigned values of %PDF
... Or more specifically of %
, P
, D
and F
如果您在十六进制编辑器中向下滚动,将获得这四个值
If you scroll down in the hex editor, you will get those four values
%
的值
Value of %
P
的值
Value of P
D
的值
Value of D
F
的值
Value of F
现在我们要做的就是读取二进制文件并删除%PDF
之前的所有内容,并以.Pdf
扩展名保存文件.
Now all we have to do is read the binary file and delete everything before %PDF
and save the file with a .Pdf
extention.
代码:
Option Explicit
Const TmpPath As String = "C:\Users\routs\Desktop\Temp"
Const ExcelFile As String = "C:\Users\routs\Desktop\Sample.xlsx"
Const ZipName As String = "C:\Users\routs\Desktop\Sample.zip"
Sub ExtractPDF()
Dim tmpPdf As String
Dim oApp As Object
Dim i As Long
'~~> Deleting any previously created files. This is
'~~> usually helpful from 2nd run onwards
On Error Resume Next
Kill ZipName
Kill TmpPath & "\*.*"
On Error GoTo 0
'~~> Copy and rename the Excel file as zip file
FileCopy ExcelFile, ZipName
Set oApp = CreateObject("Shell.Application")
'~~> Extract the bin file from xl\embeddings\
For i = 1 To oApp.Namespace(ZipName).items.Count
oApp.Namespace(TmpPath).CopyHere oApp.Namespace(ZipName).items.Item("xl\embeddings\oleObject" & i & ".bin")
tmpPdf = TmpPath & "\oleObject" & i & ".bin"
'~~> Read and Edit the Bin File
If Dir(tmpPdf) <> "" Then ReadAndWriteExtractedBinFile tmpPdf
Next i
MsgBox "Done"
End Sub
'~~> Read and ReWrite Bin File
Sub ReadAndWriteExtractedBinFile(s As String)
Dim intFileNum As Long, bytTemp As Byte
Dim MyAr() As Long, NewAr() As Long
Dim fileName As String
Dim i As Long, j As Long, k As Long
j = 1
intFileNum = FreeFile
'~~> Open the bing file
Open s For Binary Access Read As intFileNum
'~~> Get the number of lines in the bin file
Do While Not EOF(intFileNum)
Get intFileNum, , bytTemp
j = j + 1
Loop
'~~> Create an array to store the filtered results of the bin file
'~~> We will use this to recreate the bin file
ReDim MyAr(1 To j)
j = 1
'~~> Go to first record
If EOF(intFileNum) Then Seek intFileNum, 1
'~~> Store the contents of bin file in an array
Do While Not EOF(intFileNum)
Get intFileNum, , bytTemp
MyAr(j) = bytTemp
j = j + 1
Loop
Close intFileNum
'~~> Check for the #PDF and Filter out rest of the data
For i = LBound(MyAr) To UBound(MyAr)
If i = UBound(MyAr) - 4 Then Exit For
If Val(MyAr(i)) = 37 And Val(MyAr(i + 1)) = 80 And _
Val(MyAr(i + 2)) = 68 And Val(MyAr(i + 3)) = 70 Then
ReDim NewAr(1 To j - i + 2)
k = 1
For j = i To UBound(MyAr)
NewAr(k) = MyAr(j)
k = k + 1
Next j
Exit For
End If
Next i
intFileNum = FreeFile
'~~> Decide on the new name of the pdf file
'~~> Format(Now, "ddmmyyhhmmss") This method will awlays ensure that
'~~> you will get a unique filename
fileName = TmpPath & "\" & Format(Now, "ddmmyyhhmmss") & ".pdf"
'~~> Write the new binary file
Open fileName For Binary Lock Read Write As #intFileNum
For i = LBound(NewAr) To UBound(NewAr)
Put #intFileNum, , CByte(NewAr(i))
Next i
Close #intFileNum
End Sub
输出
这篇关于下载嵌入式PDF文件的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!