WebBrowser,处理 pdf 加载完成 [英] WebBrowser, handle pdf load complete
问题描述
我想知道是否有人知道一种让 .pdf 文件在加载时触发就绪状态的简单方法.我正在构建一个程序来打开 url 并截取屏幕截图,然后将它们放入 excel 中.
I was wondering if anyone knew an easy way to have .pdf files trigger the readystate when loaded. I'm building a program to open url's and take screenshots, then put them in excel.
Web 浏览器将正确加载 html 文档,但在加载 .pdf
文件时卡在 While Not pageready
.浏览器控件正确呈现 .pdf
.
The web browser will load html documents correctly, but gets stuck in While Not pageready
when loading .pdf
files. The browser control correctly renders the .pdf
.
Private Sub btngo_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btngo.Click
Dim file As String
Dim Obj As New Object
Dim result As String
Dim sheet As String = "sheet1"
Dim xlApp As New Excel.Application
If lblpath.Text <> "" Then
file = lblpath.Text
Dim xlWorkBook = xlApp.Workbooks.Open(file)
Dim xlWorkSheet = xlWorkBook.Worksheets(sheet)
Dim range = xlWorkSheet.UsedRange
ProgressBar1.Value = 0
For rCnt = 4 To range.Rows.Count
'url cell
Obj = CType(range.Cells(rCnt, 2), Excel.Range)
' Obj.value now contains the value in the cell..
Try
' Creates an HttpWebRequest with the specified URL.
Dim myHttpWebRequest As HttpWebRequest = CType(WebRequest.Create(Obj.value), HttpWebRequest)
' Sends the request and waits for a response.
Dim myHttpWebResponse As HttpWebResponse = CType(myHttpWebRequest.GetResponse(), HttpWebResponse)
If myHttpWebResponse.StatusCode = HttpStatusCode.OK Then
result = myHttpWebResponse.StatusCode
WebBrowser1.ScrollBarsEnabled = False
WebBrowser1.Navigate(myHttpWebRequest.RequestUri)
WaitForPageLoad()
CaptureWebBrowser(WebBrowser1)
End If
' Release the resources of the response.
myHttpWebResponse.Close()
Catch ex As WebException
result = (ex.Message)
Catch ex As Exception
result = (ex.Message)
End Try
RichTextBox1.AppendText(result & " " & Obj.value & vbNewLine)
If radpre.Checked = True Then
range.Cells(rCnt, 3).value = result
ElseIf radcob.Checked = True Then
range.Cells(rCnt, 4).value = result
ElseIf radpost.Checked = True Then
range.Cells(rCnt, 5).value = result
End If
ProgressBar1.Value = rCnt / range.Rows.Count * 100
Next
With xlApp
.DisplayAlerts = False
xlWorkBook.SaveAs(lblpath.Text.ToString)
.DisplayAlerts = True
End With
xlWorkBook.Close()
xlApp.Quit()
'reclaim memory
Marshal.ReleaseComObject(xlApp)
xlApp = Nothing
End If
End Sub
Private Function CaptureWebBrowser(ByVal wb As WebBrowser) As Image
Try
Dim hBitmap As Bitmap = New Bitmap(wb.Width, wb.Height)
wb.DrawToBitmap(hBitmap, wb.Bounds)
Dim img As Image = hBitmap
Return img
Catch ex As Exception
MessageBox.Show(ex.Message)
End Try
Return Nothing
End Function
Private Sub WaitForPageLoad()
AddHandler WebBrowser1.DocumentCompleted, New WebBrowserDocumentCompletedEventHandler(AddressOf PageWaiter)
While Not pageready
Application.DoEvents()
End While
pageready = False
End Sub
Private Sub PageWaiter(ByVal sender As Object, ByVal e As WebBrowserDocumentCompletedEventArgs)
If WebBrowser1.ReadyState = WebBrowserReadyState.Complete Then
pageready = True
RemoveHandler WebBrowser1.DocumentCompleted, New WebBrowserDocumentCompletedEventHandler(AddressOf PageWaiter)
End If
End Sub
<小时>
更新到已解决
<小时>我对反馈很满意.我真的很喜欢 Noseratio 提供的答案.我不知道在最佳实践中使用代码模式.打开 .pdf 或任何其他非基于 Web 的文档时,readyState
永远不会从 0
改变.看到这个程序只是让我在工作中不工作的一种方式,我对只捕获 .html
和 .htm
感到满意.
update to resolved
I'm very happy with the feedback. I really like like the answer Noseratio provided. I was not aware using the code pattern as not in best practices. When opening a .pdf or any other document not web based readyState
will never change from 0
. Seeing how this program is simply a way for me not to work at work, I'm satisfied with only capturing .html
and .htm
.
我的要求是
- 打开excel文档
- 解析excel文档中的链接
- 确定响应代码
- 编写响应代码,并在可能的情况下截图以提高性能
该程序解析和检索反馈的速度比我手动执行的速度要快得多..html
和 .htm
的屏幕截图提供了从生产环境成功迁移到 COB 并返回生产环境的 excel 文件证明的非技术查看器.
The program parses and retrieves feedback far faster then I would be able to do manually. Screenshots of .html
and .htm
provide non-technical viewers of the excel file proof of successful migration from production to COB, and back to production environments.
Noseratio 所述的此代码没有遵循最佳实践,也不是高质量的.这是一个快速而肮脏的实现.
This code as stated by Noseratio does not follow best practices, nor is it high quality. This is a quick and dirty implementation.
Option Infer On
Imports Microsoft.Office.Interop
Imports System.Net
Imports System.Runtime.InteropServices
Public Class Form1
Public Property pageready As Boolean
Private Sub OpenToolStripMenuItem_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles OpenToolStripMenuItem.Click
OpenFileDialog1.ShowDialog()
End Sub
Private Sub OpenFileDialog1_FileOk(ByVal sender As System.Object, ByVal e As System.ComponentModel.CancelEventArgs) Handles OpenFileDialog1.FileOk
lblpath.Text = OpenFileDialog1.FileName.ToString
End Sub
Private Sub btngo_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btngo.Click
Dim file As String
Dim Obj As New Object
Dim result As String
Dim sheet As String = "sheet1"
Dim xlApp As New Excel.Application
Dim img As Bitmap
Dim path As String = "C:\Documents and Settings\user\My Documents\Visual Studio 2010\Projects\COB-HTML-Tool\COB-HTML-Tool\bin\Debug\tmp.bmp"
If lblpath.Text <> "" Then
file = lblpath.Text
Dim xlWorkBook = xlApp.Workbooks.Open(file)
Dim xlWorkSheet = xlWorkBook.Worksheets(sheet)
Dim range = xlWorkSheet.UsedRange
ProgressBar1.Value = 0
For rCnt = 4 To range.Rows.Count
'url cell
Obj = CType(range.Cells(rCnt, 2), Excel.Range)
' Obj.value now contains the value in the cell..
Try
' Creates an HttpWebRequest with the specified URL.
Dim myHttpWebRequest As HttpWebRequest = CType(WebRequest.Create(Obj.value), HttpWebRequest)
' Sends the request and waits for a response.
Dim myHttpWebResponse As HttpWebResponse = CType(myHttpWebRequest.GetResponse(), HttpWebResponse)
If myHttpWebResponse.StatusCode = HttpStatusCode.OK Then
result = myHttpWebResponse.StatusCode
Dim len As Integer = myHttpWebRequest.RequestUri.ToString.Length - 4
If myHttpWebRequest.RequestUri.ToString.Substring(len) = ".htm" Or
myHttpWebRequest.RequestUri.ToString.Substring(len - 1) = ".html" Or
myHttpWebRequest.RequestUri.ToString.Substring(len) = ".asp" Then
WebBrowser1.Navigate(myHttpWebRequest.RequestUri)
WaitForPageLoad()
img = CaptureWebBrowser(WebBrowser1)
img.Save(path)
End If
End If
' Release the resources of the response.
myHttpWebResponse.Close()
Catch ex As WebException
result = (ex.Message)
Catch ex As Exception
result = (ex.Message)
End Try
RichTextBox1.AppendText(result & " " & Obj.value & vbNewLine)
If radpre.Checked = True Then
range.Cells(rCnt, 3).value = result
If img Is Nothing Then
Else
If Dir(path) <> "" Then
range.Cells(rCnt, 4).Select()
Dim opicture As Object
opicture = xlApp.ActiveSheet.Pictures.Insert(path)
opicture.ShapeRange.LockAspectRatio = True
opicture.ShapeRange.width = 170
opicture.ShapeRange.height = 170
My.Computer.FileSystem.DeleteFile(path)
End If
End If
ElseIf radcob.Checked = True Then
range.Cells(rCnt, 5).value = result
If img Is Nothing Then
Else
If Dir(path) <> "" Then
range.Cells(rCnt, 6).Select()
Dim opicture As Object
opicture = xlApp.ActiveSheet.Pictures.Insert(path)
opicture.ShapeRange.LockAspectRatio = True
opicture.ShapeRange.width = 170
opicture.ShapeRange.height = 170
My.Computer.FileSystem.DeleteFile(path)
End If
End If
ElseIf radpost.Checked = True Then
range.Cells(rCnt, 7).value = result
If img Is Nothing Then
Else
If Dir(path) <> "" Then
range.Cells(rCnt, 8).Select()
Dim opicture As Object
opicture = xlApp.ActiveSheet.Pictures.Insert(path)
opicture.ShapeRange.LockAspectRatio = True
opicture.ShapeRange.width = 170
opicture.ShapeRange.height = 170
My.Computer.FileSystem.DeleteFile(path)
End If
End If
End If
ProgressBar1.Value = rCnt / range.Rows.Count * 100
Next
With xlApp
.DisplayAlerts = False
xlWorkBook.SaveAs(lblpath.Text.ToString)
.DisplayAlerts = True
End With
xlWorkBook.Close()
xlApp.Quit()
'reclaim memory
Marshal.ReleaseComObject(xlApp)
xlApp = Nothing
End If
End Sub
Private Function CaptureWebBrowser(ByVal wb As WebBrowser) As Image
Try
wb.ScrollBarsEnabled = False
Dim hBitmap As Bitmap = New Bitmap(wb.Width, wb.Height)
wb.DrawToBitmap(hBitmap, wb.Bounds)
Dim img As Image = hBitmap
Return img
Catch ex As Exception
MessageBox.Show(ex.Message)
End Try
Return Nothing
End Function
Private Sub WaitForPageLoad()
AddHandler WebBrowser1.DocumentCompleted, New WebBrowserDocumentCompletedEventHandler(AddressOf PageWaiter)
While Not pageready
Application.DoEvents()
System.Threading.Thread.Sleep(200)
End While
pageready = False
End Sub
Private Sub PageWaiter(ByVal sender As Object, ByVal e As WebBrowserDocumentCompletedEventArgs)
If WebBrowser1.ReadyState = WebBrowserReadyState.Complete Then
pageready = True
RemoveHandler WebBrowser1.DocumentCompleted, New WebBrowserDocumentCompletedEventHandler(AddressOf PageWaiter)
End If
End Sub
End Class
推荐答案
很遗憾,您将无法使用 webBrowser.DrawToBitmap
获取 PDF 视图的快照.在撰写本文时,Adobe Acrobat Reader ActiveX 控件不支持在自定义设备上下文上呈现,因此该方法不起作用,以及发送 WM_PRINT
或通过 WebBrowser 直接在 Reader ActiveX 对象上调用 IViewObject::Draw
(我试过了,我并不孤单).正确的解决方案是使用第 3 方 PDF 渲染组件.
Unfortunately, you won't be able to use webBrowser.DrawToBitmap
to get a snapshot of the PDF view. At the time of writing this, Adobe Acrobat Reader ActiveX control doesn't support rendering on a custom device context, so this method won't work, as well as sending WM_PRINT
or calling IViewObject::Draw
, either directly on the Reader ActiveX object on via WebBrowser (I tried that, and I'm not alone). The proper solution would be to use a 3rd party PDF rendering component.
顺便说一句,你应该避免使用这样的代码模式:
On a side note, you should avoid using code pattern like this:
While Not pageready
Application.DoEvents()
End While
这是一个忙等待紧密循环,徒劳地消耗CPU周期.至少,将一些 Thread.Sleep(200)
放入循环中,但总的来说你应该避免 也使用 Application.DoEvents.
It's a busy waiting tight loop, consuming CPU cycles in vain. At least, put some Thread.Sleep(200)
inside the loop, but overall you should avoid using Application.DoEvents too.
这篇关于WebBrowser,处理 pdf 加载完成的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!