Scaling down a large string to a specific paragraph
Question
Hi everyone,
I'm currently working on a Windows Forms project that requires me to get some text from a website and display it within the program.
My public sub is below. It downloads the source of the page in question as a byte array, decodes it to a string, and displays it in a multi-line textbox on the form.
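Public Sub LoadSiteContent(ByVal url As String)
    ' Download the raw page bytes; Using disposes the WebClient afterwards.
    Using client As New WebClient()
        Dim html As Byte() = client.DownloadData(url)
        ' Decode the bytes as UTF-8 and show the full page source.
        Dim webString As String = System.Text.Encoding.UTF8.GetString(html)
        TextBox1.Text = webString
    End Using
End Sub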
This sub gets the entire page source, whereas I only want one specific paragraph from the site. Is there a way to cut the converted string down to just that paragraph? Maybe by using regular expressions or substrings?
I also have this import at the top of my class:
Imports System.Net
Any response is greatly appreciated, thanks.
Recommended answer
This is probably a bit more than you think you wanted, but...
The process is called "web scraping", and there is a nice article about it: "Web Scraping in ASP.NET with Regular Expression Matching and XML Transformation". It's in C#, but the code is easy to translate, and the description is very clear.
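As a rough illustration of the regular-expression route, here is a minimal VB.NET sketch. It is not taken from the article mentioned above. The function name `ExtractParagraphById`, the id value `xyz`, and the sample markup are all made up for the example, and it assumes the paragraph you want is a `<p>` element carrying an `id` attribute.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexSketch
    ' Hypothetical helper: returns the inner HTML of the first <p> element
    ' whose id attribute equals the given value, or Nothing if none is found.
    Function ExtractParagraphById(ByVal html As String, ByVal id As String) As String
        ' Match <p ... id="..." ...> up to the nearest </p>.
        ' Singleline lets "." span line breaks inside the paragraph.
        Dim pattern As String = "<p\b[^>]*\bid\s*=\s*[""']" & Regex.Escape(id) & "[""'][^>]*>(.*?)</p>"
        Dim m As Match = Regex.Match(html, pattern, RegexOptions.IgnoreCase Or RegexOptions.Singleline)
        If m.Success Then
            Return m.Groups(1).Value.Trim()
        End If
        Return Nothing
    End Function

    Sub Main()
        ' Assumed sample markup, standing in for the downloaded page source.
        Dim sample As String = "<html><body><p>Intro</p><p id=""xyz"">The paragraph I want.</p></body></html>"
        Console.WriteLine(ExtractParagraphById(sample, "xyz"))
    End Sub
End Module
```

Regular expressions are fragile against real-world HTML (nested tags, unusual attribute quoting, markup that changes over time), so treat this as a starting point for a page whose structure you know, not a general parser.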
If there is fixed text before and after that paragraph on the web page (for example, a tag with id='xyz'), you can find it in the returned string and then take the required paragraph from between those two points. I have done this in one of my applications, and I hope it will be helpful for you too. Please mark this as the answer if it helped.
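The fixed-marker idea can be sketched with `IndexOf` and `Substring` alone. This is only an assumption about how it might look, not the code from the application mentioned above. `ExtractBetween`, the marker strings, and the sample markup are hypothetical names and values chosen for illustration.

```vbnet
Imports System

Module MarkerSketch
    ' Hypothetical helper: returns the text between the first startMarker
    ' and the next endMarker after it, or Nothing if either is missing.
    Function ExtractBetween(ByVal source As String, ByVal startMarker As String, ByVal endMarker As String) As String
        Dim startIndex As Integer = source.IndexOf(startMarker, StringComparison.OrdinalIgnoreCase)
        If startIndex < 0 Then Return Nothing

        ' Skip past the marker itself so it is not part of the result.
        startIndex += startMarker.Length

        Dim endIndex As Integer = source.IndexOf(endMarker, startIndex, StringComparison.OrdinalIgnoreCase)
        If endIndex < 0 Then Return Nothing

        Return source.Substring(startIndex, endIndex - startIndex).Trim()
    End Function

    Sub Main()
        ' Assumed sample markup, standing in for the downloaded page source.
        Dim sample As String = "<div><p>Intro</p><p id='xyz'>The paragraph I want.</p></div>"
        Console.WriteLine(ExtractBetween(sample, "<p id='xyz'>", "</p>"))
    End Sub
End Module
```

In the sub from the question, the same call could be applied to `webString` before assigning it to `TextBox1.Text`. The start marker has to match the page source character for character, so this only works while the site keeps that markup unchanged.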