Extract details from a web site
Question
Hi all,
I am new to ASP.NET applications. I want to scrape some details from a web site. I can do this with VBA in MS Excel, but unfortunately my Internet Explorer browser is not working properly.
Hence, I have decided to do the web scraping in an ASP.NET web application.
I want to scrape details from the following HTML code of a website.
<div class="phone-number">
(310) 703-7939
</div>
Here, I want to get the phone number, i.e. (310) 703-7939.
I have used the following code to get this.
protected void btnInnerText_Click(object sender, EventArgs e)
{
    var document = new HtmlDocument();
    document.OptionReadEncoding = false;
    var url =
        new Uri("https://weedmaps.com/dispensaries/california/lax/true-healing-center?c=dispensaries");
    var request = (HttpWebRequest)WebRequest.Create(url);
    request.Method = "GET";
    using (var response = (HttpWebResponse)request.GetResponse())
    {
        using (var stream = response.GetResponseStream())
        {
            document.Load(stream, Encoding.GetEncoding("iso-8859-9"));
        }
    }
    var node = document.DocumentNode.SelectSingleNode("//div[@class='phone-number']");
    if (node != null)
        TextBox1.Text = node.InnerHtml;
    else
        TextBox1.Text = "na";
}
But it gives "na" in TextBox1 instead of the required phone number.
Kindly help me get the correct result.
Note: I have used HtmlAgilityPack.
Thanks in advance.
Recommended answer
Hello,
Here is a simple solution that people don't often mention.
This is the code to make a web request:
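using System;
using System.IO;
using System.Net;

public static class WebConnection
{
    // Downloads the body of the given URL as a string; returns "" if the request fails.
    public static string GetResponce(string url)
    {
        string data = "";
        try
        {
            WebRequest request = WebRequest.Create(url);
            request.Proxy = null; // do not use a proxy for this request
            request.Credentials = CredentialCache.DefaultCredentials;
            // The using blocks close the response, stream and reader even if reading throws.
            using (WebResponse response = request.GetResponse())
            using (Stream dataStream = response.GetResponseStream())
            using (StreamReader reader = new StreamReader(dataStream))
            {
                Console.WriteLine(((HttpWebResponse)response).StatusDescription);
                data = reader.ReadToEnd();
            }
        }
        catch (Exception)
        {
            // Any failure is swallowed here; the caller just gets an empty string.
        }
        return data;
    }
}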
It works like a charm.
Then you call it like this:
string responce = WebConnection.GetResponce("myuri");
//tip here
int i = 0;//This is a breakpoint trick
Set a breakpoint on the int i = 0; line and run the app.
When the app breaks, hover over the variable responce, click the magnifying glass, and copy the text into a Notepad document so you can see exactly what the server returned.
Then use string manipulation to cut the fetched string down to the value you need.
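As a rough illustration of that last step, here is a minimal sketch that pulls the phone number out of a string of HTML with a regular expression. The PhoneExtractor class and ExtractPhone method are hypothetical names made up for this example, and the sample HTML is the snippet from the question. It assumes the div is actually present in the downloaded text; if the page fills it in with JavaScript, it will not be there and the method returns null.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of any library.
public static class PhoneExtractor
{
    // Captures whatever sits between <div class="phone-number"> and </div>.
    // Singleline lets '.' match the newlines around the number.
    private static readonly Regex PhoneDiv = new Regex(
        "<div[^>]*class=\"phone-number\"[^>]*>(.*?)</div>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns the trimmed text of the phone-number div, or null if it is not found.
    public static string ExtractPhone(string html)
    {
        if (string.IsNullOrEmpty(html))
            return null;

        Match match = PhoneDiv.Match(html);
        return match.Success ? match.Groups[1].Value.Trim() : null;
    }

    public static void Main()
    {
        // Sample input copied from the question; in practice this would be
        // the string returned by WebConnection.GetResponce(...).
        string html = "<div class=\"phone-number\">\n(310) 703-7939\n</div>";

        Console.WriteLine(ExtractPhone(html) ?? "na"); // prints "(310) 703-7939"
    }
}
```

A regular expression is fine for one fixed, simple tag like this. For anything more involved, an HTML parser such as HtmlAgilityPack is usually the more robust choice.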
Good luck!