Get webpage data + C#

Views: 60 Published: 2019/6/12 1:28:23 C#

Question

Hello all!

I'm trying to get data from a web page, but unfortunately I can't manage it. I've been trying for 2 hours and I can't do it...

I don't want to get the HTML data; every example I have seen describes only that.

Do you have any idea how to get only the plain text from a web page such as http://www.onet.pl? I would like to receive, for instance, "wiadomości, biznes, sport" and more plain text. I'm not interested in the HTML!

I would like to do something like Ctrl+A (select the whole page), copy it into my program, and get the plain copied text from the web page.

Please help me!

Best regards

OK, thanks. Could you tell me how I would programmatically select everything (Ctrl+A, for example) in a web page and copy it to the clipboard in C#?

Recommended answer

You can use the classes System.Net.HttpWebRequest and System.Net.HttpWebResponse; see:
http://msdn.microsoft.com/en-us/library/system.net.webrequest.aspx (some HttpWebRequest usage samples here),
http://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.aspx,
http://msdn.microsoft.com/en-us/library/system.net.httpwebresponse.aspx.

For a complete code sample, you can look at the full code of my application HttpDownloader, which I provided here at CodeProject: "How to download a file from internet".

—SA

Websites are written in HTML.
If you want the text inside the HTML, you have to parse it, for example with Html Agility Pack, which offers an InnerText property on each node that extracts only the text, without any markup.
But keep in mind that layout is also markup: the text-only versions of most websites do not look very good...

The previous solution shows how you can obtain the HTML.
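
To make the Html Agility Pack suggestion concrete, here is a minimal sketch of pulling plain text out of an HTML string. It assumes the HtmlAgilityPack NuGet package is installed; the class name PlainText and the method name FromHtml are made up for illustration, and removing script and style nodes first is my own addition rather than something the answer above prescribes.

```csharp
using System;
using System.Linq;
using System.Net;
using HtmlAgilityPack;

// Hypothetical helper, not part of Html Agility Pack itself.
static class PlainText
{
    // Returns the text content of an HTML string, without tags.
    public static string FromHtml(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Script and style bodies are text nodes too, so drop them first.
        // SelectNodes returns null when nothing matches.
        var hidden = doc.DocumentNode.SelectNodes("//script|//style");
        if (hidden != null)
        {
            foreach (var node in hidden.ToList())
                node.Remove();
        }

        // InnerText concatenates the text of all descendant nodes;
        // HtmlDecode turns entities such as &amp; back into characters.
        string text = WebUtility.HtmlDecode(doc.DocumentNode.InnerText);
        return text.Trim();
    }
}
```

Calling PlainText.FromHtml on the page source returned by an HttpWebRequest should give roughly what Ctrl+A and copy would, although InnerText inserts no separators between elements, so text from adjacent tags can run together.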

using System;
using System.IO;
using System.Net;
using System.Text;

/// <summary>
/// Fetches a web page and prints its HTML source
/// </summary>
class WebFetch
{
	static void Main(string[] args)
	{
		// prepare the request for the web page we will be asking for
		HttpWebRequest request = (HttpWebRequest)
			WebRequest.Create("http://www.mayosoftware.com");

		// execute the request; the using blocks close the response,
		// the stream and the reader when we are done
		using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
		using (Stream resStream = response.GetResponseStream())
		// decode as UTF-8 rather than ASCII so that non-ASCII text
		// (e.g. "wiadomości") survives; assumes the page is UTF-8 encoded
		using (StreamReader reader = new StreamReader(resStream, Encoding.UTF8))
		{
			// read the entire response body as text
			string html = reader.ReadToEnd();

			// print out page source
			Console.WriteLine(html);
		}
	}
}
