Get metadata information of any website


Problem description

I want to fetch the meta information of any website, using the website's domain name as input.

For example, the text box contains "www.mywebsite.com", and when I click a button I want to get the meta information for that site.

How can I get it? Please help.

Thanks in advance.

Recommended answer

Look here: C# Station: Fetching Web Pages with HTTP.
You can parse the resulting text as XML, but I suggest using regular expressions to extract what you need.
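
The regular-expression approach suggested above could look roughly like the following. This is a minimal sketch, not a robust HTML parser: the names `MetaTagReader`, `ToUri` and `Extract` are hypothetical, only quoted attribute values are recognized, and only `<meta>` tags carrying a `name` or `property` attribute together with `content` are returned. `WebClient` is assumed here to stay close to the .NET Framework style used elsewhere in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

static class MetaTagReader
{
    // Matches a whole <meta ...> tag; attributes are picked out separately
    // so their order inside the tag does not matter.
    static readonly Regex MetaTag = new Regex(@"<meta\b[^>]*>", RegexOptions.IgnoreCase);

    // Matches attr="value" or attr='value' (unquoted values are not handled).
    static readonly Regex Attribute = new Regex(
        @"([\w:-]+)\s*=\s*(?:""([^""]*)""|'([^']*)')", RegexOptions.IgnoreCase);

    // Turns text box input such as "www.mywebsite.com" into an absolute URL.
    public static Uri ToUri(string input)
    {
        string text = input.Trim();
        if (!text.Contains("://")) text = "http://" + text;
        return new Uri(text);
    }

    // Returns name (or property) -> content for every <meta> tag that has both.
    public static Dictionary<string, string> Extract(string html)
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (Match tag in MetaTag.Matches(html))
        {
            var attrs = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
            foreach (Match a in Attribute.Matches(tag.Value))
                attrs[a.Groups[1].Value] = a.Groups[2].Success ? a.Groups[2].Value : a.Groups[3].Value;

            string key;
            string content;
            if (!attrs.TryGetValue("name", out key) && !attrs.TryGetValue("property", out key)) continue;
            if (attrs.TryGetValue("content", out content)) result[key] = WebUtility.HtmlDecode(content);
        }
        return result;
    }
}

class Program
{
    static void Main(string[] args)
    {
        // The default domain is the placeholder from the question.
        Uri url = MetaTagReader.ToUri(args.Length > 0 ? args[0] : "www.mywebsite.com");
        using (var client = new WebClient())
        {
            string html = client.DownloadString(url);
            foreach (var pair in MetaTagReader.Extract(html))
                Console.WriteLine("{0,20}: {1}", pair.Key, pair.Value);
        }
    }
}
```

In a form with a text box and a button, the button's click handler would pass the text box contents to `ToUri` and display the resulting dictionary instead of writing to the console. For pages with irregular markup, a real HTML parser is likely to be more reliable than regular expressions.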


I assume the term "metadata" refers to the HTTP header information. If so, the following code will help you get started:

namespace ConsoleApplication24
{
    using System;
    using System.Net;

    class Program
    {
        static void Main(string[] args)
        {
            // Build a GET request for the target site.
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://www.Codeproject.com");
            request.Method = "GET";

            // Route the request through the system proxy with the current user's credentials.
            IWebProxy proxy = WebRequest.GetSystemWebProxy();
            proxy.Credentials = CredentialCache.DefaultCredentials;
            request.Proxy = proxy;

            using (WebResponse response = request.GetResponse())
            {
                WebHeaderCollection collection = response.Headers;

                // Print each header name right-aligned in 20 columns, followed by its value(s).
                // Note: string.Concat joins multiple values without any separator.
                Array.ForEach(collection.AllKeys,
                    key =>
                    {
                        Console.WriteLine("{0,20}:{1}", key, string.Concat(collection.GetValues(key)));
                    });
            }
        }
    }
}



This code produces the following output:


    Proxy-Connection:Keep-Alive
          Connection:Keep-Alive
      Content-Length:94210
       Cache-Control:private
        Content-Type:text/html; charset=utf-8
                Date:Sun, 06 May 2012 03:40:03 GMT
          Set-Cookie:SessionGUID=f20a2597-70c8-4a1a-97b9-718fc0c108d9; path=/mguid=39d8dd1e-4189-44d6-9738-9cb90c7dade9; domain=.codeproject.com; expires=Tue05-May-2037 04:00:00 GMT; path=/SessionGUID=f20a2597-70c8-4a1a-97b9-718fc0c108d9; path=/mguid=39d8dd1e-4189-44d6-9738-9cb90c7dade9; domain=.codeproject.com; expires=Tue05-May-2037 04:00:00 GMT; path=/
                 Age:2
Press any key to continue . . .
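
In the output above, the Set-Cookie values run together, which appears to be because `string.Concat` joins the array returned by `GetValues` with no separator. A small sketch of a formatter that keeps the values apart follows; `HeaderFormatter.Format` and the `" | "` separator are hypothetical choices, not part of the original answer.

```csharp
using System;
using System.Net;

static class HeaderFormatter
{
    // Formats one header as "name:value1 | value2", with the name
    // right-aligned in 20 columns like the code above.
    public static string Format(WebHeaderCollection headers, string key)
    {
        // GetValues returns null when the header is absent.
        string[] values = headers.GetValues(key) ?? new string[0];
        return string.Format("{0,20}:{1}", key, string.Join(" | ", values));
    }
}
```

Inside the loop of the original program, `Console.WriteLine(HeaderFormatter.Format(collection, key));` would replace the existing `Console.WriteLine` call.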



Or, if you want to get the meta tags instead, have a look at:

C# Parse Meta Tags

Hope it helps a bit :)

