如何从C#网站标题 [英] How to get website title from c#

查看:83
本文介绍了如何从C#网站标题的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我重新审视我的索姆旧代码并在方法获取基于其URL的网页的标题都有所涉猎。这不是真的,你会称之为一个稳定的方法,因为它往往未能产生结果,有时甚至会产生不正确的结果。此外,有时它未能显示出一些从标题的字符,因为它们的替代的编码。

I'm revisiting som old code of mine and have stumbled upon a method for getting the title of a website based on its url. It's not really what you would call a stable method as it often fails to produce a result and sometimes even produces incorrect results. Also, sometimes it fails to show some of the characters from the title as they are of an alternative encoding.

有没有人有过旧版本的改进建议?

Does anyone have suggestions for improvements over this old version?

public static string SuggestTitle(string url, int timeout)
{
    WebResponse response = null;
    string line = string.Empty;

    try
    {
        WebRequest request = WebRequest.Create(url);
        request.Timeout = timeout;

        response = request.GetResponse();
        Stream streamReceive = response.GetResponseStream();
        Encoding encoding = System.Text.Encoding.GetEncoding("utf-8");
        StreamReader streamRead = new System.IO.StreamReader(streamReceive, encoding);

        while(streamRead.EndOfStream != true)
        {
            line = streamRead.ReadLine();
            if (line.Contains("<title>"))
            {
                line = line.Split(new char[] { '<', '>' })[2];
                break;
            }
        }
    }
    catch (Exception) { }
    finally
    {
        if (response != null)
        {
            response.Close();
        }
    }

    return line;
}



最后要注意的 - 我想代码的运行速度为好,如它阻止,直到页面被提取,因此,如果我能得到的只有该网站标题,而不是整个页面,这将是巨大的。

One final note - I would like the code to run faster as well, as it is blocking until the page as been fetched, so if I can get only the site header and not the entire page, it would be great.

推荐答案

一个简单的方法来获得内容:

A simpler way to get the content:

WebClient x = new WebClient();
string source = x.DownloadString("http://www.singingeels.com/");

有一个更简单,更可靠的方式获得标题:

A simpler, more reliable way to get the title:

string title = Regex.Match(source, @"\<title\b[^>]*\>\s*(?<Title>[\s\S]*?)\</title\>", RegexOptions.IgnoreCase).Groups["Title"].Value;

这篇关于如何从C#网站标题的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆