Getting description from Google News RSS feed


Question


I need to retrieve the RSS feed from Google News and display the title and description in a WPF app. This was easy to do for BBC News because its description is plain text with no other information, so I could just read the child node text. Google News, however, uses a different format for its description, so it contains a lot of information I don't need. This is the RSS link: http://news.google.co.uk/news?pz=1&cf=all&ned=uk&hl=en&topic=n&output=rss

The description and image link are both in the same body of text. Can anyone tell me the best way to extract them from the rest of the text? I think it has something to do with regular expressions, but I have never used them before. Below is the code I use for reading the feed, if it helps.

try
            {
                // load the xml file
                XmlDocument xmlDoc = new XmlDocument();
                XmlNode nodeRss = null;
                XmlNode nodeChannel = null;
                XmlNode nodeItem = null;
                XmlTextReader xmlReader = new XmlTextReader("http://news.google.co.uk/news?pz=1&cf=all&ned=uk&hl=en&topic=n&output=rss");
                xmlDoc.Load(xmlReader);
                for (int i = 0; i < xmlDoc.ChildNodes.Count; i++)
                {
                    if (xmlDoc.ChildNodes[i].Name == "rss")
                    {

                        // <rss> tag found

                        nodeRss = xmlDoc.ChildNodes[i];

                    }
                }
                for (int i = 0; i < nodeRss.ChildNodes.Count; i++)
                {

                    // If it is the channel tag

                    if (nodeRss.ChildNodes[i].Name == "channel")
                    {

                        // <channel> tag found

                        nodeChannel = nodeRss.ChildNodes[i];
                    }
                }
                string title = null;
                string description = null;
                for (int j = 0; j < nodeChannel.ChildNodes.Count; j++)
                {
                    if (nodeChannel.ChildNodes[j].Name == "item")
                    {
                        nodeItem = nodeChannel.ChildNodes[j];
                        for (int i = 0; i < nodeItem.ChildNodes.Count; i++)
                        {
                            if (nodeItem.ChildNodes[i].Name == "title")
                            {
                                title = nodeItem.ChildNodes[i].InnerText.ToString();
                            }
                            if (nodeItem.ChildNodes[i].Name == "description")
                            {
                                description = nodeItem.ChildNodes[i].InnerText.ToString();
                            }
                        }
                        GoogleItems.Add(new GoogleFeed(title, description));
                    }
                }
                MainNews.Text = GoogleItems[0].Description;
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }






Another news feed I found is http://rss.msnbc.msn.com/id/3032506/device/rss/rss.xml

It looks simpler than the previous link, so any advice on getting the description and image URL out of that feed would be welcome.

Thanks for your help in advance.

Recommended answer


This seems to work for me:

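// Requires: using System.Text.RegularExpressions; using System.Xml; using System.Xml.XPath;

// Removes every <...> tag from the text, including tags that span line breaks.
static string Strip(string text)
{
    return Regex.Replace(text, @"<(.|\n)*?>", String.Empty);
}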

static void Main()
{
    XmlTextReader xmlReader = new XmlTextReader("http://news.google.co.uk/news?pz=1&cf=all&ned=uk&hl=en&topic=n&output=rss");
    XmlDocument xmlDoc = new XmlDocument();
    xmlDoc.Load(xmlReader);

    XPathNavigator navigator = xmlDoc.CreateNavigator();

    string mainTitle = Strip(navigator.SelectSingleNode("rss/channel/image/title").Value);
    string mainUrl = Strip(navigator.SelectSingleNode("rss/channel/image/url").Value);
    string mainLink = Strip(navigator.SelectSingleNode("rss/channel/image/link").Value);

    XPathNodeIterator items = navigator.Select("rss/channel/item");
    while (items.MoveNext())
    {
        XPathNavigator item = items.Current;
        string title = Strip(item.SelectSingleNode("title").Value);
        string category = Strip(item.SelectSingleNode("category").Value);
        string description = Strip(item.SelectSingleNode("description").Value);
    }      
}










Hope this helps,
Fredrik Bornander
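
The question also asks for the image URL, which Strip discards along with the rest of the markup. A minimal sketch, assuming the description is an HTML fragment that contains an <img> tag, could read the URL before the tags are removed. The class name DescriptionParser, its two methods, and the sample markup are hypothetical and are not taken from the actual feed.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class DescriptionParser
{
    // Captures the src attribute of the first <img> tag (single or double quotes).
    static readonly Regex ImgSrc = new Regex(
        @"<img\b[^>]*?\bsrc\s*=\s*[""']([^""']+)[""']",
        RegexOptions.IgnoreCase);

    // Matches any tag; the negated class also matches across line breaks.
    static readonly Regex AnyTag = new Regex(@"<[^>]+>");

    // Returns the first image URL in the fragment, or null if there is none.
    public static string ExtractImageUrl(string html)
    {
        Match m = ImgSrc.Match(html);
        return m.Success ? WebUtility.HtmlDecode(m.Groups[1].Value) : null;
    }

    // Returns the visible text: tags removed, entities decoded, whitespace collapsed.
    public static string ExtractText(string html)
    {
        string noTags = AnyTag.Replace(html, " ");
        string decoded = WebUtility.HtmlDecode(noTags);
        return Regex.Replace(decoded, @"\s+", " ").Trim();
    }

    public static void Main()
    {
        // Hypothetical sample markup, only loosely shaped like a feed description.
        string sample =
            "<table><tr><td><img src=\"//example.com/thumb.jpg\" alt=\"\"></td>" +
            "<td><b>Headline</b><br>Story text &amp; more</td></tr></table>";

        Console.WriteLine(ExtractImageUrl(sample)); // //example.com/thumb.jpg
        Console.WriteLine(ExtractText(sample));     // Headline Story text & more
    }
}
```

Inside the item loop, both methods would be called on item.SelectSingleNode("description").Value before any stripping. Regular expressions are workable for small, predictable fragments like this; if the markup varies a lot, an HTML parser is likely to be more robust.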

