Get only text and images from <div> in Objective-C

Problem description

I'm making a news-reading application. The best site I found was http://fulltextrssfeed.com/. It takes the text and images from any webpage and gives back clean text. As they don't have an API, I need some way to get the data from the <div>. This is the div ID:

<div id="preview">

How can I latch onto the feed and get only its content? (It would be a plus if there are no HTML tags; if there are, I can work around them.)

Solution

I'm not sure I fully understand your question, but if you're using Objective-C, I really recommend Hpple. It's a really good XML/HTML parser.

To use it, add ${SDKROOT}/usr/include/libxml2 to "Header Search Paths" in your project's build settings, and add -lxml2 to "Other Linker Flags".

Then, once you have the Hpple files, drag them into your project: TFHpple.h, TFHpple.m, TFHppleElement.h, TFHppleElement.m, XPathQuery.h, XPathQuery.m.

In your code (to get your "preview" div), add:

// Download the page. This call is synchronous, so keep it off the main thread.
NSURL *url = [NSURL URLWithString:@"http://www.yoursite.com/index.html"];
NSData *htmlData = [NSData dataWithContentsOfURL:url];

TFHpple *xpathParser = [[TFHpple alloc] initWithHTMLData:htmlData];
// Select the <div> whose id is "preview".
NSArray *elements = [xpathParser searchWithXPathQuery:@"//div[@id='preview']"];
if ([elements count] > 0) {
    TFHppleElement *element = [elements objectAtIndex:0];
    NSString *string = [element content];
    NSLog(@"%@", string);
}

// Manual reference counting only; omit this line under ARC.
// htmlData is autoreleased, so it must not be released here.
[xpathParser release];

Now we have the "preview" div with Hpple. To get the text of its child elements (such as p or a), extend the XPath query:

NSArray *elements  = [xpathParser searchWithXPathQuery:@"//div[@id='preview']/p/text()"]; 
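Since the question asks for both text and images, the same approach can be extended along the following lines. This is only a sketch, assuming manual reference counting as in the code above; LogPreviewTextAndImages is a hypothetical helper name, not part of Hpple. It relies on the XPath descendant axis (//text() and //img) and on TFHppleElement's content and objectForKey: methods. What content returns for a given node may vary between Hpple versions, so check the output against your copy.

```objc
#import <Foundation/Foundation.h>
#import "TFHpple.h"
#import "TFHppleElement.h"

// Hypothetical helper (not part of Hpple): logs the plain text and the image
// URLs found inside <div id="preview"> of the given HTML data.
static void LogPreviewTextAndImages(NSData *htmlData) {
    TFHpple *parser = [[TFHpple alloc] initWithHTMLData:htmlData];

    // "//text()" matches text nodes at any depth under the div, so tags such
    // as <p> or <a> are skipped and only their text is collected.
    NSArray *textNodes =
        [parser searchWithXPathQuery:@"//div[@id='preview']//text()"];
    NSMutableString *plainText = [NSMutableString string];
    for (TFHppleElement *node in textNodes) {
        NSString *piece = [node content];
        if (piece != nil) {
            [plainText appendString:piece];
        }
    }
    NSLog(@"Text: %@", plainText);

    // "//img" matches every image inside the div; its URL is the src attribute.
    NSArray *images =
        [parser searchWithXPathQuery:@"//div[@id='preview']//img"];
    for (TFHppleElement *image in images) {
        NSString *src = [image objectForKey:@"src"];
        if (src != nil) {
            NSLog(@"Image: %@", src);
        }
    }

    // Manual reference counting only; omit this line under ARC.
    [parser release];
}
```

Pass it the same htmlData downloaded earlier, for example LogPreviewTextAndImages(htmlData). Image URLs in src may be relative, in which case they need to be resolved against the page URL before loading.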

To understand more, take a look at the XPath syntax. Also check out a tutorial.

Hope it helps.
