Get only text and images from <div> in Objective C
I'm making a news reading application. The best site I found was http://fulltextrssfeed.com/
It takes the text and images from any webpage and gives back clean text. As they don't have an API, I need some way to get the data from the <div>.
This is the div ID:
<div id="preview">
How can I latch onto the feed and get only its content? (It would be a plus if there were no HTML tags; if there are, I can work around that.)
I'm not sure I fully understand your question, but if you're using Objective-C, I really recommend Hpple. It's a really good XML/HTML parser.
To use it, add ${SDKROOT}/usr/include/libxml2 to "Header Search Paths" in your project's build settings, and add -lxml2 to "Other Linker Flags".
Then, once you have the Hpple files, drag them into your project: TFHpple.h, TFHpple.m, TFHppleElement.h, TFHppleElement.m, XPathQuery.h, XPathQuery.m.
In the code (to get your "preview" div), add:
NSURL *url = [NSURL URLWithString:@"http://www.yoursite.com/index.html"];
// Synchronous fetch; the returned NSData is autoreleased, so do not release it.
NSData *htmlData = [NSData dataWithContentsOfURL:url];
TFHpple *xpathParser = [[TFHpple alloc] initWithHTMLData:htmlData];
// XPath query that selects the div whose id is "preview".
NSArray *elements = [xpathParser searchWithXPathQuery:@"//div[@id='preview']"];
if ([elements count] > 0) { // guard against a page without the div
    TFHppleElement *element = [elements objectAtIndex:0];
    NSString *string = [element content];
    NSLog(@"%@", string);
}
[xpathParser release]; // manual reference counting only; omit under ARC
Now we have the "preview" div with Hpple. To get its child elements (such as p or a), use:
NSArray *elements = [xpathParser searchWithXPathQuery:@"//div[@id='preview']/p/text()"];
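Since the question asks for both text and images, here is a minimal sketch of how the two queries might be combined. The helper name PreviewContentFromHTMLData is hypothetical (it is not part of Hpple), and it assumes the images are ordinary img elements with a src attribute somewhere inside the "preview" div, and that the Hpple files listed above are already in the project.

```objc
#import <Foundation/Foundation.h>
#import "TFHpple.h"
#import "TFHppleElement.h"

// Hypothetical helper (not part of Hpple): collects paragraph text and image
// URLs from the div with id "preview" in the given HTML data.
static NSDictionary *PreviewContentFromHTMLData(NSData *htmlData) {
    TFHpple *parser = [[TFHpple alloc] initWithHTMLData:htmlData];

    // Text nodes that are direct children of <p> elements inside the div.
    NSMutableArray *paragraphs = [NSMutableArray array];
    NSArray *textNodes = [parser searchWithXPathQuery:@"//div[@id='preview']/p/text()"];
    for (TFHppleElement *node in textNodes) {
        NSString *text = [node content];
        if ([text length] > 0) {
            [paragraphs addObject:text];
        }
    }

    // Every <img> anywhere inside the div; objectForKey: reads an attribute.
    NSMutableArray *imageURLs = [NSMutableArray array];
    NSArray *imageNodes = [parser searchWithXPathQuery:@"//div[@id='preview']//img"];
    for (TFHppleElement *img in imageNodes) {
        NSString *src = [img objectForKey:@"src"];
        if (src != nil) {
            [imageURLs addObject:src];
        }
    }

#if !__has_feature(objc_arc)
    [parser release]; // only needed with manual reference counting
#endif
    return @{ @"paragraphs": paragraphs, @"images": imageURLs };
}
```

Calling it with the htmlData fetched earlier gives a dictionary with a "paragraphs" array and an "images" array. Text nested inside inline tags within a paragraph (for example a link) is not matched by /p/text(), so the query may need adjusting for the real page.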
To understand more, take a look at the XPath syntax. Also check a tutorial.
Hope it helps.