Streaming NSXMLParser with NSInputStream
Question
Update:
When you create an NSXMLParser with initWithContentsOfURL:, it does not parse the XML feed as it downloads. Instead, it appears to load the entire XML file into memory and only then start parsing. That is a problem for large feeds: it uses an excessive amount of RAM, and it is inherently inefficient because parsing starts only after the download finishes instead of running in parallel with it.
Has anyone discovered how to parse the feed with NSXMLParser as it is being streamed to the device? Yes, you can use LibXML2 (as discussed below), but it seems like it should be possible with NSXMLParser. So far it's eluding me.
Original question:
I was wrestling with using NSXMLParser to read XML from a web stream. The initWithContentsOfURL: interface may lead one to infer that it streams the XML from the web, but it doesn't seem to do so; rather, it appears to load the entire XML file before any parsing takes place. For modest-sized XML files that's fine, but for really large ones it's problematic.
I have seen discussions of using NSXMLParser's initWithStream: with a customized NSInputStream that streams from the web. For example, some answers suggest using something like the CFStreamCreateBoundPair referred to in a Cocoa Builder post and in the discussion of Setting Up Socket Streams in the Apple Stream Programming Guide, but I have not gotten it to work. I even tried writing my own NSInputStream subclass that used an NSURLConnection (which is, itself, pretty good at streaming), but I wasn't able to get it to work with NSXMLParser.
In the end, I decided to use LibXML2 (http://www.xmlsoft.org/) rather than NSXMLParser, as demonstrated in the Apple XMLPerformance sample, but I was wondering whether anyone has had any luck getting streaming from a web source to work with NSXMLParser. I've seen plenty of "theoretically you could do x" answers, suggesting everything from CFStreamCreateBoundPair to grabbing the HTTPBodyStream from NSURLRequest, but I've yet to come across a working demonstration of streaming with NSXMLParser.
The Ray Wenderlich article How To Choose The Best XML Parser for Your iPhone Project seems to confirm that NSXMLParser is not well suited to large XML files. Still, with all of the posts about possible NSXMLParser-based workarounds for streaming really large XML files, I'm surprised I have yet to find a working demonstration. Does anyone know of a functioning NSXMLParser implementation that streams from the web? Clearly, I can just stick with LibXML2 or some other equivalent XML parser, but the notion of streaming with NSXMLParser seems tantalizingly close.
Recommended answer
-[NSXMLParser initWithStream:] is the only interface to NSXMLParser that currently performs a streaming parse of the data. Hooking it up to an asynchronous NSURLConnection that provides data incrementally is unwieldy because NSXMLParser takes a blocking, "pull"-based approach to reading from the NSInputStream. That is, -[NSXMLParser parse] does something like the following when dealing with an NSInputStream:
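    while (1) {
        // Blocks until the stream can supply bytes.
        NSInteger length = [stream read:buffer maxLength:maxLength];
        if (length <= 0)
            break; // 0 means end of stream, a negative value means error
        // Parse data …
    }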
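To see the blocking call pattern in isolation, without a network, the parser can be handed any NSInputStream. The following is a minimal sketch, assuming ARC (compile with -fobjc-arc); ElementCounter and CountElementsInStream are hypothetical names made up for this illustration and are not part of the original answer.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical delegate for this sketch: counts every start tag it sees.
@interface ElementCounter : NSObject <NSXMLParserDelegate>
@property (nonatomic) NSUInteger count;
@end

@implementation ElementCounter
- (void)parser:(NSXMLParser *)parser
didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI
 qualifiedName:(NSString *)qualifiedName
    attributes:(NSDictionary *)attributeDict
{
    self.count += 1;
}
@end

// Hypothetical helper: parses everything `stream` yields and returns the
// number of elements seen, or -1 if the parser reports a failure.
static NSInteger CountElementsInStream(NSInputStream *stream)
{
    NSXMLParser *parser = [[NSXMLParser alloc] initWithStream:stream];
    ElementCounter *counter = [[ElementCounter alloc] init];
    parser.delegate = counter; // the local variable keeps the delegate alive

    // -parse opens the stream itself and does not return until the stream
    // is exhausted or parsing fails, so the calling thread is blocked.
    BOOL succeeded = [parser parse];
    return succeeded ? (NSInteger)counter.count : -1;
}
```

Calling it as CountElementsInStream([NSInputStream inputStreamWithData:xmlData]) exercises the same pull loop with an in-memory stream. Because the call blocks for as long as the stream takes to deliver its bytes, a network-backed stream has to be fed from a different thread or queue than the one running the parser.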
In order to provide data to this parser incrementally, a custom NSInputStream subclass is needed that funnels the data received by the NSURLConnectionDelegate calls on a background queue or run loop over to the -read:maxLength: call that NSXMLParser is waiting on.
A proof-of-concept implementation follows:
    #import <Foundation/Foundation.h>

    // Proof of concept, written for manual reference counting (no ARC).
    @interface ReceivedDataStream : NSInputStream <NSURLConnectionDelegate>
    @property (retain) NSURLConnection *connection;
    @property (retain) NSMutableArray *bufferedData;
    @property (assign, getter=isFinished) BOOL finished;
    @property (retain) dispatch_semaphore_t semaphore;
    - (id)initWithContentsOfURL:(NSURL *)url;
    @end

    @implementation ReceivedDataStream

    - (id)initWithContentsOfURL:(NSURL *)url
    {
        if (!(self = [super init]))
            return nil;

        NSURLRequest *request = [NSURLRequest requestWithURL:url];
        self.connection = [[[NSURLConnection alloc] initWithRequest:request delegate:self startImmediately:NO] autorelease];
        // Deliver delegate callbacks on a background queue so they can run
        // while the parser blocks the calling thread in -read:maxLength:.
        [self.connection setDelegateQueue:[[[NSOperationQueue alloc] init] autorelease]];
        self.bufferedData = [NSMutableArray array];
        self.semaphore = dispatch_semaphore_create(0);
        return self;
    }

    - (void)dealloc
    {
        self.connection = nil;
        self.bufferedData = nil;
        self.semaphore = nil;
        [super dealloc];
    }

    - (BOOL)hasBufferedData
    {
        @synchronized (self) { return self.bufferedData.count > 0; }
    }

    #pragma mark - NSInputStream overrides

    - (void)open
    {
        NSLog(@"open");
        [self.connection start];
    }

    - (void)close
    {
        NSLog(@"close");
        [self.connection cancel];
    }

    - (NSInteger)read:(uint8_t *)buffer maxLength:(NSUInteger)maxLength
    {
        NSLog(@"read:%p maxLength:%lu", buffer, (unsigned long)maxLength);

        // Block until a chunk arrives or the connection ends. The loop matters:
        // the semaphore counts every signal, so a wait can return for a chunk
        // that an earlier read already consumed without waiting.
        while (!self.isFinished && !self.hasBufferedData)
            dispatch_semaphore_wait(self.semaphore, DISPATCH_TIME_FOREVER);

        if (!self.hasBufferedData)
            return 0; // finished and fully drained: end of stream

        NSData *data = nil;
        @synchronized (self) {
            data = [[self.bufferedData[0] retain] autorelease];
            [self.bufferedData removeObjectAtIndex:0];
            if (data.length > maxLength) {
                // Put back whatever does not fit into the caller's buffer.
                const uint8_t *bytes = (const uint8_t *)data.bytes;
                NSData *remainingData = [NSData dataWithBytes:bytes + maxLength length:data.length - maxLength];
                [self.bufferedData insertObject:remainingData atIndex:0];
            }
        }

        NSUInteger copiedLength = MIN([data length], maxLength);
        memcpy(buffer, [data bytes], copiedLength);
        return copiedLength;
    }

    #pragma mark - NSURLConnectionDelegate methods

    - (void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)data
    {
        NSLog(@"connection:%@ didReceiveData:…", connection);
        @synchronized (self) {
            [self.bufferedData addObject:data];
        }
        dispatch_semaphore_signal(self.semaphore);
    }

    - (void)connectionDidFinishLoading:(NSURLConnection *)connection
    {
        NSLog(@"connectionDidFinishLoading:%@", connection);
        self.finished = YES;
        dispatch_semaphore_signal(self.semaphore);
    }

    - (void)connection:(NSURLConnection *)connection didFailWithError:(NSError *)error
    {
        // Treat a failed connection as end of stream so that a reader blocked
        // in -read:maxLength: wakes up instead of waiting forever.
        NSLog(@"connection:%@ didFailWithError:%@", connection, error);
        self.finished = YES;
        dispatch_semaphore_signal(self.semaphore);
    }

    @end

    @interface ParserDelegate : NSObject <NSXMLParserDelegate>
    @end

    @implementation ParserDelegate

    - (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qualifiedName attributes:(NSDictionary *)attributeDict
    {
        NSLog(@"parser:%@ didStartElement:%@ namespaceURI:%@ qualifiedName:%@ attributes:%@", parser, elementName, namespaceURI, qualifiedName, attributeDict);
    }

    - (void)parserDidEndDocument:(NSXMLParser *)parser
    {
        NSLog(@"parserDidEndDocument:%@", parser);
        CFRunLoopStop(CFRunLoopGetCurrent());
    }

    @end

    int main(int argc, char **argv)
    {
        @autoreleasepool {
            NSURL *url = [NSURL URLWithString:@"http://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xml"];
            ReceivedDataStream *stream = [[[ReceivedDataStream alloc] initWithContentsOfURL:url] autorelease];
            NSXMLParser *parser = [[[NSXMLParser alloc] initWithStream:stream] autorelease];
            parser.delegate = [[[ParserDelegate alloc] init] autorelease];
            // Start the blocking parse from the run loop; the delegate stops
            // the run loop once the document ends.
            [parser performSelector:@selector(parse) withObject:nil afterDelay:0.0];
            CFRunLoopRun();
        }
        return 0;
    }