Streaming NSXMLParser with NSInputStream

Problem description

Update:

When you use the NSXMLParser initializer initWithContentsOfURL, the parser does not parse the XML feed as it is downloaded. Instead, it appears to load the entire XML file into memory and only then start parsing. This is problematic if the XML feed is large: it uses an excessive amount of RAM, and it is inherently inefficient because parsing starts only once the download is done rather than running in parallel with it.

Has anyone discovered how to parse as the feed is being streamed to the device using NSXMLParser? Yes, you can use LibXML2 (as discussed below), but it seems like it should be possible to do it with NSXMLParser. But it's eluding me.

Original question:

I was wrestling with using NSXMLParser to read XML from a web stream. If you use initWithContentsOfURL, the interface may lead one to infer that it streams the XML from the web, but it doesn't seem to do so; rather, it appears to load the entire XML file before any parsing takes place. For modest-sized XML files that's fine, but for really large ones, that's problematic.

I have seen discussions of using NSXMLParser in conjunction with initWithStream with some customized NSInputStream that is streaming from the web. For example, there have been answers to this that suggest using something like the CFStreamCreateBoundPair referred to in a Cocoa Builder post and in the discussion of Setting Up Socket Streams in the Apple Stream Programming Guide, but I have not gotten it to work. I even tried writing my own NSInputStream subclass that used an NSURLConnection (which is, itself, pretty good at streaming), but I wasn't able to get it to work in conjunction with NSXMLParser.

In the end, I decided to use LibXML2 rather than NSXMLParser, as demonstrated in the Apple XMLPerformance sample, but I was wondering if anyone had any luck getting streaming from a web source working with NSXMLParser. I've seen plenty of "theoretically you could do x" sort of answers, suggesting everything from CFStreamCreateBoundPair to grabbing the HTTPBodyStream from NSURLRequest, but I've yet to come across a working demonstration of streaming with NSXMLParser.

The Ray Wenderlich article How To Choose The Best XML Parser for Your iPhone Project seems to confirm that NSXMLParser is not well suited for large XML files, but with all of the posts about possible NSXMLParser-based workarounds for streaming really large XML files, I'm surprised I have yet to find a working demonstration of this. Does anyone know of a functioning NSXMLParser implementation that streams from the web? Clearly, I can just stick with LibXML2 or some other equivalent XML parser, but the notion of streaming with NSXMLParser seems tantalizingly close.

Recommended answer

-[NSXMLParser initWithStream:] is the only interface to NSXMLParser that currently performs a streaming parse of the data. Hooking it up to an asynchronous NSURLConnection that's providing data incrementally is unwieldy because NSXMLParser takes a blocking, "pull"-based approach to reading from the NSInputStream. That is, -[NSXMLParser parse] does something like the following when dealing with an NSInputStream:

while (1) {
    NSInteger length = [stream read:buffer maxLength:maxLength];
    if (!length)
        break;

    // Parse data …
}
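
The pull model can be seen in isolation, without any networking, by handing the parser an in-memory stream. What follows is only a minimal sketch, assuming ARC; ElementCounter and CountElementsInXMLData are hypothetical names introduced for illustration and are not part of the proof of concept in this answer.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical delegate that only counts start tags.
@interface ElementCounter : NSObject <NSXMLParserDelegate>
@property (nonatomic, assign) NSUInteger count;
@end

@implementation ElementCounter

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qualifiedName attributes:(NSDictionary *)attributeDict
{
    self.count += 1;
}

@end

// Hypothetical helper: parses `xml` through the stream-based initializer and
// returns the number of elements seen, or NSNotFound if parsing fails.
static NSUInteger CountElementsInXMLData(NSData *xml)
{
    NSInputStream *stream = [NSInputStream inputStreamWithData:xml];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithStream:stream];

    ElementCounter *counter = [[ElementCounter alloc] init];
    parser.delegate = counter;  // the delegate is not retained; `counter` keeps it alive

    // -parse blocks the calling thread, pulling bytes from the stream
    // until the stream reports that no more data is available.
    BOOL succeeded = [parser parse];
    return succeeded ? counter.count : NSNotFound;
}
```

Because -parse does not return until the stream is exhausted, any stream that waits on the network has to be fed from a different thread or queue than the one calling -parse.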

In order to incrementally provide data to this parser a custom NSInputStream subclass is needed that funnels data received by the NSURLConnectionDelegate calls on a background queue or runloop over to the -read:maxLength: call that NSXMLParser is waiting on.

A proof-of-concept implementation follows:

#import <Foundation/Foundation.h>

@interface ReceivedDataStream : NSInputStream <NSURLConnectionDelegate>
@property (retain) NSURLConnection *connection;
@property (retain) NSMutableArray *bufferedData;
@property (assign, getter=isFinished) BOOL finished;
@property (retain) dispatch_semaphore_t semaphore;
@end

@implementation ReceivedDataStream

- (id)initWithContentsOfURL:(NSURL *)url
{
    if (!(self = [super init]))
        return nil;

    NSURLRequest *request = [NSURLRequest requestWithURL:url];
    self.connection = [[[NSURLConnection alloc] initWithRequest:request delegate:self startImmediately:NO] autorelease];
    self.connection.delegateQueue = [[[NSOperationQueue alloc] init] autorelease];
    self.bufferedData = [NSMutableArray array];
    self.semaphore = dispatch_semaphore_create(0);

    return self;
}

- (void)dealloc
{
    self.connection = nil;
    self.bufferedData = nil;
    self.semaphore = nil;

    [super dealloc];
}

- (BOOL)hasBufferedData
{
    @synchronized (self) { return self.bufferedData.count > 0; }
}

#pragma mark - NSInputStream overrides

- (void)open
{
    NSLog(@"open");
    [self.connection start];
}

- (void)close
{
    NSLog(@"close");
    [self.connection cancel];
}

- (NSInteger)read:(uint8_t *)buffer maxLength:(NSUInteger)maxLength
{
    NSLog(@"read:%p maxLength:%lu", buffer, (unsigned long)maxLength);
    if (self.isFinished && !self.hasBufferedData)
        return 0;

    // Signals accumulate while earlier reads are served straight from the
    // buffer, so a wake-up may be stale; re-check the condition each time.
    while (!self.isFinished && !self.hasBufferedData)
        dispatch_semaphore_wait(self.semaphore, DISPATCH_TIME_FOREVER);

    if (self.isFinished && !self.hasBufferedData)
        return 0;

    NSData *data = nil;
    @synchronized (self) {
        data = [[self.bufferedData[0] retain] autorelease];
        [self.bufferedData removeObjectAtIndex:0];
        if (data.length > maxLength) {
            NSData *remainingData = [NSData dataWithBytes:(const uint8_t *)data.bytes + maxLength length:data.length - maxLength];
            [self.bufferedData insertObject:remainingData atIndex:0];
        }
    }

    NSUInteger copiedLength = MIN([data length], maxLength);
    memcpy(buffer, [data bytes], copiedLength);
    return copiedLength;
}


#pragma mark - NSURLConnectionDelegate methods

- (void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)data
{
    NSLog(@"connection:%@ didReceiveData:…", connection);
    @synchronized (self) {
        [self.bufferedData addObject:data];
    }
    dispatch_semaphore_signal(self.semaphore);
}

- (void)connectionDidFinishLoading:(NSURLConnection *)connection
{
    NSLog(@"connectionDidFinishLoading:%@", connection);
    self.finished = YES;
    dispatch_semaphore_signal(self.semaphore);
}

@end

@interface ParserDelegate : NSObject <NSXMLParserDelegate>
@end

@implementation ParserDelegate

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qualifiedName attributes:(NSDictionary *)attributeDict
{
    NSLog(@"parser:%@ didStartElement:%@ namespaceURI:%@ qualifiedName:%@ attributes:%@", parser, elementName, namespaceURI, qualifiedName, attributeDict);
}

- (void)parserDidEndDocument:(NSXMLParser *)parser
{
    NSLog(@"parserDidEndDocument:%@", parser);
    CFRunLoopStop(CFRunLoopGetCurrent());
}

@end


int main(int argc, char **argv)
{
    @autoreleasepool {

        NSURL *url = [NSURL URLWithString:@"http://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xml"];
        ReceivedDataStream *stream = [[ReceivedDataStream alloc] initWithContentsOfURL:url];
        NSXMLParser *parser = [[NSXMLParser alloc] initWithStream:stream];
        parser.delegate = [[[ParserDelegate alloc] init] autorelease];

        [parser performSelector:@selector(parse) withObject:nil afterDelay:0.0];

        CFRunLoopRun();

    }
    return 0;
}
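
The buffer hand-off inside -read:maxLength: is the part most likely to go wrong, so it can help to exercise it on its own. The following is a sketch under stated assumptions rather than a drop-in replacement: it assumes ARC (the proof of concept above uses manual retain/release), leaves locking to the caller, and DequeueBytes is a hypothetical helper name.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: copies up to maxLength bytes from the first chunk in
// `queue` into `buffer`, puts any unread tail back at the front of the queue,
// and returns the number of bytes copied (0 if there is nothing to copy).
// The caller is responsible for synchronizing access to `queue`.
static NSUInteger DequeueBytes(NSMutableArray *queue, uint8_t *buffer, NSUInteger maxLength)
{
    if (queue.count == 0 || maxLength == 0)
        return 0;

    NSData *head = queue[0];
    [queue removeObjectAtIndex:0];

    NSUInteger copied = MIN(head.length, maxLength);
    memcpy(buffer, head.bytes, copied);

    // If the chunk was larger than the caller's buffer, keep the remainder
    // so that the next call continues where this one stopped.
    if (copied < head.length) {
        NSData *tail = [head subdataWithRange:NSMakeRange(copied, head.length - copied)];
        [queue insertObject:tail atIndex:0];
    }

    return copied;
}
```

Calling it twice with a four-byte buffer on a queue holding one six-byte chunk should yield the first four bytes, then the remaining two, and leave the queue empty.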
