Parse whole pdf-page to NSString on an iPhone

Views: 187 | Published: 2018/11/2 13:39:56 | Tags: ios, iphone, xcode, parsing, pdf

Problem description

I've been trying to parse the text of a PDF page into an NSString for a while now, and the only things I can find are methods that search for specific string values.

What I'd like to do is parse a single page of a PDF without using any external libraries such as PDFKitten, PDFKit etc.

I'd like to have the data in an NSArray, NSString or NSDictionary if possible.

Thanks :D!

Here is a piece of what I've tried so far.

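// Opens the PDF at `filename`. Returns a document the caller must release with
// CGPDFDocumentRelease, or NULL if it cannot be opened or has no pages.
CGPDFDocumentRef MyGetPDFDocumentRef (const char *filename) {
    CFStringRef path;
    CFURLRef url;
    CGPDFDocumentRef document;
    if (filename == NULL)
        return NULL;
    path = CFStringCreateWithCString (NULL, filename, kCFStringEncodingUTF8);
    url = CFURLCreateWithFileSystemPath (NULL, path, kCFURLPOSIXPathStyle, false);
    CFRelease (path);
    document = CGPDFDocumentCreateWithURL (url);
    CFRelease (url);
    if (document == NULL) {
        printf("`%s' could not be opened as a PDF!\n", filename);
        return NULL;
    }
    if (CGPDFDocumentGetNumberOfPages (document) == 0) {
        printf("`%s' needs at least one page!\n", filename);
        CGPDFDocumentRelease (document); // do not leak the empty document
        return NULL;
    }
    return document;
}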

// Operator-table callbacks: the scanner calls these for the marked-content operators
static void op_MP (CGPDFScannerRef s, void *info) {
    const char *name;
    if (!CGPDFScannerPopName(s, &name))
        return;
    printf("MP /%s\n", name);
}

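static void op_DP (CGPDFScannerRef s, void *info) {
    // Operands are "tag properties DP". The properties operand (a name or an
    // inline dictionary) is on top of the stack, so pop it before the tag.
    CGPDFObjectRef properties;
    const char *name;
    if (!CGPDFScannerPopObject(s, &properties))
        return;
    if (!CGPDFScannerPopName(s, &name))
        return;
    printf("DP /%s\n", name);
}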

static void op_BMC (CGPDFScannerRef s, void *info) {
    const char *name;
    if (!CGPDFScannerPopName(s, &name))
        return;
    printf("BMC /%s\n", name);
}

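static void op_BDC (CGPDFScannerRef s, void *info) {
    // Operands are "tag properties BDC". The properties operand (a name or an
    // inline dictionary) is on top of the stack, so pop it before the tag.
    CGPDFObjectRef properties;
    const char *name;
    if (!CGPDFScannerPopObject(s, &properties))
        return;
    if (!CGPDFScannerPopName(s, &name))
        return;
    printf("BDC /%s\n", name);
}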

static void op_EMC (CGPDFScannerRef s, void *info) {
    // EMC takes no operands, so there is nothing to pop.
    printf("EMC\n");
}

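// Scans one page's content stream and prints its marked-content operators.
// myContext is unused because nothing is drawn here.
void MyDisplayPDFPage (CGContextRef myContext, size_t pageNumber, const char *filename) {
    CGPDFDocumentRef document = MyGetPDFDocumentRef (filename);
    if (document == NULL)
        return;

    // Page numbers are 1-based; reject anything outside the document.
    size_t totalPages = CGPDFDocumentGetNumberOfPages (document);
    CGPDFPageRef page = NULL;
    if (pageNumber >= 1 && pageNumber <= totalPages)
        page = CGPDFDocumentGetPage (document, pageNumber);
    if (page == NULL) {
        CGPDFDocumentRelease (document);
        return;
    }

    CGPDFOperatorTableRef myTable = CGPDFOperatorTableCreate();
    CGPDFOperatorTableSetCallback (myTable, "MP", &op_MP);
    CGPDFOperatorTableSetCallback (myTable, "DP", &op_DP);
    CGPDFOperatorTableSetCallback (myTable, "BMC", &op_BMC);
    CGPDFOperatorTableSetCallback (myTable, "BDC", &op_BDC);
    CGPDFOperatorTableSetCallback (myTable, "EMC", &op_EMC);

    CGPDFContentStreamRef myContentStream = CGPDFContentStreamCreateWithPage (page);
    CGPDFScannerRef myScanner = CGPDFScannerCreate (myContentStream, myTable, NULL);
    CGPDFScannerScan (myScanner);

    // This looks up a key literally named "Lorem" in the page dictionary, so it
    // only logs something if the page dictionary happens to have such an entry.
    CGPDFDictionaryRef d = CGPDFPageGetDictionary(page);
    CGPDFStringRef str;
    if (CGPDFDictionaryGetString(d, "Lorem", &str)) {
        CFStringRef s = CGPDFStringCopyTextString(str);
        if (s != NULL) {
            NSLog(@"%@ testing it", s);
            CFRelease(s); // release only a non-NULL string
        }
    }

    // Release everything created above.
    CGPDFScannerRelease (myScanner);
    CGPDFContentStreamRelease (myContentStream);
    CGPDFOperatorTableRelease (myTable);
    CGPDFDocumentRelease (document);
}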

- (void)viewDidLoad {
    [super viewDidLoad];


    MyDisplayPDFPage(UIGraphicsGetCurrentContext(), 1, [[[NSBundle mainBundle] pathForResource:@"TestPage" ofType:@"pdf"] UTF8String]);

}
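
For reference, the callbacks registered above (MP, DP, BMC, BDC, EMC) are the marked-content operators, which never carry the page's text. In a PDF content stream, text is shown by the Tj, TJ, ' and " operators. The following is a minimal sketch of collecting those strings into an NSMutableString with the same CGPDFScanner API, not a complete extractor. It assumes ARC (for the __bridge casts). The names AppendPDFString, op_Tj, op_TJ, op_Quote and MyPageText are hypothetical helpers introduced only for this illustration. The rarer " operator is left out for brevity.

```objc
#import <Foundation/Foundation.h>
#import <CoreGraphics/CoreGraphics.h>

// Hypothetical helper: appends one PDF string operand to the
// NSMutableString that was passed to the scanner as `info`.
static void AppendPDFString(CGPDFStringRef pdfString, void *info) {
    NSMutableString *text = (__bridge NSMutableString *)info;
    CFStringRef decoded = CGPDFStringCopyTextString(pdfString);
    if (decoded != NULL) {
        [text appendString:(__bridge NSString *)decoded];
        CFRelease(decoded);
    }
}

// "string Tj" shows a single string.
static void op_Tj(CGPDFScannerRef s, void *info) {
    CGPDFStringRef string;
    if (CGPDFScannerPopString(s, &string))
        AppendPDFString(string, info);
}

// "array TJ" shows an array that mixes strings with kerning numbers.
static void op_TJ(CGPDFScannerRef s, void *info) {
    CGPDFArrayRef array;
    if (!CGPDFScannerPopArray(s, &array))
        return;
    size_t count = CGPDFArrayGetCount(array);
    for (size_t i = 0; i < count; i++) {
        CGPDFStringRef string;
        // Only string elements are appended; the numbers are skipped.
        if (CGPDFArrayGetString(array, i, &string))
            AppendPDFString(string, info);
    }
}

// "string '" moves to the next line and then shows the string.
static void op_Quote(CGPDFScannerRef s, void *info) {
    CGPDFStringRef string;
    if (CGPDFScannerPopString(s, &string)) {
        [(__bridge NSMutableString *)info appendString:@"\n"];
        AppendPDFString(string, info);
    }
}

// Hypothetical helper: returns the strings shown on a 1-based page,
// or nil if the page does not exist.
static NSString *MyPageText(CGPDFDocumentRef document, size_t pageNumber) {
    CGPDFPageRef page = CGPDFDocumentGetPage(document, pageNumber);
    if (page == NULL)
        return nil;

    NSMutableString *text = [NSMutableString string];
    CGPDFOperatorTableRef table = CGPDFOperatorTableCreate();
    CGPDFOperatorTableSetCallback(table, "Tj", &op_Tj);
    CGPDFOperatorTableSetCallback(table, "TJ", &op_TJ);
    CGPDFOperatorTableSetCallback(table, "'", &op_Quote);

    CGPDFContentStreamRef stream = CGPDFContentStreamCreateWithPage(page);
    CGPDFScannerRef scanner = CGPDFScannerCreate(stream, table, (__bridge void *)text);
    CGPDFScannerScan(scanner);

    CGPDFScannerRelease(scanner);
    CGPDFContentStreamRelease(stream);
    CGPDFOperatorTableRelease(table);
    return text;
}
```

With a document from MyGetPDFDocumentRef, a call such as MyPageText(document, 1) would return whatever strings the first page shows. Two limits of this sketch are worth knowing. CGPDFStringCopyTextString does not apply the font's encoding or ToUnicode map, so pages that use subsetted or CID fonts can come out garbled. Spaces and line breaks that the PDF produces purely by positioning text are not reconstructed.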
