pdf2json gives me a blank output txt file?


Problem description

I am following the "Code Example" section of the pdf2json README on GitHub: https://github.com/modesty/pdf2json#code-example

In the example titled "Parse a PDF then write a .txt file (which only contains textual content of the PDF)", I copied and pasted the exact implementation into a local JavaScript file and ran it, but the output text file was completely blank.

'use strict';

let fs = require('fs');
let PDFParser = require("pdf2json");

let pdfParser = new PDFParser();

pdfParser.on("pdfParser_dataError", errData => console.error(errData.parserError) );
pdfParser.on("pdfParser_dataReady", pdfData => {
    fs.writeFile("./node_modules/pdf2json/test/F1040EZ.content.txt", pdfParser.getRawTextContent());
});

pdfParser.loadPDF("./node_modules/pdf2json/test/pdf/fd/form/F1040EZ.pdf");

Am I doing something wrong, or is the example itself broken? Also, are there any alternative PDF-to-text converters for Node.js that do not require installing additional binaries?

Solution

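The front-page documentation is slightly wrong. To make this work, pass two arguments to the PDFParser constructor: a context (null or this) and 1. The second argument tells the parser to keep the raw text, which is what getRawTextContent() returns; without it, the written file comes out empty.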

This one works:

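var fs = require("fs");

// https://github.com/modesty/pdf2json
var PDFParser = require("pdf2json");

// The second argument (1) asks the parser to keep the raw text content.
var pdfParser = new PDFParser(this, 1);

pdfParser.on("pdfParser_dataError", errData => console.error(errData.parserError));
pdfParser.on("pdfParser_dataReady", pdfData => {
    // Recent Node.js versions require a callback for fs.writeFile.
    fs.writeFile("./content.txt", pdfParser.getRawTextContent(), err => {
        if (err) console.error(err);
    });
});

// Same sample PDF as in the question.
pdfParser.loadPDF("./node_modules/pdf2json/test/pdf/fd/form/F1040EZ.pdf");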

HTH -XDVarpunen

Link to issue in pdf2json: https://github.com/modesty/pdf2json/issues/76
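
To make the cause of the blank file easier to see, here is a minimal sketch that runs without pdf2json installed. FakePdfParser, its load() method, and the extract() helper are hypothetical stand-ins invented for illustration; they are not part of pdf2json and do not reflect its real internals. The sketch only models the behaviour described in the answer: the raw text is kept when the second constructor argument is truthy, and getRawTextContent() returns an empty string otherwise.

```javascript
const { EventEmitter } = require("events");

// Hypothetical stand-in for pdf2json's PDFParser (NOT the real class).
// It models one thing only: raw text is stored when needRawText is truthy.
class FakePdfParser extends EventEmitter {
  constructor(context, needRawText) {
    super();
    this.context = context;
    this.needRawText = Boolean(needRawText);
    this.rawText = "";
  }

  // Pretend to parse a PDF; `text` stands in for the text found in the file.
  load(text) {
    if (this.needRawText) {
      this.rawText = text;
    }
    this.emit("pdfParser_dataReady", { Pages: [] });
  }

  getRawTextContent() {
    return this.rawText;
  }
}

// Hypothetical helper: read the raw text once the "data ready" event fires.
// EventEmitter.emit() calls listeners synchronously, so `result` is set
// before load() returns.
function extract(parser, text) {
  let result = null;
  parser.on("pdfParser_dataReady", () => {
    result = parser.getRawTextContent();
  });
  parser.load(text);
  return result;
}

// No constructor arguments, as in the question: the text is discarded.
const blank = extract(new FakePdfParser(), "Form 1040EZ");
// (null, 1), as in the answer: the text is kept.
const filled = extract(new FakePdfParser(null, 1), "Form 1040EZ");

console.log(JSON.stringify(blank));
console.log(JSON.stringify(filled));
```

With the real library, the equivalent change is the one shown in the answer: construct the parser with a second argument of 1 before calling loadPDF.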
