How to get pdf title using pdf.js?

Question

The question is: how can I get the name of the PDF file using pdf.js? I'm running a variation of a pdf.js example from Node.js, and I was wondering if it's possible to get it at all. I've been searching through pdf.js's docs and source, but couldn't find anything obvious. I'm using this code, which (so far) shows the number of pages of each file found in a given folder (in this case, the directory this code is being run from):

var fs = require('fs');
var glob = require('glob');

global.window = global;
global.navigator = { userAgent: "node" };
global.PDFJS = {};
global.DOMParser = require('./domparsermock.js').DOMParserMock;

require('../../build/singlefile/build/pdf.combined.js');
glob("**/*.pdf", function (er, files) {
  for (var i = 0; i < files.length; i++) {
    var data = new Uint8Array(fs.readFileSync(files[i]));
    PDFJS.getDocument(data).then(function (doc) {
      var numPages = doc.numPages;
      console.log('Number of Pages: ' + numPages);
      console.log();
    }).then(function () {
      console.log('# End of Document');
    }, function (err) {
      console.error('Error: ' + err);
    });
  }
});

I thought the name of the file was in the doc object as an attribute or something like that, but that doesn't seem to be the case here, and I couldn't find anything about this in the docs. Is there something I'm missing or doing wrong here?

Recommended answer

I fixed it :) the code looks like this now:

var fs = require('fs');
var glob = require('glob');

global.window = global;
global.navigator = { userAgent: "node" };
global.PDFJS = {};
global.DOMParser = require('./domparsermock.js').DOMParserMock;

require('../../build/singlefile/build/pdf.combined.js');
glob("**/*.pdf", function (er, files) {
  // This is the essential change: use forEach() instead of the for loop,
  // so that each callback keeps its own `file` value.
  files.forEach(function (file) {
    var data = new Uint8Array(fs.readFileSync(file));
    PDFJS.getDocument(data)
      .then(function (doc) {
        var numPages = doc.numPages;
        console.log('File name: ' + file + ', Number of Pages: ' + numPages);
        console.log();
      });
  });
});

Hope it helps someone, and thanks for the quick replies :)
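
The answer does not say why forEach() helps. A likely explanation, offered here as an assumption rather than something stated in the thread, is the usual closure behaviour of `var` in a for loop: the promise callbacks run after the loop has finished, so they all see the final value of `i`, while forEach() gives each callback its own `file` argument. The sketch below imitates that without pdf.js by storing callbacks and calling them later. The file names and both helper functions are hypothetical and exist only for illustration.

```javascript
// Hypothetical file list standing in for the result of glob().
var files = ['a.pdf', 'b.pdf', 'c.pdf'];

// Deferred callbacks created in a for loop share one `i` (declared with var).
// By the time they run, i === files.length, so files[i] is undefined.
function namesSeenWithForLoop(list) {
  var pending = [];
  for (var i = 0; i < list.length; i++) {
    pending.push(function () { return list[i]; });
  }
  // Run the callbacks only after the loop is done, like promise handlers.
  return pending.map(function (callback) { return callback(); });
}

// With forEach(), every iteration is a separate function call,
// so each deferred callback closes over its own `file` parameter.
function namesSeenWithForEach(list) {
  var pending = [];
  list.forEach(function (file) {
    pending.push(function () { return file; });
  });
  return pending.map(function (callback) { return callback(); });
}

console.log(namesSeenWithForLoop(files)); // every entry is undefined
console.log(namesSeenWithForEach(files)); // the three file names, in order
```

In environments that support it, declaring the loop variable with `let` instead of `var` should have the same effect as forEach(), because `let` creates a fresh binding per iteration.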
