Reading Roman numeral page numbers from a PDF

Problem description

In Adobe Reader, the first pages of an e-book can have page numbers in Roman numeral format, as shown in the image below.

Image: http://i.stack.imgur.com/GSm0Q.jpg

I would like to read these page numbers (not the index-based page numbers) with iText, but I don't know which properties (labels, annotations, ...) I should use. I can already open the file with PdfReader and loop through all pages, but I have no idea what to access to get these Roman numerals.

using (Stream pdfStream = new FileStream(sourceFileName, FileMode.Open))
{
    PdfReader pdfReader = new PdfReader(pdfStream);
    for (int index = 1; index <= pdfReader.NumberOfPages; index++)
    {

    }
}

Thanks.

Recommended answer

You are looking for the PageLabelExample. In this example, we have a PDF, page_labels.pdf, whose pages carry custom page labels (they are listed in the output further down).

In the listPageLabels() method, we create a text file containing all the page labels:

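import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import com.itextpdf.text.pdf.PdfPageLabels;
import com.itextpdf.text.pdf.PdfReader;

public void listPageLabels(String src, String dest) throws IOException {
    PdfReader reader = new PdfReader(src);
    // no PDF, just a text file; try-with-resources flushes and closes it
    try (PrintStream out = new PrintStream(new FileOutputStream(dest))) {
        // getPageLabels() returns null when the PDF defines no page labels
        String[] labels = PdfPageLabels.getPageLabels(reader);
        if (labels != null) {
            for (String label : labels) {
                out.println(label);
            }
        }
    } finally {
        reader.close();
    }
}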

The result looks like this:

A
B
1
2
3
Movies-4
Movies-5
Movies-6
Movies-7
Movies-8
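
To make the listing above easier to interpret, here is a minimal, library-free Java sketch of how page-label ranges can turn into per-page labels. Each range starts at a zero-based page index and carries a numbering style, an optional prefix, and a start value. The class `PageLabelSketch`, its `Range` type, and the three ranges in `demo()` are assumptions made for illustration: they are one set of ranges that would produce the listing above, not values read from page_labels.pdf, and this is not iText code. Only the decimal and uppercase-letter styles are modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class PageLabelSketch {

    /** Hypothetical holder for one label range (not an iText type). */
    public static final class Range {
        final char style;    // 'D' = decimal digits, 'A' = uppercase letters
        final String prefix; // text placed before the number, may be empty
        final int start;     // numeric value of the first page in the range

        public Range(char style, String prefix, int start) {
            this.style = style;
            this.prefix = prefix;
            this.start = start;
        }
    }

    /** Letter style as described in the PDF specification: A..Z, then AA..ZZ, and so on. */
    public static String letters(int n) {
        char c = (char) ('A' + (n - 1) % 26);
        int repeat = (n - 1) / 26 + 1;
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < repeat; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    /** Walks all pages; a range stays active until the next range begins. */
    public static List<String> labels(TreeMap<Integer, Range> ranges, int pageCount) {
        List<String> result = new ArrayList<>();
        Range current = new Range('D', "", 1); // assumed fallback: plain decimal
        int value = 1;
        for (int page = 0; page < pageCount; page++) {
            Range r = ranges.get(page);
            if (r != null) {       // a new range begins on this page
                current = r;
                value = r.start;
            }
            String number = current.style == 'A' ? letters(value) : String.valueOf(value);
            result.add(current.prefix + number);
            value++;
        }
        return result;
    }

    /** Assumed ranges that reproduce the ten labels listed above. */
    public static List<String> demo() {
        TreeMap<Integer, Range> ranges = new TreeMap<>();
        ranges.put(0, new Range('A', "", 1));        // pages 1-2: A, B
        ranges.put(2, new Range('D', "", 1));        // pages 3-5: 1, 2, 3
        ranges.put(5, new Range('D', "Movies-", 4)); // pages 6-10: Movies-4 ...
        return labels(ranges, 10);
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Dropping the prefix from the result is then only a matter of adding `number` instead of `current.prefix + number`.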

If you want an iTextSharp example, take a look at this method:

/// <summary>
/// Reads the page labels from an existing PDF.
/// </summary>
/// <param name="src">the existing PDF</param>
public string ListPageLabels(byte[] src) {
    StringBuilder sb = new StringBuilder();
    PdfReader reader = new PdfReader(src);
    try {
        // GetPageLabels() returns null when the PDF defines no page labels
        String[] labels = PdfPageLabels.GetPageLabels(reader) ?? new String[0];
        foreach (String label in labels) {
            sb.AppendLine(label);
        }
    }
    finally {
        reader.Close();
    }
    return sb.ToString();
}

Update

As mentioned in the comments section: PdfPageLabels.cs

I am not a C# developer, but this is a quick and dirty version of the GetPageLabels() method that doesn't add a prefix:

public static String[] GetPageLabels(PdfReader reader) {
    int n = reader.NumberOfPages;
    PdfDictionary dict = reader.Catalog;
    PdfDictionary labels = (PdfDictionary)PdfReader.GetPdfObjectRelease(dict.Get(PdfName.PAGELABELS));
    if (labels == null)
        return null;
    String[] labelstrings = new String[n];
    Dictionary<int, PdfObject> numberTree = PdfNumberTree.ReadTree(labels);    
    int pagecount = 1;
    char type = 'D';
    for (int i = 0; i < n; i++) {
        if (numberTree.ContainsKey(i)) {
            PdfDictionary d = (PdfDictionary)PdfReader.GetPdfObjectRelease(numberTree[i]);
            if (d.Contains(PdfName.ST)) {
                pagecount = ((PdfNumber)d.Get(PdfName.ST)).IntValue;
            }
            else {
                pagecount = 1;
            }
            if (d.Contains(PdfName.S)) {
                type = ((PdfName)d.Get(PdfName.S)).ToString()[1];
            }
            else {
                type = 'e';
            }
        }
        switch (type) {
        default:
            labelstrings[i] = "" + pagecount;
            break;
        case 'R':
            labelstrings[i] = RomanNumberFactory.GetUpperCaseString(pagecount);
            break;
        case 'r':
            labelstrings[i] = RomanNumberFactory.GetLowerCaseString(pagecount);
            break;
        case 'A':
            labelstrings[i] = RomanAlphabetFactory.GetUpperCaseString(pagecount);
            break;
        case 'a':
            labelstrings[i] = RomanAlphabetFactory.GetLowerCaseString(pagecount);
            break;
        case 'e':
            labelstrings[i] = "";
            break;
        }
        pagecount++;
    }
    return labelstrings;
}
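
The `'R'` and `'r'` branches above delegate the actual conversion to `RomanNumberFactory`. As a rough, library-free illustration of what such a conversion involves, the following Java sketch turns a page number into a Roman numeral using standard subtractive notation. The class `RomanSketch` and its method names are hypothetical and are not iText's implementation; the sketch assumes values between 1 and 3999.

```java
import java.util.Locale;

public class RomanSketch {

    // Value/symbol pairs in descending order, including the subtractive forms.
    private static final int[] VALUES =
            {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
    private static final String[] SYMBOLS =
            {"M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I"};

    /** Greedy conversion: repeatedly take the largest symbol that still fits. */
    public static String toUpperRoman(int number) {
        if (number < 1 || number > 3999) {
            throw new IllegalArgumentException("expected 1..3999, got " + number);
        }
        StringBuilder sb = new StringBuilder();
        int remaining = number;
        for (int i = 0; i < VALUES.length; i++) {
            while (remaining >= VALUES[i]) {
                sb.append(SYMBOLS[i]);
                remaining -= VALUES[i];
            }
        }
        return sb.toString();
    }

    /** Lowercase variant, as used for front-matter pages such as i, ii, iii. */
    public static String toLowerRoman(int number) {
        return toUpperRoman(number).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Print the labels a lowercase Roman range would give its first five pages.
        for (int page = 1; page <= 5; page++) {
            System.out.println(toLowerRoman(page));
        }
    }
}
```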
