How do I compare multiple PDF files and get the PDF with the least number of pages in C#?
Question
I am retrieving a list of all .pdf files in a directory, and I have a function that gets the number of pages in a single PDF.
//List of all PDF files
string[] filePaths = Directory.GetFiles(cboSource.Text, "*.pdf", SearchOption.AllDirectories);
MessageBox.Show(String.Join(Environment.NewLine, filePaths));

//Get the number of pages in a PDF file
public int GetNumberOfPdfPages(string fileName)
{
    using (StreamReader sr = new StreamReader(File.OpenRead(fileName)))
    {
        Regex regex = new Regex(@"/Type\s*/Page[^s]");
        MatchCollection matches = regex.Matches(sr.ReadToEnd());
        return matches.Count;
    }
}
Please ignore the MessageBox; I only used it to check whether the values are correct.
Now I want to get the name/path of the one PDF that has the least number of pages among all the files in string[] filePaths.
Please help.
Regards
What I have tried:
I have shown everything I have done in my question. I have not tried anything beyond that, as I am unsure what to do next and how to proceed.
Recommended answer
First of all, your method for counting pages is not reliable.
For example, it returned 0 for a PDF file that actually contains over 1000 pages.
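To illustrate how a count of 0 can happen, here is a minimal sketch that applies the same regex to a few hand-written sample strings. The class name PageMarkerDemo, the method CountPageMarkers, and the sample strings are made up for this illustration. Which cause applied to the 1000-page file is not stated above; one common possibility, offered here as an assumption, is that PDF 1.5 and later can store page dictionaries inside compressed object streams, so the text "/Type /Page" never appears in the raw file.

```csharp
using System.Text.RegularExpressions;

public static class PageMarkerDemo
{
    // Same pattern as in the question: "/Type", optional whitespace,
    // "/Page", then any single character other than 's'.
    private static readonly Regex PageMarker = new Regex(@"/Type\s*/Page[^s]");

    // Counts the textual page markers in a string (hypothetical helper).
    public static int CountPageMarkers(string rawPdfText)
    {
        return PageMarker.Matches(rawPdfText).Count;
    }
}
```

With this helper, CountPageMarkers("<< /Type /Page /Parent 2 0 R >>") finds one marker, while a string that only contains a compressed container such as "<< /Type /ObjStm /N 3 >> stream ... endstream" finds none, no matter how many pages are packed inside it.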
Once you have a reliable page count method,
you need to collect each filename together with its page count, using a model like this:
public class PdfFileInfo
{
    public string Filename { get; set; }
    public int PageCount { get; set; }
}
For example:
// Requires: using System.Collections.Generic; using System.IO; using System.Linq;
private void GetPdfFiles(string folder)
{
    var pdfFileInfos = new List<PdfFileInfo>();
    var filePaths = Directory.GetFiles(folder, "*.pdf", SearchOption.AllDirectories);
    foreach (var filePath in filePaths)
    {
        pdfFileInfos.Add(new PdfFileInfo
        {
            Filename = filePath,
            PageCount = GetNumberOfPdfPages(filePath)
        });
    }
    // Sort ascending so the file with the fewest pages comes first.
    pdfFileInfos = pdfFileInfos.OrderBy(x => x.PageCount).ToList();
    // Count > 0 (not > 1) so a folder with a single PDF is reported too.
    if (pdfFileInfos.Count > 0)
    {
        var result = pdfFileInfos[0];
        MessageBox.Show(
            $"{result.Filename} has {result.PageCount} pages.");
    }
}
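If only the single smallest file is needed, sorting the whole list is optional. As a rough sketch of an alternative, the helper below walks the paths once and keeps the best candidate seen so far. The names PdfPicker and FindFewestPages are hypothetical, and countPages stands in for whatever reliable page counter you end up using; none of these come from the code above.

```csharp
using System;
using System.Collections.Generic;

public static class PdfPicker
{
    // Returns the path with the smallest page count, or null if 'paths' is empty.
    // 'countPages' is a placeholder for a reliable page-count function.
    public static string FindFewestPages(IEnumerable<string> paths, Func<string, int> countPages)
    {
        string bestPath = null;
        int bestCount = int.MaxValue;
        foreach (var path in paths)
        {
            int count = countPages(path);
            // Strict '<' keeps the first file when several share the minimum.
            if (bestPath == null || count < bestCount)
            {
                bestPath = path;
                bestCount = count;
            }
        }
        return bestPath;
    }
}
```

Inside the form this could be called as PdfPicker.FindFewestPages(filePaths, GetNumberOfPdfPages). A null result means the folder contained no PDF files, which the caller should check before showing a message.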