Convert a PDF file into an Excel sheet

Problem description

I have a PDF that contains three tables with purchase details. My task is to extract all three tables from the PDF and convert each one into an Excel sheet (three Excel sheets) using C# code. I googled for three days, and all I could find was code that extracts the text from a PDF without any formatting. I can't purchase any third-party tools. I need a way to at least extract the text in proper table format, which I will then convert to Excel using interop, or code that converts directly to Excel. Whatever the solution is, I need it urgently. Please help.

Solution

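You can check the links below for more information. Note that the sample that follows works in the opposite direction from the question: it prints an Excel workbook to PDF through the Universal Document Converter virtual printer.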

////////////////////////////////////////////////////////////////////////////////////////////////////
// This example is designed for use with Microsoft Visual C# in
// Microsoft Visual Studio 2003 or later.
//
// 1. Microsoft Excel 97 or above should be installed and activated on your PC.
//
// 2. Before using this example, please read this article from Microsoft Excel 2003 knowledge base:
//    http://support.microsoft.com/kb/320369/en-us/
//    A workaround for this issue is available in this example.
//
// 3. Universal Document Converter 5.2 or above should be installed, too.
//
// 4. Add references to "Microsoft Excel XX.0 Object Library" and "Universal Document Converter Type Library"
//    using the Project | Add Reference menu > COM tab.
//    XX is the Microsoft Office version installed on your computer.
////////////////////////////////////////////////////////////////////////////////////////////////////
 
using System;
using System.IO;
using UDC;
using Excel = Microsoft.Office.Interop.Excel; //using Excel; in VS2003
 
namespace ExcelToPDF
{
    class Program
    {
        static void PrintExcelToPDF(string ExcelFilePath)
        {
            //Create a UDC object and get its interfaces
            IUDC objUDC = new APIWrapper();
            IUDCPrinter Printer = objUDC.get_Printers("Universal Document Converter");
            IProfile Profile = Printer.Profile;
 
            //Use the Universal Document Converter API to change the settings of the converted document
            Profile.PageSetup.ResolutionX = 600;
            Profile.PageSetup.ResolutionY = 600;
 
            Profile.FileFormat.ActualFormat = FormatID.FMT_PDF;
 
            Profile.FileFormat.PDF.ColorSpace = ColorSpaceID.CS_TRUECOLOR;
            Profile.FileFormat.PDF.Multipage = MultipageModeID.MM_MULTI;
 
            Profile.OutputLocation.Mode = LocationModeID.LM_PREDEFINED;
            Profile.OutputLocation.FolderPath = @"c:\UDC Output Files";
            Profile.OutputLocation.FileName = @"&[DocName(0)] -- &[Date(0)] -- &[Time(0)].&[ImageType]";
            Profile.OutputLocation.OverwriteExistingFile = false;
 
            Profile.PostProcessing.Mode = PostProcessingModeID.PP_OPEN_FOLDER;
 
            //Create an Excel Application object
            Excel.Application ExcelApp = new Excel.ApplicationClass();
 
            Object ReadOnly = true;
            Object Missing = Type.Missing; //Passed whenever we do not want to supply a value
 
            //If you run an English version of Excel on a computer whose regional settings are configured for a non-English language, you must set the CultureInfo before calling Excel methods.
            System.Threading.Thread.CurrentThread.CurrentCulture = new System.Globalization.CultureInfo("en-US");
            //Open the document from a file
            Excel.Workbook Workbook = ExcelApp.Workbooks.Open(ExcelFilePath, Missing, ReadOnly, Missing, Missing, Missing, Missing, Missing, Missing, Missing, Missing, Missing, Missing, Missing, Missing);
 
            //Change active worksheet settings and print it
            Excel.Worksheet Worksheet = (Excel.Worksheet)Workbook.ActiveSheet;
            Excel.PageSetup PageSetup = Worksheet.PageSetup;
 
            PageSetup.Orientation = Excel.XlPageOrientation.xlLandscape;
 
            Object Preview = false;
            Worksheet.PrintOut(Missing, Missing, Missing, Preview, "Universal Document Converter", Missing, Missing, Missing);
 
            //Close the spreadsheet without saving changes
            Object SaveChanges = false;
            Workbook.Close(SaveChanges, Missing, Missing);
 
            //Close Microsoft Excel
            ExcelApp.Quit();
        }
 
        static void Main(string[] args)
        {
            string TestFilePath = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "TestFile.xls");
            PrintExcelToPDF(TestFilePath);
        }
    }
}



For more info:

http://social.msdn.microsoft.com/Forums/vstudio/en-US/a56b093b-2854-4925-99d5-2d35078c7cd3/converting-pdf-file-into-excel-file-using-c

http://stackoverflow.com/questions/769246/xls-to-pdf-conversion-inside-net

Convert data from PDF invoice to Excel CSV file in C# using PDF Extractor SDK

http://bytescout.com/products/developer/pdfextractorsdk/extract-from-pdf-to-excel-csv-in-csharp

How To Convert PDF to Excel in .NET Framework

http://www.moretechtips.net/2013/01/how-to-convert-pdf-to-excel-in-net.html

I hope this helps.


For that purpose you need to use a third-party tool, because I don't think .NET supports this natively. Many third-party DLLs are available that you can use in your project to implement the desired functionality. Some of them are:

PDF Converter Services

iTextSharp

Excel to PDF .NET
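
The second half of the workflow described in the question (turning already-extracted table text into something Excel can open) could be sketched roughly as follows. This is only an illustration under stated assumptions: the text lines are assumed to have been pulled out of the PDF beforehand by whichever PDF library you choose, columns are assumed to be separated by two or more spaces, and the class and method names (`TableTextToCsv`, `SplitRow`, `EscapeCsv`, `ToCsv`) are hypothetical. It writes CSV, which Excel opens directly, rather than using interop.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

static class TableTextToCsv
{
    // Hypothetical helper: splits one extracted text line into cells,
    // treating any run of two or more whitespace characters as a column gap.
    public static string[] SplitRow(string line)
    {
        return Regex.Split(line.Trim(), @"\s{2,}");
    }

    // Quotes a cell when it contains a comma, a double quote, or a line break,
    // doubling any embedded double quotes.
    public static string EscapeCsv(string cell)
    {
        bool needsQuotes = cell.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + cell.Replace("\"", "\"\"") + "\"" : cell;
    }

    // Converts the lines of one table into CSV text, skipping blank lines.
    public static string ToCsv(IEnumerable<string> lines)
    {
        IEnumerable<string> rows = lines
            .Where(l => !string.IsNullOrWhiteSpace(l))
            .Select(l => string.Join(",", SplitRow(l).Select(EscapeCsv)));
        return string.Join(Environment.NewLine, rows);
    }

    static void Main()
    {
        // Made-up sample standing in for the text extracted from one PDF table.
        string[] table1 =
        {
            "Item          Qty   Price",
            "Widget, blue  2     9.50",
            "Gadget        10    1.25"
        };

        // One CSV file per table; repeat for the other two tables.
        File.WriteAllText("table1.csv", ToCsv(table1));
    }
}
```

With the sample above, the cell `Widget, blue` is written in quotes because it contains a comma, so Excel keeps it in a single column. Splitting on runs of spaces is a heuristic; tables with empty cells or single-space column gaps would need position-based extraction instead.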

