How to read particular values from PDF files, like date, name, email ID, etc.
Problem description
Hi,
How can I read particular values, such as date, name, email ID, etc., from a PDF file using ASP.NET C#?
Please help.
Thanks
What I have tried:
I tried some code from Google, but it reads all the values; I want only some particular values.
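// Assumes iTextSharp 5.x, whose PdfReader and PdfTextExtractor match the calls below.
using System;
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

class Program
{
    static void Main(string[] args)
    {
        // Extract the full text of the PDF and print it.
        string pdfdata = ExtractTextFromPdf(@"report_grid.pdf");
        Console.WriteLine(pdfdata);
        Console.ReadLine();
    }

    // Returns the text of every page, concatenated in page order.
    public static string ExtractTextFromPdf(string path)
    {
        using (PdfReader reader = new PdfReader(path))
        {
            StringBuilder text = new StringBuilder();
            // Page numbers start at 1.
            for (int i = 1; i <= reader.NumberOfPages; i++)
            {
                text.Append(PdfTextExtractor.GetTextFromPage(reader, i));
            }
            return text.ToString();
        }
    }
}
Solution
Quote: I tried some code from Google, but it reads all the values; I want only some particular values.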
I suppose it should be a two-step approach:
- retrieve all the text
- extract the meaningful information from that text
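The second step could be sketched with regular expressions, applied to the string that ExtractTextFromPdf returns. This is only a rough sketch under assumptions: the class PdfFieldExtractor and its methods are hypothetical names, the date is assumed to look like 12/03/2021 or 12-03-2021, and the name is assumed to follow a "Name:" label on the same line. The real patterns depend on how the report is laid out.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of any PDF library): pulls individual
// fields out of text that has already been extracted from the PDF.
public static class PdfFieldExtractor
{
    // Simplified email pattern; it does not cover every valid address.
    private static readonly Regex EmailPattern =
        new Regex(@"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}");

    // Assumed date format: d/m/yyyy or d-m-yyyy with a four-digit year.
    private static readonly Regex DatePattern =
        new Regex(@"\b\d{1,2}[/-]\d{1,2}[/-]\d{4}\b");

    // Assumes the text contains a label such as "Name: John Smith".
    private static readonly Regex NamePattern =
        new Regex(@"\bName[ \t]*:[ \t]*(?<name>[^\r\n]+)", RegexOptions.IgnoreCase);

    // Returns every email address found in the text.
    public static List<string> FindEmails(string text)
    {
        return FindAll(EmailPattern, text);
    }

    // Returns every substring that looks like a date in the assumed format.
    public static List<string> FindDates(string text)
    {
        return FindAll(DatePattern, text);
    }

    // Returns the value after the first "Name:" label, or null if there is none.
    public static string FindName(string text)
    {
        Match m = NamePattern.Match(text);
        return m.Success ? m.Groups["name"].Value.Trim() : null;
    }

    private static List<string> FindAll(Regex pattern, string text)
    {
        var results = new List<string>();
        foreach (Match m in pattern.Matches(text))
        {
            results.Add(m.Value);
        }
        return results;
    }
}
```

Calling PdfFieldExtractor.FindEmails(pdfdata) after the extraction step would then give only the email addresses instead of the whole text. If the text comes out of the PDF in an unexpected order (tables and multi-column layouts often do), the patterns will need adjusting after inspecting the raw output.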