I want to read the content of a file (PDF) correctly

Views: 75 Published: 2019/6/12 19:38:24 C#

Problem description

I have developed a WinForms application that reads the content of text files (.txt) correctly. When I use the same code to read PDF files, it runs, but the content comes out like this:

"횶땐擇몎态㺛갿籕因뚜靐⨎ᴪ䣌塥並ࠧ町뫳俫黶뫜ﭪ଻혫᭍떼㌵ꇨ㯽☐녴샯﹯蛪髚☐㉾翐☐䜓☐幄뤄ꇥል貑꒥⣔☐⭸쨧렅½캽泜빳燗⁇圷춪⏖뚍鳀餡ꊾᴦ뗖詒Ꝅ퍃怮鐙좽聗逋麟☐ധш♉℩邝䥎ᒼ翏狲Ꮘ쮛旾睬譚칺馵ว퀑뒷ꞹ䰛涉죢㐆蓮捥ﳂ濼跛ᬹ䲷妞ఞ"

I cannot understand this content. I saw this result while stepping through the code with breakpoints.

The code is as follows:

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.IO;
using System.Collections;
using System.Windows.Forms;

namespace test
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }
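        // Packs each pair of bytes into one UTF-16 code unit (low byte first).
        // This only yields readable text when the file really is UTF-16 encoded.
        public static string StringFromBytes(byte[] arr)
        {
            char[] ch = new char[arr.Length / 2];
            for (int i = 0; i < ch.Length; ++i)
            {
                ch[i] = (char)((int)arr[i * 2] + (((int)arr[i * 2 + 1]) << 8));
            }
            return new String(ch);
        }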

        private void button1_Click(object sender, EventArgs e)
        {
            ArrayList fileStatistics = new ArrayList();
            String datasetPath = @"D:\Data Sets\Enron";
            DirectoryInfo d = new DirectoryInfo(datasetPath);
            FileInfo[] files = d.GetFiles("*.pdf");
            MessageBox.Show(files.Length.ToString());

            foreach (FileInfo file in files)
            {
                // create instance of data class
                fileAtt f = new fileAtt();

                f.fFullName = file.FullName;
                f.fName = file.Name;
                f.FileSize = file.Length;
                f.fExtension = file.Extension;
                byte[] bytes = File.ReadAllBytes(file.FullName);
                f.content = Form1.StringFromBytes(bytes);
                //f.content = Encoding.ASCII.GetString(bytes);
                f.lastaccesstime = file.LastAccessTime;
                fileStatistics.Add(f);
                //   StreamReader r = new StreamReader(datasetPath);
                //foreach
            }
            gvStatistics.DataSource = fileStatistics;
        }
    }
}
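
The garbled output is probably a direct consequence of how StringFromBytes builds characters. The sketch below reuses the same byte-pairing logic on the ASCII header "%PDF-1.4" (an assumed example header; the version digits vary from file to file). The class name DecodeDemo is hypothetical and exists only for this illustration.

```csharp
using System;
using System.Text;

public static class DecodeDemo
{
    // Same byte-pairing logic as StringFromBytes in the listing above.
    public static string StringFromBytes(byte[] arr)
    {
        char[] ch = new char[arr.Length / 2];
        for (int i = 0; i < ch.Length; ++i)
        {
            ch[i] = (char)(arr[i * 2] + (arr[i * 2 + 1] << 8));
        }
        return new string(ch);
    }

    public static void Main()
    {
        // Assumed sample input: the ASCII header of a PDF 1.4 file.
        byte[] header = Encoding.ASCII.GetBytes("%PDF-1.4");

        string paired = StringFromBytes(header);

        // '%' (0x25) and 'P' (0x50) are merged into one code unit, 0x5025,
        // which falls in the CJK range instead of showing "%P".
        Console.WriteLine(((int)paired[0]).ToString("X4")); // prints "5025"

        // For this input the result equals a UTF-16 little-endian decode.
        Console.WriteLine(paired == Encoding.Unicode.GetString(header)); // prints "True"
    }
}
```

Eight ASCII bytes collapse into four unrelated characters, which is the same kind of output shown in the question. The rest of a PDF file is mostly binary and compressed data, so no choice of text encoding applied to the raw bytes is likely to produce readable text.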

fileAtt is the property class:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace test
{
    class fileAtt
    {
        public long FileSize { get; set; }
        public string fName { get; set; }
        public string fFullName { get; set; }
        public string fExtension { get; set; }

        public string content { get; set; }

        public DateTime lastaccesstime { get; set; }
    }
}

I want to read the content of PDFs correctly, i.e. in a form the user can understand. That is my requirement. I would like a solution based on the code above.

Please help me.

Thank you.
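
One way to keep the loop above from filling the grid with unreadable text is to check for the "%PDF-" signature before decoding anything. The following is only a sketch under stated assumptions: PdfSniffer, LooksLikePdf and ReadTextOrPlaceholder are hypothetical names, and plain text files are assumed to be UTF-8. It does not extract text from a PDF; that step would need a PDF parsing library.

```csharp
using System;
using System.Text;

public static class PdfSniffer
{
    // Hypothetical helper: true when the data starts with the "%PDF-" signature.
    public static bool LooksLikePdf(byte[] bytes)
    {
        byte[] signature = Encoding.ASCII.GetBytes("%PDF-");
        if (bytes == null || bytes.Length < signature.Length)
        {
            return false;
        }
        for (int i = 0; i < signature.Length; i++)
        {
            if (bytes[i] != signature[i])
            {
                return false;
            }
        }
        return true;
    }

    // Hypothetical helper: decodes plain text (assumed UTF-8) and returns a
    // placeholder for PDF data instead of treating it as text.
    public static string ReadTextOrPlaceholder(byte[] bytes)
    {
        if (LooksLikePdf(bytes))
        {
            return "[PDF data: a PDF parser is needed to extract the text]";
        }
        return Encoding.UTF8.GetString(bytes);
    }

    public static void Main()
    {
        byte[] pdfLike = Encoding.ASCII.GetBytes("%PDF-1.4 binary data follows");
        byte[] plain = Encoding.UTF8.GetBytes("hello");

        Console.WriteLine(LooksLikePdf(pdfLike));         // prints "True"
        Console.WriteLine(ReadTextOrPlaceholder(plain));  // prints "hello"
    }
}
```

In the loop from the question, the assignment to f.content could call ReadTextOrPlaceholder(bytes) in place of StringFromBytes(bytes), with the placeholder branch later replaced by a call into whichever PDF library is chosen.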
