Count the Number of Words in txt, doc, and PDF Files


Problem description

Hello Group Members,
When I upload a document by browsing to it and clicking the upload button, I have to get the total number of words, whichever of these three document types it is (.txt file, Word document, or PDF). The code I have written shows the correct number of words for a text file, but if I upload a Word document or a PDF it shows the wrong count. The code is below. I hope someone will respond.

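int countWords = 0;
string line;

try
{
    using (StreamReader str = new StreamReader(FileUpload1.PostedFile.InputStream))
    {
        while ((line = str.ReadLine()) != null)
        {
            // Split on any whitespace and drop empty entries so that blank
            // lines and repeated spaces are not counted as words.
            string[] sttr = line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
            countWords += sttr.Length;
        }
        Response.Write("<script>alert(" + countWords + ")</script>");
    }
}
catch (Exception ee)
{
    // The message has to be emitted as a quoted, escaped JavaScript string.
    Response.Write("<script>alert(" + HttpUtility.JavaScriptStringEncode(ee.Message, true) + ")</script>");
}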

Recommended answer

The simple reason is that Word and PDF files do not just contain text the way a .txt file does. They contain a lot of other information, and Word documents may contain little or no readable text, depending on the version of Word that created them.

If you need the word count for a document created with a specific application, then you either need to open it with that application, or write or find an appropriate reader add-in that can do it for you. Just opening the file as text and reading lines will not do the job at all, I'm afraid.
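As a rough illustration of what such a reader could look like for the newer .docx format only: a .docx file is a ZIP archive whose main text lives in word/document.xml, so it can be read with the standard System.IO.Compression and System.Xml.Linq classes. The class name DocxWordCounter and the method CountWords below are made up for this sketch. It assumes the upload really is a .docx (not an old binary .doc or a PDF, which need a dedicated library), it ignores headers, footers, and footnotes, it treats tabs and line breaks inside a paragraph as nothing, and text inside text boxes may be counted twice, so the result will not always match Word's own count.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration; not part of any library.
public static class DocxWordCounter
{
    // WordprocessingML main namespace used by word/document.xml.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Counts whitespace-separated words in the main body of a .docx stream.
    public static int CountWords(Stream docxStream)
    {
        using (var zip = new ZipArchive(docxStream, ZipArchiveMode.Read, true))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml is missing; not a .docx file.");

            using (Stream xml = entry.Open())
            {
                XDocument doc = XDocument.Load(xml);

                // Join the text runs (w:t) of each paragraph (w:p) first, because
                // Word may split a single word across several runs.
                return doc.Descendants(W + "p")
                    .Select(p => string.Concat(p.Descendants(W + "t").Select(t => t.Value)))
                    .Sum(text => text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries).Length);
            }
        }
    }
}
```

In the upload handler this could be called as DocxWordCounter.CountWords(FileUpload1.PostedFile.InputStream) after checking that the file name ends in .docx.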


To save an uploaded file to a default location, you have to specify the file path explicitly, as below.

FileUpload.SaveAs(filepath);



// [filepath holds your default file path]

To count the number of words, refer to the following code; it may help. The code is in C#.

string all;
using (StreamReader sr = new StreamReader(filepath))
{
    all = sr.ReadToEnd();
}

char[] chararray = { ' ', ',', '\n', '\r', '\t', '\f' };
// [put every character that can separate two words into this char array]

// RemoveEmptyEntries keeps consecutive separators from being counted as empty words.
String[] strarray = all.Split(chararray, StringSplitOptions.RemoveEmptyEntries);
int wordcount = strarray.Length;



wordcount now contains the number of words.

But to my knowledge, Word, Excel, PDF, and PowerPoint documents contain extra characters, delimiters, and other special characters that depend on their format, so this approach will not give a correct count for them.

Plain .txt files can be read easily using the code above. .rtf files are text-based as well, but they also contain formatting control words, which this code would count as words.
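One detail worth showing separately is why splitting on a single space tends to overcount even for plain text: every extra space or trailing separator produces an empty string that still counts as an array element. The helper below is only a sketch (the name WordCounter is made up here) and it treats any run of non-whitespace characters as one word.

```csharp
using System;

// Hypothetical helper for illustration; not part of any library.
public static class WordCounter
{
    // Counts runs of non-whitespace characters. Passing a null separator
    // array makes Split use all whitespace characters as delimiters, and
    // RemoveEmptyEntries drops the empty strings between adjacent delimiters.
    public static int CountWords(string text)
    {
        if (string.IsNullOrEmpty(text))
            return 0;

        return text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries).Length;
    }
}
```

For the input "one  two three " (two spaces after "one" and a trailing space), "one  two three ".Split(' ').Length is 5, while WordCounter.CountWords returns 3.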

