Is it possible to create a website with a search engine that only searches the PDFs I upload?

Views: 122 Posted: 2019/6/13 4:12:04 Tag: OCR


Question

I am trying to create the user interface for an educational video library. The videos are hosted elsewhere, and I want to build a user-friendly site with an extensive search engine that covers only the content of the videos. At the moment I manually tag each video link with 20-30 keywords. I am hoping that if I can figure out how to use the PDF transcript of each video as searchable text, the tagging will become automatic and the search results will improve. I know there are many OCR websites out there, but I haven't found any personal sites with custom OCR search engines. Is this possible?

Recommended answer

OCR? Sounds like you need iTextSharp. Check out their SourceForge page and read up on how to use it. Here's a simple snippet to get you started with extracting text from a PDF file:

itextsharp read pdf file[^]

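using System.IO;
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// The enclosing class was lost from the original snippet; "PdfParser" is a placeholder name.
public class PdfParser
{
  // Extracts the text layer of every page. Scanned, image-only PDFs have no
  // text layer and would still need OCR.
  public string ParsePdf(string fileName)
  {
    if (!File.Exists(fileName))
      throw new FileNotFoundException("PDF file not found.", fileName);

    using (PdfReader reader = new PdfReader(fileName))
    {
      StringBuilder sb = new StringBuilder();

      // iTextSharp page numbers are 1-based.
      for (int page = 1; page <= reader.NumberOfPages; page++)
      {
        // Use a fresh strategy per page: SimpleTextExtractionStrategy
        // accumulates text, so a reused instance repeats earlier pages.
        ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
        string text = PdfTextExtractor.GetTextFromPage(reader, page, strategy);
        if (!string.IsNullOrWhiteSpace(text))
        {
          // .NET strings are already Unicode, so no re-encoding is needed.
          // AppendLine keeps words from fusing across page boundaries.
          sb.AppendLine(text);
        }
      }

      return sb.ToString();
    }
  }
}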
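
Once the text of each transcript has been extracted, the "search only my PDFs" part can be as small as an inverted index that maps each word to the videos whose transcripts contain it. The following is only a minimal sketch of that idea, not something from the original answer and not part of iTextSharp: the class name TranscriptIndex, its Add and Search methods, and the video IDs are all hypothetical, and the index lives in memory with no ranking, stemming, or persistence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of iTextSharp): a tiny in-memory inverted index.
public class TranscriptIndex
{
    // word -> IDs of the videos whose transcript contains that word.
    // The comparer makes lookups case-insensitive.
    private readonly Dictionary<string, HashSet<string>> _postings =
        new Dictionary<string, HashSet<string>>(StringComparer.OrdinalIgnoreCase);

    // Split text into words on anything that is not a letter or a digit.
    private static IEnumerable<string> Tokenize(string text)
    {
        return Regex.Split(text ?? string.Empty, @"[^\p{L}\p{Nd}]+")
                    .Where(t => t.Length > 0);
    }

    // Index one transcript. videoId is whatever key identifies the video
    // on your site (an assumption of this sketch).
    public void Add(string videoId, string transcriptText)
    {
        foreach (string word in Tokenize(transcriptText))
        {
            HashSet<string> ids;
            if (!_postings.TryGetValue(word, out ids))
            {
                ids = new HashSet<string>();
                _postings[word] = ids;
            }
            ids.Add(videoId);
        }
    }

    // Return the videos whose transcripts contain every word of the query.
    public ISet<string> Search(string query)
    {
        HashSet<string> result = null;
        foreach (string word in Tokenize(query))
        {
            HashSet<string> ids;
            if (!_postings.TryGetValue(word, out ids))
                return new HashSet<string>();   // one unknown word means no match
            if (result == null)
                result = new HashSet<string>(ids);
            else
                result.IntersectWith(ids);
        }
        return result ?? new HashSet<string>();
    }
}
```

In use, each transcript would be added once, for example index.Add("video-001", extractedText) with the string returned by the extraction step, and each search box query would be passed to index.Search. Matching on every query word keeps results narrow; returning the union of the per-word sets instead would give broader, "any word" results.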
