Searching (extracting text) PDF files with Algolia

Views: 39 Published: 2021/11/26 23:46:31 Tags: php search algolia

Question

This is just a speculative idea for a client who has a lot of PDF files.

Algolia say in their FAQs that to search PDF files you first need to extract the text from the file. How would you go about this?

The way I envisage the system working would be:

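1. The client uploads a PDF via the CMS
2. The CMS calls some service / program to extract the text
3. Algolia indexes the extracted text and somehow links it to the original PDF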
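
The "link" in the last step can be nothing more than an attribute on each search record that stores the PDF's URL. The sketch below is written in Python for brevity (the same shape would carry over to PHP) and assumes the text has already been extracted. It splits the text into chunks, since Algolia limits the size of a single record, and builds one record per chunk. `objectID` is Algolia's record identifier; `pdf_url`, `title`, `chunk`, `content`, and the 2000-character limit are hypothetical choices made for illustration, not anything the question or Algolia prescribes.

```python
import hashlib
from typing import Dict, List


def split_into_chunks(text: str, max_chars: int = 2000) -> List[str]:
    """Split text into paragraph-aligned chunks of at most max_chars characters."""
    chunks: List[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = " ".join(paragraph.split())  # collapse runs of whitespace
        # A paragraph longer than the limit is cut into fixed-size pieces.
        while len(paragraph) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(paragraph[:max_chars])
            paragraph = paragraph[max_chars:]
        candidate = f"{current} {paragraph}".strip()
        if len(candidate) > max_chars:
            # Adding this paragraph would overflow: close the current chunk.
            chunks.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def build_records(pdf_url: str, title: str, text: str,
                  max_chars: int = 2000) -> List[Dict]:
    """Build one search record per chunk, each pointing back at the PDF."""
    # Stable per-document prefix so re-indexing overwrites the same records.
    doc_id = hashlib.sha1(pdf_url.encode("utf-8")).hexdigest()[:12]
    return [
        {
            "objectID": f"{doc_id}-{i}",  # Algolia's unique record key
            "title": title,               # hypothetical attribute
            "pdf_url": pdf_url,           # hypothetical attribute: the "link"
            "chunk": i,                   # hypothetical attribute
            "content": chunk,             # the searchable text
        }
        for i, chunk in enumerate(split_into_chunks(text, max_chars))
    ]
```

The resulting list would then be sent to an index with whichever Algolia API client the CMS uses. Because one PDF produces several records, a search can return the same file more than once; Algolia's `distinct` setting together with `attributeForDistinct` (here it would be `pdf_url`) is one way to collapse those hits.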

It would need to be an automated system as the client shouldn't have to tell it to index. It would be built in PHP, probably Laravel running on Ubuntu.

What software / service could do the text extraction from the PDFs and is any magic needed to 'link' this with the PDF file?
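
One option for the extraction step on Ubuntu is the `pdftotext` command-line tool from the `poppler-utils` package, which the CMS can call as a subprocess right after an upload. The sketch below is in Python for brevity and assumes `pdftotext` is installed and on the `PATH`; the function names and the 60-second timeout are hypothetical. From PHP the same command could be run through a process wrapper.

```python
import subprocess
from typing import List


def pdftotext_command(pdf_path: str) -> List[str]:
    """Build the pdftotext invocation for one file."""
    # "-" as the output file name makes pdftotext write the text to stdout.
    return ["pdftotext", "-enc", "UTF-8", pdf_path, "-"]


def extract_text(pdf_path: str, timeout: int = 60) -> str:
    """Run pdftotext and return the extracted text.

    Raises subprocess.CalledProcessError if pdftotext exits with an error
    and subprocess.TimeoutExpired if it runs longer than `timeout` seconds.
    """
    result = subprocess.run(
        pdftotext_command(pdf_path),
        capture_output=True,
        check=True,
        timeout=timeout,
    )
    return result.stdout.decode("utf-8", errors="replace")
```

Passing the arguments as a list rather than a shell string avoids quoting problems with uploaded file names. A scanned PDF with no text layer would yield little or no text this way and would need an OCR step first.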

I'm also happy to have suggestions on other search services which may handle this.
