Searching (extracting text) PDF files with Algolia

Views: 146 | Published: 2020/8/22 18:52:12 | Tags: php, search, algolia

Question

This is just a speculative idea for a client who has a lot of PDF files.

Algolia say in their FAQs that to search PDF files you first need to extract the text from the file. How would you go about this?

The way I envisage the system working would be:

Client uploads PDF via CMS
CMS calls some service / program to extract the text
Algolia indexes the extracted text, and each indexed record is somehow linked back to the original PDF
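
The extraction step above could be sketched roughly as follows. This is only an illustration under stated assumptions: it is written in Python for brevity (the same shape would carry over to a PHP/Laravel job), it assumes the `pdftotext` command-line tool from poppler-utils is installed on the server, and the function names `extract_pages` and `split_pages` are hypothetical. It relies on `pdftotext` writing a form feed character at each page break, so the text can be split back into pages.

```python
import subprocess


def split_pages(raw_text: str) -> list[str]:
    """Split pdftotext output into pages using the form feeds it emits."""
    pages = [page.strip() for page in raw_text.split("\f")]
    # The last page also ends with a form feed, which leaves an empty tail.
    if pages and pages[-1] == "":
        pages.pop()
    return pages


def extract_pages(pdf_path: str) -> list[str]:
    """Run pdftotext on one file and return one string per page.

    Assumes pdftotext (poppler-utils) is on PATH; raises if it fails.
    """
    result = subprocess.run(
        # "-" as the output file sends the extracted text to stdout.
        ["pdftotext", "-layout", pdf_path, "-"],
        check=True,
        capture_output=True,
        text=True,
    )
    return split_pages(result.stdout)
```

In this sketch the CMS upload handler would queue a background job that calls `extract_pages` on the stored file, so the client never has to trigger indexing by hand. Scanned PDFs that contain only images would come back empty and would need OCR instead, which is not covered here.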

It would need to be an automated system as the client shouldn't have to tell it to index. It would be built in PHP, probably Laravel running on Ubuntu.

What software / service could do the text extraction from the PDFs and is any magic needed to 'link' this with the PDF file?
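
On the 'link' part, one possible approach needs no magic: each indexed record simply carries the URL of the PDF it came from. The sketch below is in Python for illustration and is an assumption, not a documented Algolia recipe. `objectID` is Algolia's record identifier; the attribute names `pdf_url`, `page` and `content`, the function name `build_records`, and the `max_chars` chunk size are all hypothetical. Long pages are split into chunks because Algolia limits record size (the exact limit depends on the plan).

```python
import hashlib


def build_records(pdf_url: str, pages: list[str], max_chars: int = 2000) -> list[dict]:
    """Turn per-page text into search records that point back at the PDF."""
    records = []
    for page_number, text in enumerate(pages, start=1):
        # Empty pages produce no chunks and are skipped.
        for offset in range(0, len(text), max_chars):
            # Deterministic ID, so re-indexing the same PDF overwrites
            # its old records instead of creating duplicates.
            key = f"{pdf_url}#{page_number}#{offset}"
            records.append({
                "objectID": hashlib.sha1(key.encode("utf-8")).hexdigest(),
                "pdf_url": pdf_url,    # the link back to the original file
                "page": page_number,   # lets the UI show where the hit is
                "content": text[offset:offset + max_chars],
            })
    return records
```

The resulting list would then be sent to the index with the Algolia API client's batch save call. A search hit returns the whole record, so the front end can read `pdf_url` (and `page`) from the hit and render a link to the document.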

I'm also happy to have suggestions on other search services which may handle this.
