php: Get plain text from html - simplehtmldom or php strip_tags?


Question


I am looking to extract the plain text from HTML. Which one should I choose: PHP's strip_tags or simplehtmldom's plaintext extraction?

One advantage of simplehtmldom is that it supports invalid HTML. Is that sufficient reason in itself?

Answer

You should probably use simplehtmldom, both for the reason you mentioned and because strip_tags removes only the tags themselves, so it may leave you with non-text content such as the JavaScript or CSS contained within script/style blocks.
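To make the script/style point concrete, here is a minimal sketch. The sample markup and the `html_to_text()` helper are made up for illustration, and the parser-based half uses PHP's bundled DOM extension as a stand-in for simplehtmldom, not simplehtmldom itself. It assumes ASCII input; other encodings would need extra handling with `loadHTML()`.

```php
<?php
// Hypothetical sample input: one paragraph plus inline CSS and JavaScript.
$html = '<style>p { color: red; }</style>'
      . '<p>Hello <b>world</b></p>'
      . '<script>alert("hi");</script>';

// strip_tags() removes the tags but keeps everything between them,
// so the CSS rule and the JavaScript call end up in the "plain text".
$naive = strip_tags($html);

// Stand-in for a parser-based approach, using the DOM extension.
function html_to_text(string $html): string
{
    $doc = new DOMDocument();
    // Collect parse warnings internally instead of emitting them,
    // since real-world markup is often invalid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Drop script and style elements before reading the text.
    // The node list is copied first so removal does not disturb iteration.
    $xpath = new DOMXPath($doc);
    foreach (iterator_to_array($xpath->query('//script | //style')) as $node) {
        $node->parentNode->removeChild($node);
    }

    return trim($doc->documentElement->textContent);
}

echo $naive, "\n";
echo html_to_text($html), "\n";
```

The first line printed still contains the CSS rule and the `alert` call, while the second contains only the paragraph text.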

You would also be able to filter out text from elements that aren't displayed (for example, those with an inline style of display:none).
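Filtering hidden elements could look roughly like the sketch below. Again this uses PHP's bundled DOM extension rather than simplehtmldom, and `visible_text()` is a hypothetical helper name. It only catches inline styles written in lowercase, so elements hidden through a stylesheet or class are not detected.

```php
<?php
function visible_text(string $html): string
{
    $doc = new DOMDocument();
    // Keep warnings about invalid markup out of the output.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // translate() strips spaces from the style attribute, so both
    // "display:none" and "display: none" match.
    $query = '//*[contains(translate(@style, " ", ""), "display:none")]';
    foreach (iterator_to_array($xpath->query($query)) as $node) {
        if ($node->parentNode !== null) {
            $node->parentNode->removeChild($node);
        }
    }

    return trim($doc->documentElement->textContent);
}

// Hypothetical sample input with one hidden span.
$sample = '<div>Visible <span style="display: none">hidden </span>text</div>';
echo visible_text($sample), "\n";
```

With plain strip_tags the word "hidden" would remain in the result, because strip_tags has no notion of which elements are displayed.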

That said, if the HTML is simple enough, strip_tags may be faster and will accomplish the same task.
