Get text with PHP Simple HTML DOM Parser


Question

I'm using PHP Simple HTML DOM Parser to get text from a web page. The page I need to process looks something like this:

<html>
<head>
<title>title</title>
</head>
<body>
<div id="content">
<h1>HELLO</h1>
Hello, world!
</div>
</body>
</html>

I need to get the h1 element and the text that is not wrapped in any tag. To get the h1, I use this code:

$html = file_get_html("remote_page.html");
foreach ($html->find('#content') as $text) {
    echo "H1: " . $text->find('h1', 0)->plaintext;
}

But what about the other text? I also tried this inside the foreach, but I get the full text:

$text->plaintext;

but that also returned the text of the H1 tag...

Recommended answer

Use strip_tags, as @Peachy pointed out. However, passing <br> as its second argument means that <br> tags are left in the string instead of being stripped, which is unnecessary here. In your case,

<?php
    // strip_tags() returns the stripped string, so echo or assign the result.
    echo strip_tags($text);
?>

would work as you'd like, given that you are only selecting the content inside the element with the content id.
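One caveat: strip_tags removes the tags themselves but keeps the text between them, so the heading text is still part of the result. If the goal is to read the h1 and the untagged text separately, one possible approach is to collect only the text nodes that are direct children of the content div. The sketch below is an illustration, not the answer's method: it swaps in PHP's built-in DOM extension (DOMDocument and DOMXPath) in place of Simple HTML DOM, inlines the sample page so it runs without a remote file, and uses variable names ($markup, $fragment, $heading, $untagged) that are made up for the example.

```php
<?php
// Sample page from the question, inlined (assumption: the real page has the same shape).
$markup = <<<HTML
<html>
<head>
<title>title</title>
</head>
<body>
<div id="content">
<h1>HELLO</h1>
Hello, world!
</div>
</body>
</html>
HTML;

// strip_tags() drops the tags but keeps the text inside them.
$fragment = "<h1>HELLO</h1> Hello,<br>world!";
$noTags = strip_tags($fragment);          // "HELLO Hello,world!"
$keepBr = strip_tags($fragment, '<br>');  // "HELLO Hello,<br>world!"

// Parse the page with the built-in DOM extension (not Simple HTML DOM).
$doc = new DOMDocument();
$doc->loadHTML($markup);
$xpath = new DOMXPath($doc);
$content = $xpath->query('//*[@id="content"]')->item(0);

$heading = '';
$untagged = '';
foreach ($content->childNodes as $child) {
    if ($child->nodeType === XML_ELEMENT_NODE && $child->nodeName === 'h1') {
        // Text of the heading element.
        $heading = trim($child->textContent);
    } elseif ($child->nodeType === XML_TEXT_NODE) {
        // Text sitting directly inside the div, outside any child tag.
        $untagged .= $child->nodeValue;
    }
}
$untagged = trim($untagged);

echo "H1: ", $heading, "\n";
echo "Text: ", $untagged, "\n";
```

With the sample page this prints "H1: HELLO" followed by "Text: Hello, world!". Looping over direct child nodes, instead of taking the whole div's text, is what keeps the heading out of the second value.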
