How can I extract structured text from an HTML list in PHP?


Problem description

I have this string:

<ul>
  <li id="1">Page 1</li>
  <li id="2">Page 2
    <ul>
      <li id="3">Sub Page A</li>
      <li id="4">Sub Page B</li>
      <li id="5">Sub Page C
        <ul>
          <li id="6">Sub Sub Page I</li>
        </ul>
      </li>
    </ul>
  </li>
  <li id="7">Page 3
    <ul>
      <li id="8">Sub Page D</li>
    </ul>
  </li>
  <li id="9">Page 4</li>
</ul>

and I want to break out every piece of information with PHP and arrange it like this:

----------------------------------
| ID | ORDER | PARENT | CHILDREN |
----------------------------------
|  1 |   1   |   0    |    0     |
|  2 |   2   |   0    |  3,4,5   |
|  3 |   1   |   2    |    0     |
|  4 |   2   |   2    |    0     |
|  5 |   3   |   2    |    6     |
|  6 |   1   |   5    |    0     |
|  7 |   3   |   0    |    8     |
|  8 |   1   |   7    |    0     |
|  9 |   4   |   0    |    0     |
----------------------------------

For extra information, this is what the list means to me:

ID 1 is 1st (Page 1) and has no parent and no children,

ID 2 is 2nd (Page 2) and has no parent and children IDs 3,4,5,

ID 3 is 1st (Sub Page A) and has parent ID 2 and no children,

ID 4 is 2nd (Sub Page B) and has parent ID 2 and no children,

ID 5 is 3rd (Sub Page C) and has parent ID 2 and child ID 6,

ID 6 is 1st (Sub Sub Page I) and has parent ID 5 and no children,

ID 7 is 3rd (Page 3) and has no parent and child ID 8,

ID 8 is 1st (Sub Page D) and has parent ID 7 and no children,

ID 9 is 4th (Page 4) and has no parent and no children.

If this is too tough, can anyone suggest another method for getting that information out of this string?

Recommended answer

That's not "a string", it's HTML. You need to use an HTML parser like DOMDocument or simple_html_dom.

http://htmlparsing.com/php.html
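
The parser approach recommended above can be sketched with DOMDocument roughly as follows. This is a minimal illustration, not code from the original answer: the function name listToRows and the row layout are hypothetical, and it assumes every <li> carries a numeric id attribute and that 0 stands for "no parent", as in the table in the question.

```php
<?php
// Hypothetical helper (not from the original answer): walks a nested <ul>
// and collects one row per <li> with its id, order, parent and children.
function listToRows(DOMElement $ul, int $parentId = 0, array &$rows = []): array
{
    $order = 0;
    foreach ($ul->childNodes as $li) {
        // Skip whitespace text nodes and anything that is not an <li>.
        if (!($li instanceof DOMElement) || $li->nodeName !== 'li') {
            continue;
        }
        $order++;
        $id = (int) $li->getAttribute('id');
        $index = count($rows);
        $rows[] = ['id' => $id, 'order' => $order, 'parent' => $parentId, 'children' => []];

        foreach ($li->childNodes as $child) {
            if (!($child instanceof DOMElement) || $child->nodeName !== 'ul') {
                continue;
            }
            // Record the direct children of this <li> ...
            foreach ($child->childNodes as $sub) {
                if ($sub instanceof DOMElement && $sub->nodeName === 'li') {
                    $rows[$index]['children'][] = (int) $sub->getAttribute('id');
                }
            }
            // ... then descend so each child gets a row of its own.
            listToRows($child, $id, $rows);
        }
    }
    return $rows;
}

$html = <<<'HTML'
<ul>
  <li id="1">Page 1</li>
  <li id="2">Page 2
    <ul>
      <li id="3">Sub Page A</li>
      <li id="4">Sub Page B</li>
      <li id="5">Sub Page C
        <ul>
          <li id="6">Sub Sub Page I</li>
        </ul>
      </li>
    </ul>
  </li>
  <li id="7">Page 3
    <ul>
      <li id="8">Sub Page D</li>
    </ul>
  </li>
  <li id="9">Page 4</li>
</ul>
HTML;

libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($html);

// The first <ul> in document order is the outermost list.
$root = $doc->getElementsByTagName('ul')->item(0);
$rows = listToRows($root);

foreach ($rows as $row) {
    // Print 0 for "no children" to match the table in the question.
    $children = $row['children'] ? implode(',', $row['children']) : '0';
    printf("%d | %d | %d | %s\n", $row['id'], $row['order'], $row['parent'], $children);
}
```

Rows come out in document order, so for the list in the question they should line up with the table the asker wants. Only direct <li> children of each <ul> are counted, which is what keeps ORDER relative to the siblings rather than to the whole document.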
