Regexp for extracting all links and anchor texts from HTML

Views: 109 | Published: 2020/5/27 2:41:33 | Tags: php regex string html-parsing

Problem description

I'd like one or more regexes that can:

1) Take the HTML of a large page.

2) Find the URLs contained in all links, for example:

<a href="http://example1.com">Test 1</a>
<a class="foo" id="bar" href="http://example2.com">Test 2</a>
<a onclick="foo();" id="bar" href="http://example3.com">Test 3</a>

And so on, regardless of which other attributes come before the href.

3) Extract the anchor text of all links. For the examples above, it should return 'http://example1.com' with the anchor text 'Test 1', then 'http://example2.com' with 'Test 2', and so on.
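
The three requirements above can be sketched with a single pattern. This is a minimal illustration under stated assumptions, not a definitive answer: it assumes every href value is quoted and that anchor text contains no nested `<a>` tags. The names `LINK_RE`, `extract_links`, and `sample` are hypothetical. Python is used here only for illustration; the pattern avoids Python-specific syntax, so it should carry over to PHP's `preg_match_all` with the `i` and `s` modifiers.

```python
import re

# Hypothetical pattern (not from the original question).
# Group 1: the quote character, group 2: the URL, group 3: the anchor text.
LINK_RE = re.compile(
    r"""<a\s+(?:[^>]*?\s)?href\s*=\s*(["'])(.*?)\1[^>]*>(.*?)</a>""",
    re.IGNORECASE | re.DOTALL,
)


def extract_links(html):
    """Return a list of (url, anchor_text) pairs found in html."""
    return [(m.group(2), m.group(3).strip()) for m in LINK_RE.finditer(html)]


sample = """
<a href="http://example1.com">Test 1</a>
<a class="foo" id="bar" href="http://example2.com">Test 2</a>
<a onclick="foo();" id="bar" href="http://example3.com">Test 3</a>
"""

for url, text in extract_links(sample):
    print(url, text)
```

The optional group `(?:[^>]*?\s)?` is what lets the href appear after any number of other attributes, and `[^>]` keeps the match from spilling past the end of the opening tag. If the anchor text itself contains markup (for example `<b>Test</b>`), this sketch returns it with the tags intact. For malformed or deeply nested HTML, a real HTML parser is likely to be more robust than any regex.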
