Extract string from HTML String

Views: 337 Published: 2020/11/24 3:03:32 Tags: python html string


Problem description

I want to extract a number from an HTML string (I usually do not know the number in advance).

The crucial part looks like this:

<test test="3" test="search_summary_figure WHR WVM">TOTAL : 286</test>
<tagend>

I want to extract the "286", with something like: start after "L :" and stop before "<". How can I do this? Thank you very much in advance.

Recommended answer

If the string "TOTAL : number" is unique, use a regular expression to first find this substring and then extract the number from it.

import re

string = '<test test="3" test="search_summary_figure WHR WVM">TOTAL : 286</test>'

# Step 1: find the substring TOTAL<whitespace>:<whitespace><number>
reg__expr = r'TOTAL\s:\s\d+'
result = re.findall(reg__expr, string)
if result:
    substring = result[0]

    # Step 2: extract the number from that substring
    reg__expr = r'\d+'  # <number>
    result = re.findall(reg__expr, substring)
    number = int(result[0])

    print(number)
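
The two searches can also be folded into a single pattern with a capture group. The following is a minimal sketch rather than part of the original answer: the helper name `extract_total` and the constant `TOTAL_PATTERN` are hypothetical, and it assumes the label is always `TOTAL` followed by a colon, with any amount of whitespace (including none) around the colon.

```python
import re

# Hypothetical names, introduced only for this sketch.
# \s* tolerates zero or more whitespace characters around the colon;
# (\d+) captures the digits so no second search is needed.
TOTAL_PATTERN = re.compile(r'TOTAL\s*:\s*(\d+)')


def extract_total(html):
    """Return the integer that follows 'TOTAL :', or None if it is absent."""
    match = TOTAL_PATTERN.search(html)
    return int(match.group(1)) if match else None


html = '<test test="3" test="search_summary_figure WHR WVM">TOTAL : 286</test>'
print(extract_total(html))
```

Returning None when nothing matches lets the caller decide how to handle a page where the label is missing, instead of failing with an IndexError.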

You can test your own regular expressions here: https://regex101.com/
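
The "start after one marker, stop before the next" idea from the question can also be written without regular expressions. This is only a sketch under assumptions: the helper name `extract_between` is hypothetical, and it uses the longer marker "TOTAL :" instead of "L :" to reduce the chance of matching some other part of the page.

```python
def extract_between(text, start_marker, end_marker):
    """Return the stripped text between the first start_marker and the
    next end_marker after it, or None if either marker is missing."""
    start = text.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)          # move past the start marker
    end = text.find(end_marker, start)  # look for the end marker after it
    if end == -1:
        return None
    return text[start:end].strip()


html = '<test test="3" test="search_summary_figure WHR WVM">TOTAL : 286</test>'
value = extract_between(html, 'TOTAL :', '<')
if value is not None:
    print(int(value))
```

Unlike the regular expression, this approach does not check that the extracted text is numeric, so `int()` raises a ValueError if something else sits between the markers.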
