How to extract an IP address from an HTML string?

Posted: 2021/6/25 19:39:34 | Tags: python, regex, string

I want to extract an IP address from a string (actually a single line of HTML) using Python.

>>> s = "<html><head><title>Current IP Check</title></head><body>Current IP Address: 165.91.15.131</body></html>"

-- '165.91.15.131' is what I want!

I tried using regular expressions, but so far I can only get to the first number.

>>> import re
>>> ip = re.findall( r'([0-9]+)(?:\.[0-9]+){3}', s )
>>> ip
['165']

I don't have a firm grasp of regular expressions, though; I found the code above elsewhere on the web and modified it.

Solution

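When a pattern contains a capturing group, re.findall returns the text of that group instead of the whole match, which is why you only get the first number. Remove your capturing group: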

ip = re.findall( r'[0-9]+(?:\.[0-9]+){3}', s )

Result:

['165.91.15.131']
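
To make the difference concrete, here is a minimal sketch comparing the two patterns on the string from the question. The variable names (with_group, without_group, via_finditer) are only illustrative, and the re.finditer variant is an alternative not mentioned in the answer: it reads the whole match from the match object, so it works even if the capturing group is kept.

```python
import re

s = ("<html><head><title>Current IP Check</title></head>"
     "<body>Current IP Address: 165.91.15.131</body></html>")

# With one capturing group, findall returns only that group's text.
with_group = re.findall(r'([0-9]+)(?:\.[0-9]+){3}', s)

# With no capturing groups, findall returns each whole match.
without_group = re.findall(r'[0-9]+(?:\.[0-9]+){3}', s)

# finditer yields match objects; group(0) is always the whole match,
# whether or not the pattern contains capturing groups.
via_finditer = [m.group(0)
                for m in re.finditer(r'([0-9]+)(?:\.[0-9]+){3}', s)]

print(with_group)     # ['165']
print(without_group)  # ['165.91.15.131']
print(via_finditer)   # ['165.91.15.131']
```

The (?:...) group is non-capturing, so it does not change what findall returns; only the parenthesized ([0-9]+) does.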

Notes:

If you are parsing HTML, it might be a good idea to look at BeautifulSoup.
Your regular expression also matches some invalid IP addresses, such as 0.00.999.9999. This isn't necessarily a problem, but you should be aware of it and possibly handle this situation. You could change each + to {1,3} for a partial fix without making the regular expression overly complex.
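
One way to handle the invalid-address case is sketched below. It is not part of the original answer: the helper name extract_ipv4 and the lookarounds that stop a match from starting or ending in the middle of a longer run of digits are assumptions added for illustration. The {1,3} quantifier only limits the number of digits, so the range check on each number is left to the standard-library ipaddress module instead of being encoded in the regex.

```python
import ipaddress
import re

# Candidate pattern: four dot-separated groups of 1 to 3 digits.
# The lookbehind and lookahead reject matches that would begin or end
# inside a longer run of digits (an assumption of this sketch).
CANDIDATE = re.compile(r'(?<![0-9])[0-9]{1,3}(?:\.[0-9]{1,3}){3}(?![0-9])')


def extract_ipv4(text):
    """Return the candidates in text that ipaddress accepts as IPv4."""
    valid = []
    for candidate in CANDIDATE.findall(text):
        try:
            # Raises a ValueError subclass for out-of-range groups such as 999.
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue
        valid.append(candidate)
    return valid


s = ("<html><head><title>Current IP Check</title></head>"
     "<body>Current IP Address: 165.91.15.131</body></html>")

print(extract_ipv4(s))                          # ['165.91.15.131']
print(extract_ipv4("bad 999.1.1.1 ok 10.0.0.1"))  # ['10.0.0.1']
```

Keeping the regex loose and validating afterwards keeps the pattern readable; a regex that enforces the 0 to 255 range by itself is possible but considerably longer.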
