Python Scrapy: Get HTML <script> tag
Question
I have a project in which I need to extract a value from a script embedded in the HTML code.
<script>
(function() {
... / More Code
Level.grade = "2";
Level.level = "1";
Level.max_line = "5";
Level.cozum = 'adım 12\ndön sağ\nadım 13\ndön sol\nadım 11';
... / More Code
</script>
How do I get only "adım 12\ndön sağ\nadım 13\ndön sol\nadım 11" from this code?
Thanks for the help.
Recommended answer
Use a regular expression to do that.
First, grab the content of the SCRIPT tag like this:
response.css("script").extract_first()
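Note that `extract_first()` returns only the first `<script>` element on the page, which may not be the one that holds `Level.cozum`. A minimal sketch of picking the right one, assuming the script bodies have already been collected as a list of strings; `find_script` and the sample list are hypothetical and not part of the original answer.

```python
def find_script(scripts, marker):
    """Return the first script body containing `marker`, or None."""
    for body in scripts:
        if marker in body:
            return body
    return None

# In a Scrapy callback the list would typically come from:
#     scripts = response.css("script::text").getall()
# Here a hard-coded list stands in for it (illustrative data only).
scripts = [
    "window.dataLayer = [];",
    "Level.grade = \"2\";\nLevel.cozum = 'adım 12\\ndön sağ';",
]
target = find_script(scripts, "Level.cozum")
print(target)
```

The selected string can then be passed to the regex step in place of the first script tag.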
Then use this regex:
(Level\.cozum = )(.*?)(\;)
See the demo here: https://regex101.com/r/YxHRmR/1
Here is the code:
import re
regex = r"(Level\.cozum = )(.*?)(\;)"
test_str = ("<script>\n"
" (function() {\n"
" ... / More Code\n"
" Level.grade = \"2\";\n\n"
" Level.level = \"1\";\n\n"
" Level.max_line = \"5\";\n\n"
" Level.cozum = 'adım 12\\ndön sağ\\nadım 13\\ndön sol\\nadım 11'; \n"
"... / More Code\n"
"</script>")
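# findall returns one tuple per match: (prefix, quoted value, semicolon).
# re.MULTILINE is not needed because the pattern uses no ^ or $ anchors.
matches = re.findall(regex, test_str)
# Print only the value, with the surrounding single quotes stripped.
print(matches[0][1].strip("'"))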
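The pattern above captures the prefix, the quoted value and the semicolon as three separate groups. As a hedged variation, a single capture group placed between the quotes returns just the value. The sketch below assumes the value is always wrapped in single quotes and contains no escaped quote; `COZUM_RE`, `extract_cozum` and `sample` are illustrative names, not part of the original answer.

```python
import re

# Capture only the text between the single quotes after "Level.cozum =".
COZUM_RE = re.compile(r"Level\.cozum\s*=\s*'(.*?)'\s*;", re.DOTALL)

def extract_cozum(script_text):
    """Return the raw value assigned to Level.cozum, or None if absent."""
    match = COZUM_RE.search(script_text)
    return match.group(1) if match else None

sample = (
    "Level.max_line = \"5\";\n"
    "Level.cozum = 'adım 12\\ndön sağ\\nadım 13\\ndön sol\\nadım 11';\n"
)
raw = extract_cozum(sample)
# The page source holds a literal backslash followed by "n";
# turn those two-character sequences into real newlines, then split.
steps = raw.replace("\\n", "\n").split("\n")
print(steps)
```

Returning `None` when nothing matches lets the caller skip pages that do not define `Level.cozum` instead of raising an error.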