Python web crawler error

Views: 103 | Published: 2019/6/7 19:32:28 | Python


Problem description


from urllib.request import urlopen
from urllib import parse
import re
print ("Enter the URL you wish to crawl..")
myurl = input("the url")
def getdata(myurl):
for i in re.findall('''href=["'](.[^"']+)["']''', urllib.urlopen(sys.argv[1]).read(), re.I):
print i
for ee in re.findall('''href=["'](.[^"']+)["']''', urllib.urlopen(i).read(), re.I):
print ee

Error I am getting:

C:\Users\user-pc\Documents\Python>python urlcrawler.py
File "urlcrawler.py", line 7
for i in re.findall('''href=["'](.[^"']+)["']''', urllib.urlopen(sys.argv[1]
).read(), re.I):
IndentationError: expected an indented block

What I have tried:

I tried searching for the error but did not find any relevant results.

Solution

Your code is not indented correctly, as reported in the error message. Forget searching and read the Python documentation: "3. An Informal Introduction to Python" (Python 3.7.0 documentation).
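As a minimal sketch of what the interpreter is complaining about, the script can be reduced to three lines. The names `broken_source`, `fixed_source`, and `check_indentation` are illustrative and not part of the original script. Python reports the first line that should have been indented, which is why the traceback in the question points at the `for` statement that follows `def getdata(myurl):`.

```python
# Reduced version of the script: a def header followed by an unindented body.
broken_source = (
    "def getdata(myurl):\n"
    "for i in range(3):\n"
    "    print(i)\n"
)

# Same code with the function body indented one level under the def header.
fixed_source = (
    "def getdata(myurl):\n"
    "    for i in range(3):\n"
    "        print(i)\n"
)


def check_indentation(source, filename="urlcrawler.py"):
    """Compile source without running it.

    Returns (line number, message) if Python raises IndentationError,
    otherwise None.
    """
    try:
        compile(source, filename, "exec")
    except IndentationError as err:
        return err.lineno, err.msg
    return None


print(check_indentation(broken_source))  # error reported at line 2, the unindented for
print(check_indentation(fixed_source))   # no indentation error
```

The exact wording of the message varies between Python versions, so only the line number is relied on here.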

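The error explains the problem pretty well. The body of a for loop must be indented, and so must the body of the function defined on line 6, which is why the error is reported at line 7.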

for i in re.findall('''href=["'](.[^"']+)["']''', urllib.urlopen(sys.argv[1]).read(), re.I):
    print i
    for ee in re.findall('''href=["'](.[^"']+)["']''', urllib.urlopen(i).read(), re.I):
        print ee
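
The snippet above fixes only the indentation. It still uses Python 2 `print` statements and `urllib.urlopen`, while the question imports `urlopen` from `urllib.request`, which is Python 3. A minimal Python 3 sketch of the whole crawler might look like the following. The helper names `extract_links` and `fetch`, the `fetch_page` parameter, the use of `urljoin` to resolve relative links, and the UTF-8 decoding are assumptions added for illustration, not part of the original script.

```python
import re
from urllib.parse import urljoin
from urllib.request import urlopen

# Same pattern as in the question, compiled once and case-insensitive.
HREF_PATTERN = re.compile(r'''href=["'](.[^"']+)["']''', re.I)


def extract_links(html):
    """Return every href value found in an HTML string."""
    return HREF_PATTERN.findall(html)


def fetch(url):
    """Download a page and decode it to text.

    In Python 3, urlopen(...).read() returns bytes, so the result is decoded
    before the str pattern is applied. UTF-8 is an assumption here.
    """
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def getdata(myurl, fetch_page=fetch):
    """Collect the links on myurl, then the links on each linked page."""
    found = []
    for i in extract_links(fetch_page(myurl)):
        # The body of the outer loop is indented one level under the for.
        page_url = urljoin(myurl, i)  # resolve relative links (assumption)
        found.append(page_url)
        for ee in extract_links(fetch_page(page_url)):
            # The body of the inner loop is indented one level further.
            found.append(urljoin(page_url, ee))
    return found
```

With network access, `for link in getdata(myurl): print(link)` would print the same two levels of links that the original loops were meant to print. The sketch has no error handling, so a link that cannot be fetched (for example a `mailto:` address) will raise an exception.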
