
Python - Urllib2 Wait for page to load to scrape data

Views: 232 | Published: 2020/9/29 4:37:20 | Tags: python, redirect, web-scraping, load, captcha


Problem description

Firstly, I'd like to say that I do not want to use any libraries that are not provided with Python 2.7.10. The same question was posted on Stack Overflow, but it was answered with the Requests library.

I have a script that logs into Roblox.com using urllib2. To check whether there is a captcha before I try to log in, I wanted to run check_captcha = re.findall('recaptcha_image', newlogin), but Roblox first has to redirect to the captcha login page, and the captcha then has to load onto that page.

So how can I make Python wait for the page to redirect and load fully before I go ahead, call .read(), and scrape it?

Recommended answer

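This will wait 10 seconds before it reads the response:

import time
import urllib2

url = 'Roblox url'  # placeholder from the original answer
response = urllib2.urlopen(url)  # open the URL; HTTP redirects are followed here
time.sleep(10)  # pause 10 seconds before reading
data = response.read()  # read the full response body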
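
A note on the approach above, offered as a sketch rather than a definitive fix: urllib2.urlopen already follows HTTP redirects, and .read() blocks until the whole body has arrived, so the fixed sleep may not be what makes the difference. Content that a page injects with JavaScript would not appear in the response either way, since urllib2 does not run scripts. The example below uses only the standard library and checks the final URL and the returned HTML for the recaptcha_image marker from the question. The helper names fetch and has_captcha are hypothetical, and the Python 3 import fallback is an assumption added only so the sketch runs on either version.

```python
import re

try:
    import urllib2  # Python 2.7, as used in the question
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 fallback for the sketch


def has_captcha(html):
    """Return True if the captcha marker from the question appears in the HTML."""
    return re.search(r'recaptcha_image', html) is not None


def fetch(url, timeout=10):
    """Fetch url, following HTTP redirects, and return (final_url, html)."""
    response = urllib2.urlopen(url, timeout=timeout)
    try:
        final_url = response.geturl()  # URL after any HTTP redirects
        body = response.read()         # blocks until the full body is received
    finally:
        response.close()
    # Decode leniently so the regex check works on text in both Python versions.
    return final_url, body.decode('utf-8', 'replace')
```

A script could call final_url, html = fetch(login_url) and then has_captcha(html), where login_url stands for whatever URL the script already uses. If the marker only appears after JavaScript runs in a browser, this check will not see it.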

