How to make a script wait within an iteration until the Internet connection is reestablished?


Question

I have a scraping code within a for loop, but it would take several hours to complete, and the program stops when my Internet connection breaks. What I (think I) need is a condition at the beginning of the scraper that tells Python to keep trying at that point. I tried to use the answer from here:

for w in wordlist:

#some text processing, works fine, returns 'textresult'

    if textresult == '___':  #if there's nothing in the offline resources
        bufferlist = list()
        str1=str()
        mlist=list()  # I use these in scraping

        br = mechanize.Browser()

        tried=0
        while True:
            try:
                br.open("http://the_site_to_scrape/")

                # scraping, with several ifs. Each 'for w' iteration results with scrape_result string.


            except (mechanize.HTTPError, mechanize.URLError) as e:
                tried += 1
                if isinstance(e,mechanize.HTTPError):
                    print e.code
                else:
                    print e.reason.args
            if tried > 4:
                    exit()
                    time.sleep(120)
                    continue
            break

This works while I'm online. When the connection breaks, Python prints the 403 code, skips that word from wordlist, moves on to the next one and does the same. How can I tell Python to wait for the connection within the iteration?

EDIT: I would appreciate it if you could write at least some of the necessary commands and tell me where they should be placed in my code, because I've never dealt with exception loops.

EDIT - SOLUTION: I applied a modified version of Abhishek Jebaraj's solution. I just added a very simple exception handler:

except:
    print "connection interrupted"
    time.sleep(30)

Also, the getcode call in Jebaraj's answer will raise an error. Before r.getcode, I used this:

import urllib

r = urllib.urlopen("http: the site ")

The top answer to this question helped me as well.
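
For readers on Python 3, the approach described in this edit might be sketched roughly as follows. This is not the asker's actual code: `scrape_all`, the `scrape` callable, and the `delay` and `sleep` parameters are hypothetical names introduced for illustration, and the sketch assumes that a dropped connection surfaces as an `OSError` (as it does with the standard library's `urllib`, whose `URLError` is an `OSError` subclass).

```python
import time


def scrape_all(wordlist, scrape, delay=30, sleep=time.sleep):
    """Scrape every word, retrying the same word until scrape(word) succeeds.

    `scrape` is a hypothetical callable standing in for the real scraping code.
    """
    results = {}
    for w in wordlist:
        while True:
            try:
                results[w] = scrape(w)
                break  # success: leave the retry loop and move on to the next word
            except OSError as exc:
                # Assumption: network failures are raised as OSError subclasses.
                print("connection interrupted:", exc)
                sleep(delay)  # wait, then retry the same word
    return results
```

Because the `break` is only reached after a successful call, a word is never skipped while the connection is down; the loop simply keeps waiting on it.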

Solution

Write another while loop inside the iteration that keeps trying to connect to the Internet.

The loop ends only once a request comes back with a success status code (200), and then you can continue with the rest of your program.

Something like this:

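import time

retry = True
while retry:
    try:
        r = br.open("http://the_site_to_scrape/")  # your site
        # Integer division: any status from 200 to 209 gives 20.
        if r.getcode() // 10 == 20:
            retry = False
    except Exception:
        # Code to handle any exception, e.g. wait before trying again.
        time.sleep(30)

# rest of your code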
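
As a self-contained variant of the same idea, here is a minimal Python 3 sketch that uses the standard library's `urllib.request` instead of mechanize. The names `fetch_status` and `wait_until_ok` and their parameters are assumptions made for this example and are not part of the original answer.

```python
import time
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.getcode()


def wait_until_ok(url, fetch=fetch_status, delay=30, sleep=time.sleep):
    """Block until fetch(url) returns a 2xx status; return the attempt count."""
    attempts = 0
    while True:
        attempts += 1
        try:
            if 200 <= fetch(url) < 300:
                return attempts
        except OSError as exc:
            # URLError and HTTPError are OSError subclasses, so a dropped
            # connection or an error status such as 403 both end up here.
            print("connection interrupted:", exc)
        sleep(delay)  # wait before the next attempt
```

Calling `wait_until_ok("http://the_site_to_scrape/")` at the start of each iteration would hold the loop on the current word until the site is reachable again. The `fetch` and `sleep` parameters exist only so the function can be exercised without a real network.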
