'list' object has no attribute 'timeout'

Question

I am trying to download PDFs from a page using urllib.request.urlopen, but it fails with the error: 'list' object has no attribute 'timeout'

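import csv
import urllib.request

from bs4 import BeautifulSoup

# HANSARD_URL (the base URL for the links) is not shown in the post;
# it is presumably defined elsewhere in the script.

def get_hansard_data(page_url):
    # Read page_url into a BeautifulSoup object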
    html = urllib.request.urlopen(page_url).read()
    soup = BeautifulSoup(html, "html.parser")
    #grab <div class="itemContainer"> that hold links and dates to all hansard pdfs
    hansard_menu = soup.find_all("div","itemContainer")

    #Get all hansards
    #write to a tsv file
    with open("hansards.tsv","a") as f:
        fieldnames = ("date","hansard_url")
        output = csv.writer(f, delimiter="\t")

        for div in hansard_menu:
            hansard_link = [HANSARD_URL + div.a["href"]]
            hansard_date = div.find("h3", "catItemTitle").string

            #download

            with urllib.request.urlopen(hansard_link) as response:
                data = response.read()
                r = open("/Users/Parliament Hansards/"+hansard_date +".txt","wb")
                r.write(data)
                r.close()

            print(hansard_date)
            print(hansard_link)
            output.writerow([hansard_date,hansard_link])
        print ("Done Writing File")

Please help.

Recommended answer

A bit late, but this might still be helpful to someone else (if not to the original poster). I found the solution while solving the same problem.

The problem was that page_url (in your case) was a list rather than a string. The most likely reason is that page_url comes from argparse.parse_args() (at least that was so in my case). Note also that in the code posted in the question, hansard_link is built as a one-element list, [HANSARD_URL + div.a["href"]], and that list is what gets passed to urllib.request.urlopen. Indexing with page_url[0] should work, but it is not nice to do that inside get_hansard_data(page_url). It would be better to check the type of the parameter and return an appropriate error to the caller if the type does not match.
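As a minimal sketch of the two situations described here: the first helper keeps the link as a plain string (the same concatenation as in the question, minus the surrounding brackets), and the two parser helpers show how argparse can hand back either a string or a one-element list depending on nargs. BASE_URL, build_link, parse_page_url and parse_page_url_as_list are hypothetical names used only for illustration; BASE_URL stands in for the asker's HANSARD_URL, whose real value is not shown in the post.

```python
import argparse

# Hypothetical placeholder for the asker's HANSARD_URL (its real value is not shown).
BASE_URL = "https://example.org"


def build_link(base_url: str, href: str) -> str:
    """Return the full link as a plain str, not wrapped in a list."""
    return base_url + href


def parse_page_url(argv):
    """Without nargs, argparse stores the positional argument as a str."""
    parser = argparse.ArgumentParser()
    parser.add_argument("page_url")
    return parser.parse_args(argv).page_url


def parse_page_url_as_list(argv):
    """With nargs=1, argparse stores a one-element list instead."""
    parser = argparse.ArgumentParser()
    parser.add_argument("page_url", nargs=1)
    return parser.parse_args(argv).page_url


link = build_link(BASE_URL, "/docs/example.pdf")
as_str = parse_page_url(["https://example.org/page"])
as_list = parse_page_url_as_list(["https://example.org/page"])
```

Here link and as_str are strings and can be passed to urllib.request.urlopen directly, while as_list would need to be unpacked (for example as_list[0]) by the caller first.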

The type of an argument can be checked by calling type(page_url) and comparing the result, for example: type("") == type(page_url). I am sure there is a more elegant way to do that, but it is outside the scope of this particular question.
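One possibly more idiomatic way to do this check is isinstance, which also lets the function accept a urllib.request.Request object (urlopen is documented to take either a URL string or a Request). The sketch below is only an illustration: open_url and check_rejects are hypothetical helper names, and the default timeout of 10 seconds is an arbitrary assumption.

```python
import urllib.request


def open_url(url, timeout=10):
    """Open url after verifying it is a str or a urllib.request.Request."""
    if not isinstance(url, (str, urllib.request.Request)):
        # Fail early with a clear message instead of the confusing
        # "'list' object has no attribute 'timeout'".
        raise TypeError(
            f"url must be a str or Request, got {type(url).__name__}"
        )
    return urllib.request.urlopen(url, timeout=timeout)


def check_rejects(value):
    """Return the error message if open_url rejects value, else None."""
    try:
        open_url(value)
    except TypeError as exc:
        return str(exc)
    return None


# A list is rejected before any network access is attempted.
message = check_rejects(["https://example.org/page"])
```

Raising TypeError keeps the decision about how to recover (index the list, report the error to the user, and so on) with the caller, which matches the suggestion above.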
