Python list object has no attribute error


Question

I am new to Python and I am trying to write a website scraper that collects links from subreddits, so that I can later pass them to another class that automatically downloads the images from Imgur.

In this code snippet, I am just trying to read the subreddit and scrape any Imgur links from the hrefs, but I get the following error:

AttributeError: 'list' object has no attribute 'timeout'

Any idea why this might be happening? Here is the code:

from bs4 import BeautifulSoup
from urllib2 import urlopen
import sys
from urlparse import urljoin

def get_category_links(base_url):
    url = base_url
    html = urlopen(url)
    soup = BeautifulSoup(html)
    posts = soup('a',{'class':'title may-blank loggedin outbound'})
    #get the links with the class "title may-blank "
    #which is how reddit defines posts
    for post in posts:
        print post.contents[0]
        #print the post's title

        if post['href'][:4] =='http':
            print post['href']
        else:
            print urljoin(url,post['href'])
        #print the url.  
        #if the url is a relative url,
        #print the absolute url.   


get_category_links(sys.argv)

Recommended answer

Look at how you call the function:

get_category_links(sys.argv)

sys.argv here is a list of script arguments, where the first item is the script name itself. This means that the value of your base_url argument is a list, which causes urlopen to fail:

>>> from urllib2 import urlopen
>>> urlopen(["I am", "a list"])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
           │           │    │     └ <object object at 0x105e2c120>
           │           │    └ None
           │           └ ['I am', 'a list']
           └ <urllib2.OpenerDirector instance at 0x105edc638>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 422, in open
    req.timeout = timeout
    │             └ <object object at 0x105e2c120>
    └ ['I am', 'a list']
AttributeError: 'list' object has no attribute 'timeout'

You meant to take the second item of sys.argv (the first real command-line argument) and pass it to get_category_links:

get_category_links(sys.argv[1])
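
Note that sys.argv[1] raises an IndexError when the script is run without arguments. A minimal sketch of a guard against that follows; the helper name first_cli_argument and the usage message are hypothetical, not part of the original code:

```python
import sys


def first_cli_argument(argv):
    """Return the first command-line argument, or exit with a usage hint.

    Hypothetical helper for illustration: argv[0] is the script name,
    so the URL is only present when len(argv) >= 2.
    """
    if len(argv) < 2:
        script = argv[0] if argv else "script.py"
        raise SystemExit("usage: %s <subreddit-url>" % script)
    return argv[1]


if __name__ == "__main__":
    base_url = first_cli_argument(sys.argv)
    # get_category_links(base_url) would be called here.
```

This way a missing argument produces a readable usage message instead of a traceback.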


It's interesting, though, how cryptic and difficult to understand the error is in this case. It comes from the way the "url opener" works in Python 2.7. If the url value (the first argument) is not a string, it assumes the value is a Request instance and tries to set a timeout attribute on it:

def open(self, fullurl, data=None, timeout=socket._GLOBAL_DEFAULT_TIMEOUT):
    # accept a URL or a Request object
    if isinstance(fullurl, basestring):
        req = Request(fullurl, data)
    else:
        req = fullurl
        if data is not None:
            req.add_data(data)

    req.timeout = timeout  # <-- FAILS HERE
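
Because the opener accepts anything that is not a string, one way to get a clearer message is to check the type yourself before calling urlopen. The following is only a sketch under that assumption; require_url_string is a hypothetical helper, not part of urllib2:

```python
try:
    string_types = basestring  # Python 2: covers str and unicode
except NameError:
    string_types = str  # Python 3


def require_url_string(value):
    """Return value unchanged if it is a string, else raise a clear TypeError.

    Hypothetical helper: it rejects lists such as sys.argv up front
    instead of letting the opener treat them as Request objects.
    """
    if not isinstance(value, string_types):
        raise TypeError(
            "expected a URL string, got %s: %r" % (type(value).__name__, value)
        )
    return value
```

Calling urlopen(require_url_string(base_url)) inside get_category_links would then report the wrong type directly, rather than a missing 'timeout' attribute.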

Note that this behavior has not actually changed in Python 3.6 either, the latest stable release at the time of writing.
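
A small sketch for checking this on your own interpreter, assuming the opener still assigns the timeout before any network access takes place (so the call fails without contacting a server). In Python 3 urlopen lives in urllib.request; the function name reproduce_error is hypothetical:

```python
try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2


def reproduce_error():
    """Pass a list to urlopen and return the AttributeError it raises, if any."""
    try:
        urlopen(["I am", "a list"])
    except AttributeError as exc:
        return exc
    return None


if __name__ == "__main__":
    # The exact wording of the message can differ between Python versions.
    print(repr(reproduce_error()))
```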
