TypeError: 'NoneType' object not callable when using split in Python with BeautifulSoup

Views: 218 | Published: 2020/9/20 7:52:29 | Tags: python, beautifulsoup, python-requests


Question

I was playing around with the BeautifulSoup and Requests APIs today, and thought I would write a simple scraper that follows links to a depth of 2 (if that makes sense). All the links in the webpage that I am scraping are relative (e.g. <a href="/free-man-aman-sethi/books/9788184001341.htm" title="A Free Man">), so to make them absolute I thought I would join the page URL with the relative links using urljoin.

To do this I first had to extract the href value from the <a> tags, and for that I thought I would use split:

#!/bin/python
#crawl.py
import requests
from bs4 import BeautifulSoup
from urlparse import urljoin

html_source=requests.get("http://www.flipkart.com/books")
soup=BeautifulSoup(html_source.content)
links=soup.find_all("a")
temp=links[0].split('"')

This gives the following error:

Traceback (most recent call last):
  File "test.py", line 10, in <module>
    temp=links[0].split('"')
TypeError: 'NoneType' object is not callable

Having dived in before properly going through the documentation, I realize that this is probably not the best way to achieve my objective, but why is there a TypeError?

Recommended answer

links[0] is not a string, it's a bs4.element.Tag. When you look up split on it, it does its magic and treats the unknown attribute name as a subelement to search for. There is no subelement named split, so the lookup returns None, and you are then calling that None.

In [10]: l = links[0]

In [11]: type(l)
Out[11]: bs4.element.Tag

In [17]: print l.split
None

In [18]: None()   # :)

TypeError: 'NoneType' object is not callable
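
To make the mechanism concrete, here is a minimal sketch in Python 3 (the session above is Python 2). FakeTag is a hypothetical stand-in written for illustration, not bs4's actual implementation: it only assumes that unknown attribute names fall through to a child-tag search that returns None when nothing matches, which is the behaviour the answer describes.

```python
class FakeTag:
    """Hypothetical, simplified stand-in for bs4.element.Tag (illustration only)."""

    def __init__(self, name, attrs=None, children=None):
        self.name = name
        self.attrs = attrs or {}
        self.children = children or []

    def find(self, name):
        # Return the first direct child with the given tag name, or None.
        # (The real library searches more deeply; this is a simplification.)
        for child in self.children:
            if child.name == name:
                return child
        return None

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup fails,
        # so any unknown name (such as "split") ends up here.
        if name.startswith("__"):
            raise AttributeError(name)
        return self.find(name)


link = FakeTag("a", attrs={"href": "/books"}, children=[FakeTag("span")])

print(link.span.name)  # an existing child tag is found by name
print(link.split)      # no child called "split", so the lookup yields None

try:
    link.split('"')    # equivalent to calling None('"')
except TypeError as exc:
    message = str(exc)
    print(message)
```

The point of the sketch is that the error never mentions split: the attribute lookup itself succeeds and hands back None, and only the call that follows fails.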

Use indexing to look up HTML attributes:

In [21]: links[0]['href']
Out[21]: '/?ref=1591d2c3-5613-4592-a245-ca34cbd29008&_pop=brdcrumb'

Or use get if there is a danger of nonexistent attributes:

In [24]: links[0].get('href')
Out[24]: '/?ref=1591d2c3-5613-4592-a245-ca34cbd29008&_pop=brdcrumb'


In [26]: print links[0].get('wharrgarbl')
None

In [27]: print links[0]['wharrgarbl']

KeyError: 'wharrgarbl'
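
With the href values in hand, the original goal of making the links absolute might look like the sketch below. It is written for Python 3, where urljoin lives in urllib.parse rather than urlparse. The helper name absolutize and the sample list are assumptions made for illustration; the list stands in for what link.get('href') could return for each <a> tag, with None representing a tag that has no href.

```python
from urllib.parse import urljoin  # Python 2 equivalent: from urlparse import urljoin


def absolutize(page_url, hrefs):
    """Join each href with page_url, skipping missing (None) values."""
    return [urljoin(page_url, href) for href in hrefs if href is not None]


page_url = "http://www.flipkart.com/books"

# Hypothetical sample of values collected with link.get('href');
# None stands for an <a> tag without an href attribute.
hrefs = ["/free-man-aman-sethi/books/9788184001341.htm", None]

absolute_links = absolutize(page_url, hrefs)
print(absolute_links)
```

Filtering out None is the reason to prefer get over indexing here: one <a> tag without an href would otherwise stop the whole loop with a KeyError.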
