Download files using requests and BeautifulSoup

Views: 32 Published: 2021/12/23 20:40:11 Tags: python download beautifulsoup python-requests

Question

I'm trying to download a bunch of PDF files from here using requests and beautifulsoup4. This is my code:

import requests
from bs4 import BeautifulSoup as bs

_ANO = '2013/'
_MES = '01/'
_MATERIAS = 'matematica/'
_CONTEXT = 'wp-content/uploads/' + _ANO + _MES
_URL = 'http://www.desconversa.com.br/' + _MATERIAS + _CONTEXT

r = requests.get(_URL)
soup = bs(r.text)

for i, link in enumerate(soup.findAll('a')):
    _FULLURL = _URL + link.get('href')

    for x in range(i):
        output = open('file[%d].pdf' % x, 'wb')
        output.write(_FULLURL.read())
        output.close()

I get AttributeError: 'str' object has no attribute 'read'.

OK, I understand why, but how can I download from the generated URL?

Recommended answer

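The error occurs because _FULLURL is a plain string, not a response object: the URL has to be requested first, and the bytes of that response are what get written to disk. The script below writes every PDF linked from the page, under its original filename, into a pdfs/ directory.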

import os
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup as bs

_ANO = '2013/'
_MES = '01/'
_MATERIAS = 'matematica/'
_CONTEXT = 'wp-content/uploads/' + _ANO + _MES
_URL = 'http://www.desconversa.com.br/' + _MATERIAS + _CONTEXT

# Fetch the listing page and collect every link that points to a PDF.
r = requests.get(_URL)
soup = bs(r.text, 'html.parser')
urls = []
names = []
for link in soup.find_all('a'):
    href = link.get('href')
    if href and href.endswith('.pdf'):
        urls.append(urljoin(_URL, href))
        names.append(os.path.basename(href))

# Create the output directory if it does not exist yet.
os.makedirs('pdfs', exist_ok=True)

for name, url in zip(names, urls):
    print(url)
    res = requests.get(url)
    # res.content holds the raw bytes of the response body.
    with open(os.path.join('pdfs', name), 'wb') as pdf:
        pdf.write(res.content)
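
Building download URLs by plain string concatenation only works when every href on the page is a bare filename. The following is a minimal Python 3 sketch of a more forgiving approach that uses only the standard library. The helper name resolve_pdf_links and the sample hrefs (aula1.pdf, aula2.pdf) are hypothetical and chosen for illustration; they are not taken from the actual page.

```python
import posixpath
from urllib.parse import urljoin, urlsplit


def resolve_pdf_links(base_url, hrefs):
    """Return (filename, absolute_url) pairs for the hrefs that point to PDFs.

    Hypothetical helper: base_url is the page the links came from, hrefs is
    an iterable of raw href values (None for <a> tags without an href).
    """
    pairs = []
    for href in hrefs:
        if not href:
            continue  # skip anchors that carry no href
        absolute = urljoin(base_url, href)  # handles relative and absolute hrefs
        path = urlsplit(absolute).path      # ignore any query string or fragment
        if path.lower().endswith('.pdf'):
            pairs.append((posixpath.basename(path), absolute))
    return pairs


base = 'http://www.desconversa.com.br/matematica/wp-content/uploads/2013/01/'
# Sample hrefs are assumptions for illustration only.
sample_hrefs = [
    'aula1.pdf',
    '/matematica/wp-content/uploads/2013/01/aula2.pdf',
    '?C=N;O=D',
    None,
]
for name, url in resolve_pdf_links(base, sample_hrefs):
    print(name, url)
```

Each returned pair can then be fed to the download loop: the first element is the name to save under, the second is the URL to request.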
