Downloading Books from a Website with Python

Question

I'm downloading books from a website, and my code runs almost without problems. But when I try to open the downloaded PDF book on my PC, Adobe Acrobat Reader reports an error saying that the file type is not supported.

Here is an image of the book format. I'm sure my code needs correcting, because the format of the book on the website is different from a normal PDF file.

Code:

import requests
from bs4 import BeautifulSoup

url = 'https://global.oup.com/education/support-learning-anywhere/key-resources-online/?region=international&utm_campaign=learninganywhere&utm_source=umbraco&utm_medium=display&utm_content=support_learning_key_resources&utm_team=int#Primary'

# Fetch the resources page and collect every table cell
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')
table_data = soup.find_all('td')

# Build the list of book URLs, appending '.pdf' to each link
books_url_list = []
for cell in table_data:
    anchor = cell.find('a')
    if anchor is None or not anchor.get('href'):
        continue  # skip cells that contain no link
    books_url_list.append(anchor['href'] + '.pdf')

# Download the second book in the list and save it to disk
book = books_url_list[1]
book_response = requests.get(book)

with open('books.pdf', 'wb') as f:
    f.write(book_response.content)

Recommended answer

I inspected the elements on the website and found no '.pdf' files. We can inspect one book page using the following link: https://en.calameo.com/read/000777721d10096b9e9ca?authid=gWc48kAQQoD0&region=international

After inspecting the element, I found that the book is not a PDF. Each page is just an image embedded in the viewer page:
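One way to confirm this from the script side is to check whether the downloaded bytes actually look like a PDF before saving them. The following is a minimal sketch, not part of the original answer; `looks_like_pdf` is a hypothetical helper name, and it relies only on the fact that PDF files begin with the marker `%PDF-`.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Return True if the bytes start with the PDF header marker."""
    # A PDF file begins with "%PDF-" followed by a version number,
    # for example "%PDF-1.7". An HTML page or an image will not.
    return data.startswith(b"%PDF-")
```

With the code from the question, `looks_like_pdf(book_response.content)` would presumably return False, because the server returns something other than a PDF document for a link that merely has '.pdf' appended to it.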

https://p.calameoassets.com/200406174654-2bfa9441783e162c8da42a712feda3e2/e

https://p.calameoassets.com/200406174654-2bfa9441783e162c8da42a7122.

....

https://p.calameoassets.com/200406174654-2bfa9441783e162c8da42a712feda3e2/e

and so on.

So you can write code to download these images instead.
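A rough sketch of such a download loop is shown here. It assumes you have already collected the full page image URLs yourself, for example from the browser's network inspector, since the URLs listed above are truncated. The function name `save_pages`, the `page_001` file naming, and the `.bin` fallback extension are illustrative choices, not something from the original answer.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def save_pages(page_urls, out_dir, get):
    """Download each page image with `get` and save it under out_dir.

    `get` is any callable that behaves like requests.get: it takes a URL and
    returns an object with .content and .raise_for_status().
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for number, url in enumerate(page_urls, start=1):
        response = get(url)
        response.raise_for_status()  # stop on HTTP errors instead of saving an error page
        # Reuse the extension from the URL path; fall back to .bin if there is none
        suffix = PurePosixPath(urlparse(url).path).suffix or ".bin"
        target = out / f"page_{number:03d}{suffix}"
        target.write_bytes(response.content)
        saved.append(target)
    return saved
```

It could be called as `save_pages(page_urls, "book_pages", requests.get)`, where `page_urls` is your own list of image URLs. Passing the fetch function in as an argument keeps the loop easy to try out without touching the network.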
