Python web scraping: 503 Response with specific site (how come?)

Views: 229 Published: 2018/11/15 12:55:59 python python-3.x selenium ipython python-requests


Problem description

I am experimenting with Python by scraping a few web sites to see what I can learn. I noticed that Amazon.com returns a Response 503 unless I pass a User-Agent header through the headers argument of my session's get() call.

But this does not work for readcomiconline.to, where I get a Response 503 no matter what I try. I assume this has to do with its JavaScript preloader.

Is there any workaround?

import requests

urlAmazon = 'http://amazon.com'
urlComics = 'http://readcomiconline.to'
# Browser-like User-Agent sent in place of the python-requests default
headerAgent = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36'}
client = requests.Session()

# Amazon: 503 without the header, 200 with it
resultOne = client.get(urlAmazon)
print(resultOne) #<Response [503]>
resultOne = client.get(urlAmazon, headers=headerAgent)
print(resultOne) #<Response [200]>

# readcomiconline.to: 503 with or without the header
resultTwo = client.get(urlComics)
print(resultTwo) #<Response [503]>
resultTwo = client.get(urlComics, headers=headerAgent)
print(resultTwo) #<Response [503]>
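
One way to test the JavaScript-preloader guess is to look at the body that comes back with the 503, since a challenge page usually still contains HTML. The helper below is only a rough sketch: the function name looks_like_js_challenge and the default marker strings are assumptions made for illustration, not anything the site is known to send, so they may need adjusting after inspecting the real response text.

```python
def looks_like_js_challenge(status_code, body,
                            markers=("<noscript", "enable javascript")):
    """Guess whether a response is a JavaScript interstitial rather than a real outage.

    The marker strings are illustrative guesses (an assumption), matched
    case-insensitively against the response body.
    """
    if status_code != 503:
        return False
    lowered = body.lower()
    return any(marker.lower() in lowered for marker in markers)


# Hypothetical bodies, used only to show the behaviour of the helper
challenge_body = "<html><noscript>Please enable JavaScript</noscript></html>"
outage_body = "Service Unavailable"

print(looks_like_js_challenge(503, challenge_body))
print(looks_like_js_challenge(503, outage_body))
```

With the session from the question, the call would be looks_like_js_challenge(resultTwo.status_code, resultTwo.text).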

I tried using Selenium and am still getting the 503 error. Is there any way around the JavaScript so I can do a proper web scrape?

import bs4, requests
from selenium import webdriver
from lxml import html

headerAgent = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36'}

res = requests.get('http://readcomiconline.to/Comic/Saga/Issue-1 &readType=1',headers=headerAgent)
res.raise_for_status()

soup = bs4.BeautifulSoup(res.text, "lxml")
comicElement = soup.find('table', {'class':'listing'})
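
Note that soup.find() returns None when nothing matches, so even once the page loads, comicElement should be checked before use. The sketch below shows one way to guard the lookup. The function name find_listing_rows and the sample HTML are hypothetical, and it uses the built-in "html.parser" instead of "lxml" only to avoid the extra dependency.

```python
import bs4


def find_listing_rows(html_text):
    """Return the text of each row in the table with class 'listing', or [] if absent."""
    soup = bs4.BeautifulSoup(html_text, "html.parser")
    table = soup.find("table", {"class": "listing"})
    if table is None:
        # No such table, e.g. because a challenge page was served instead
        return []
    return [row.get_text(" ", strip=True) for row in table.find_all("tr")]


# Hypothetical markup standing in for the real page
sample = ('<table class="listing">'
          '<tr><td>Issue 1</td></tr>'
          '<tr><td>Issue 2</td></tr>'
          '</table>')

print(find_listing_rows(sample))
print(find_listing_rows("<p>no table here</p>"))
```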
