requests.history not showing all redirects

Problem description

I'm trying to get the redirects of some Wikipedia pages, and something curious is happening.

If I run:

>>> request = requests.get("https://en.wikipedia.org/wiki/barcelona", allow_redirects=True)
>>> request.url
u'https://en.wikipedia.org/wiki/Barcelona'
>>> request.history
[<Response [301]>]

As you can see, the redirect is followed correctly, and the URL in my browser matches the one in Python.

But if I try:

>>> request = requests.get("https://en.wikipedia.org/wiki/Yardymli_Rayon", allow_redirects=True)
>>> request.url
u'https://en.wikipedia.org/wiki/Yardymli_Rayon'
>>> request.history
[]

And in the browser I see that the URL has changed to: https://en.wikipedia.org/wiki/Yardymli_District

Does anyone know how to solve this?

Solution

Requests doesn't show the redirect because you're not actually being redirected in the HTTP sense. Wikipedia does some JavaScript trickery (probably HTML5 history modification and pushState) to change the address that's shown in the address bar, but that doesn't apply to Requests, of course.

In other words, both requests and your browser are correct: requests is showing the URL you actually requested (and Wikipedia actually served), while your browser's address bar is showing the 'proper', canonical URL.

You could parse the response and look for the <link rel="canonical"> tag if you want to find out the 'proper' URL from your script, or fetch articles over Wikipedia's API instead.
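As a rough illustration of the first suggestion, here is a minimal sketch that pulls the canonical URL out of a page's HTML using only the standard library's `html.parser`. The names `CanonicalLinkParser`, `find_canonical_url`, and `fetch_canonical_url` are made up for this example, and the sketch assumes the page exposes a `<link rel="canonical" href="...">` tag; it falls back to `response.url` when none is found.

```python
from html.parser import HTMLParser


class CanonicalLinkParser(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens, in any letter case.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def find_canonical_url(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    parser = CanonicalLinkParser()
    parser.feed(html)
    return parser.canonical


def fetch_canonical_url(url):
    """Hypothetical helper: fetch a page and return its canonical URL."""
    import requests  # imported here so the parser above works without it

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    # Fall back to the URL that was actually served if no tag is present.
    return find_canonical_url(response.text) or response.url
```

Calling `fetch_canonical_url("https://en.wikipedia.org/wiki/Yardymli_Rayon")` should then return whatever canonical URL Wikipedia declares for that page, while `response.history` stays empty because no HTTP redirect took place.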
