Any way to scrape a link that redirects?

Views: 64 Posted: 2021/4/15 19:04:55 Tags: python parsing web-scraping beautifulsoup lxml

Question

Is there any way I can make Python follow a link such as a bit.ly link and then scrape the page it leads to? On the page I am scraping, the only link I can extract is one that redirects, and the information I need is located at the redirect target.

Recommended answer

There are three types of redirection:

HTTP - as information in the response headers (status code 301, 302, or another 3xx)
HTML - as a <meta> tag in the HTML (Wikipedia: Meta refresh)
JavaScript - as code such as window.location = new_url
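
The HTML and JavaScript cases are not HTTP redirects, so the target has to be pulled out of the page body. The following is only a minimal sketch using the standard library; the names find_meta_refresh and find_js_redirect are hypothetical helpers, not part of any library, and the JavaScript pattern is a rough assumption that only catches simple string assignments to location.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class _MetaRefreshParser(HTMLParser):
    """Remembers the target of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attrs = dict(attrs)
        if (attrs.get("http-equiv") or "").lower() != "refresh":
            return
        # The content attribute looks like "5; url=http://example.com/"
        match = re.search(r"url\s*=\s*['\"]?([^'\">\s]+)",
                          attrs.get("content") or "", re.I)
        if match:
            self.target = match.group(1)


def find_meta_refresh(html, base_url):
    """Return the absolute meta-refresh target, or None if there is none."""
    parser = _MetaRefreshParser()
    parser.feed(html)
    if parser.target is None:
        return None
    return urljoin(base_url, parser.target)


# Rough pattern (an assumption) for code such as: window.location = "http://..."
_JS_REDIRECT = re.compile(
    r"""(?:window\.|document\.)?location(?:\.href)?\s*=\s*["']([^"']+)["']"""
)


def find_js_redirect(html, base_url):
    """Return the absolute target of a simple location assignment, or None."""
    match = _JS_REDIRECT.search(html)
    return urljoin(base_url, match.group(1)) if match else None
```

Either helper could be called with r.text and r.url from a requests response. Redirects that JavaScript builds dynamically will not be found this way and would need a real browser engine.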

requests follows HTTP redirections automatically and keeps the intermediate responses in r.history; the final URL is in r.url

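import requests

# requests follows HTTP redirects by default
r = requests.get('http://' + 'bit.ly/english-4-it')

print(r.history)  # intermediate redirect responses, oldest first
print(r.url)      # final URL after all redirects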

Result:

[<Response [301]>, <Response [301]>]
http://helion.pl/ksiazki/english-4-it-praktyczny-kurs-jezyka-angielskiego-dla-specjalistow-it-i-nie-tylko-beata-blaszczyk,anginf.htm
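
To see which URL each hop went through, the entries in r.history can be read individually, since each one is a response object with its own status_code and url. A small sketch, where redirect_chain is a hypothetical helper name chosen for illustration:

```python
def redirect_chain(response):
    """Return (status_code, url) for every hop, ending with the final response."""
    hops = [(r.status_code, r.url) for r in response.history]
    hops.append((response.status_code, response.url))
    return hops


# Usage with requests (needs network access):
#   import requests
#   for status, url in redirect_chain(requests.get('http://' + 'bit.ly/english-4-it')):
#       print(status, url)
```

The function only reads the history, status_code and url attributes, so it works on any requests response whether or not a redirect took place.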

BTW: SO doesn't allow bit.ly links in text, so I used string concatenation.
