BeautifulSoup and links containing a hash (#)

Views: 68 Published: 2021/5/15 19:18:03 Tags: python hyperlink web-scraping beautifulsoup urllib2


Problem description

I'm using BeautifulSoup with Python. I'm trying to get elements from a link that contains a hash (#). It's a pagination link; the part after the # is the page number.

It doesn't work. As I understand it, the problem is that urllib2 can't handle this, because the part of the URL after the # is for client-side handling and is never sent to the server.
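To make the fragment behaviour concrete, here is a minimal sketch using Python 3's `urllib.parse` (the question uses urllib2 on Python 2, where the same function lives in the `urlparse` module). The URL is a placeholder, not the one from the question. It shows which part of the address is the fragment that stays in the browser.

```python
from urllib.parse import urldefrag

# Hypothetical pagination URL; the host and path are placeholders.
url = "http://example.com/products#3"

# urldefrag splits off everything after the '#'.
base, fragment = urldefrag(url)

print(base)      # http://example.com/products
print(fragment)  # 3
```

Only `base` is what an HTTP client can actually request; the page number in `fragment` has to reach the server some other way, typically through a request made by the page's JavaScript.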

So I checked the real URL using the Network tab of the developer tools in Chrome, and it gives me this:

It looks like the server doesn't like this URL at all, because it returns a blank page containing only this weird result: {"filtersBlock":"\n\n

So my question is: is there a way to handle this kind of link with BeautifulSoup?
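One possible reading, offered as an assumption rather than a confirmed diagnosis: the truncated output looks like the start of a JSON object whose values are HTML strings, which is a common shape for pagination responses. If that is the case, the response could be decoded with `json` first and the HTML string then handed to BeautifulSoup. In the sketch below, only the `filtersBlock` key comes from the question; the sample payload, the `total` key, and the helper names `html_blocks` and `soup_for` are made up for illustration.

```python
import json


def html_blocks(payload_text):
    """Decode a JSON payload and keep only its string values, keyed by name."""
    data = json.loads(payload_text)
    return {key: value for key, value in data.items() if isinstance(value, str)}


def soup_for(payload_text, key):
    """Parse one HTML block with BeautifulSoup (requires the bs4 package)."""
    from bs4 import BeautifulSoup  # imported here so html_blocks() works without bs4
    return BeautifulSoup(html_blocks(payload_text)[key], "html.parser")


# Made-up payload standing in for the server response; the real one is
# truncated in the question, so its contents are not known.
sample = json.dumps({"filtersBlock": "\n\n<ul><li>Page 2</li></ul>", "total": 40})

blocks = html_blocks(sample)
print(sorted(blocks))
```

With bs4 installed, `soup_for(sample, "filtersBlock").find_all("li")` would then give the list items as usual. In a real script, `sample` would be replaced by the body returned from the URL seen in the Network tab.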
