Python requests module doesn't return full page during GET request


Problem description

When I request this URL in a browser, a full HTML page is returned: http://www.waterwaysguide.org.au/waterwaysguide/access-point/4980/partial. However, when I make a GET request with the Python requests module, only part of the HTML is returned and the core content is missing.

How do I change my code so that I can get the data that is missing?

This is the code I am using:

import requests
def get_data(point_num):
    base_url = 'http://www.waterwaysguide.org.au/waterwaysguide/access-point/{}/partial'
    r = requests.get(base_url)
    html_content = r.text
    print(html_content)
get_data(4980)

The result of running the code is shown below. The content inside the div class="view view-waterway-access-point-page... is missing.

<div>
  <div class="modal-header">
    <button type="button" class="close" data-dismiss="modal" aria-label="Close">
      <span aria-hidden="true">&times;</span>
    </button>
    <h4 class="modal-title">
        Point of Interest detail    </h4>

  </div>
  <div class="modal-body">
    <div class="view view-waterway-access-point-page view-id-waterway_access_point_page view-display-id-page view-dom-id-c855bf9afdfe945979f96b2301d55784">
        
  
  
  
  
  
  
  
  
</div>  </div>
  <div class="modal-footer">
    
    <button type="button" id="closeRemoteModal" class="btn btn-action" data-dismiss="modal">Close</button>
  </div>
</div>

Solution

The following approach, which sends a browser-like User-Agent header with the request, displays the missing content inside the div class="view view-waterway-access-point-page...

>>> from urllib.request import Request, urlopen
>>> from bs4 import BeautifulSoup
>>> url = 'http://www.waterwaysguide.org.au/waterwaysguide/access-point/4980/partial'
>>> req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
>>> webpage = urlopen(req).read()
>>> print(webpage)
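
The same idea can probably be applied without leaving the requests module. The sketch below is not part of the original answer. It passes the same User-Agent value through the `headers` argument of `requests.get`, on the assumption that the server returns the fuller page to browser-like clients. It also fills the `{}` placeholder with `point_num`, which the code in the question never does, so that code requests the URL with a literal `{}` in it. Either difference could account for the empty div. The names `BASE_URL`, `HEADERS` and `build_url` are introduced here for illustration only, and the 10 second timeout is an arbitrary choice.

```python
import requests

BASE_URL = 'http://www.waterwaysguide.org.au/waterwaysguide/access-point/{}/partial'

# Same User-Agent value as in the answer above. That the server needs it
# in order to return the full content is an assumption.
HEADERS = {'User-Agent': 'Mozilla/5.0'}


def build_url(point_num):
    """Substitute the access point number into the URL template."""
    return BASE_URL.format(point_num)


def get_data(point_num):
    """Fetch the partial page for one access point and return its HTML."""
    r = requests.get(build_url(point_num), headers=HEADERS, timeout=10)
    r.raise_for_status()  # raise on a 4xx/5xx status instead of returning an error page
    return r.text
```

Calling `print(get_data(4980))` would then print whatever HTML the server returns for access point 4980. If the div is still empty, the content may be filled in by JavaScript in the browser, which neither requests nor urllib executes.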
