BeautifulSoup Scraping: loading div instead of the content

Views: 196 Published: 2016/8/5 19:20:37 Tags: javascript python html web-scraping beautifulsoup


Question

Newbie here.
I'm trying to scrape the search results from this website: http://www.mastersportal.eu/search/?q=di-4|lv-master&order=relevance

I'm using Python's BeautifulSoup:

import csv
import requests
from bs4 import BeautifulSoup

# Each value is spliced into the 'start' parameter, followed by a literal '0'
for numb in ('0', '69'):
    url = ('http://www.mastersportal.eu/search/?q=ci-30,11,10,3,4,8,9,14,15,16,17,34,1,19|di-4|lv-master|rv-1&start=' + numb + '0&order=tuition_eea&direction=asc')
    response = requests.get(url)
    html = response.content

    soup = BeautifulSoup(html, 'html.parser')
    table = soup.find('div', attrs={'id': 'StudySearchResults'})

    # Collect the text of every <h3> inside the results container
    lista = []
    for h3 in table.find_all('h3'):
        lista.append(h3.string)
print(table.prettify())

I want to get clean data with the basic information about each Master's programme (for now, just the name). The URL I'm using here is for a filtered search on the website, and the loop that steps through the pages should be fine.

However, the result is:

<div id="StudySearchResults">
  <div style="display:none" id="TrackingSearchValue" class="TrackingSearchValue" data-search=""></div>
  <div style="display:none" id="SearchViewEvent" class="TrackingEvent TrackingNoLocation" data-type="srch" data-action="view" data-id=""></div>
  <div id="StudySearchResultsStudies" class="TrackingLinkedList" data-start="" data-list-type="study" data-type="rslts">
    <!-- Wait pane, just here to make sure there is no white page -->
    <div id="WaitPane" class="WaitPane">
      <img src="http://www.mastersportal.eu/Modules/Results/Resources/Throbber.gif" />
      <span>Loading search results...</span>
    </div>
  </div>
</div>

Why is only the loading div displayed, and not the content? From reading around, I suspect it has something to do with the way the website handles data with JavaScript. Does something like an AJAX request exist for Python? (Or is there any other way to tell the scraper to wait for the page to load?)
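One way to confirm the suspicion is to check whether the fetched HTML contains only the placeholder and no result headings. The following is a minimal sketch, not a definitive diagnosis: it uses only the standard library's html.parser, the markup is abbreviated from the output shown in the question, and the names PlaceholderDetector and looks_unrendered are hypothetical helpers introduced here for illustration.

```python
from html.parser import HTMLParser

# Abbreviated copy of the markup reported in the question (assumption: the
# response still looks like this when fetched without a browser).
STATIC_HTML = """
<div id="StudySearchResults">
  <div id="StudySearchResultsStudies" class="TrackingLinkedList">
    <div id="WaitPane" class="WaitPane">
      <span>Loading search results...</span>
    </div>
  </div>
</div>
"""


class PlaceholderDetector(HTMLParser):
    """Counts <h3> tags and notes whether the WaitPane placeholder is present."""

    def __init__(self):
        super().__init__()
        self.h3_count = 0
        self.has_wait_pane = False

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self.h3_count += 1
        if dict(attrs).get("id") == "WaitPane":
            self.has_wait_pane = True


def looks_unrendered(html):
    """Return True when the HTML holds the loading placeholder but no <h3> results."""
    detector = PlaceholderDetector()
    detector.feed(html)
    return detector.has_wait_pane and detector.h3_count == 0


print(looks_unrendered(STATIC_HTML))
```

If the check returns True for the real response, the results are most likely filled in by JavaScript after the page loads. requests only downloads the initial HTML and never runs scripts, so the usual options are to request whatever endpoint the page's own script calls (visible in the browser's network tools) or to drive a real browser that executes the JavaScript before the HTML is parsed.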
