How to scrape search results that are populated by JavaScript (using Python)

Views: 85 | Published: 2019/6/8 19:51:06 | Tags: javascript, python, web-scraping

Question

The site I want to scrape populates its search results using JavaScript.

Can I simply call the script somehow and work with its results (and then skip pagination, of course)? I don't want to run the entire page just to scrape the resulting formatted HTML, but the raw source is blank.

Have a look: http://kozbeszerzes.ceu.hu/searchresults.xhtml?q=1998&page=0

The source of the returned page is just:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/templates/base_template.xsl"?>
<content>
  <head>
    <SCRIPT type="text/javascript" src="/js/searchResultsView.js"></SCRIPT>    
  </head>
    <whitebox>
    <div id = "hits"></div>  
  </whitebox>
</content>
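
To make the "raw source is blank" point concrete, here is a minimal sketch that parses the source above with Python's standard library. It only shows that the results container is empty and that the page points to a single script. The function name `inspect_source` and the constants are hypothetical names chosen for this illustration, and the sketch does not fetch or execute anything from the site.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# The raw page source quoted in the question (trailing whitespace trimmed).
RAW_SOURCE = """<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/templates/base_template.xsl"?>
<content>
  <head>
    <SCRIPT type="text/javascript" src="/js/searchResultsView.js"></SCRIPT>
  </head>
    <whitebox>
    <div id = "hits"></div>
  </whitebox>
</content>
"""

# The search URL from the question.
PAGE_URL = "http://kozbeszerzes.ceu.hu/searchresults.xhtml?q=1998&page=0"


def inspect_source(xml_text, page_url):
    """Report which scripts the page loads and whether the hits container is empty."""
    # Parse from bytes so the encoding declaration in the prolog is honored.
    root = ET.fromstring(xml_text.encode("utf-8"))

    # Resolve each script's src attribute against the page URL.
    script_urls = [
        urljoin(page_url, script.get("src"))
        for script in root.iter("SCRIPT")
        if script.get("src")
    ]

    # The results container has no children and no text in the raw source.
    hits = root.find(".//div[@id='hits']")
    hits_is_empty = (
        hits is not None
        and len(hits) == 0
        and not (hits.text or "").strip()
    )

    return {"script_urls": script_urls, "hits_is_empty": hits_is_empty}


if __name__ == "__main__":
    print(inspect_source(RAW_SOURCE, PAGE_URL))
```

The empty `hits` element is why a plain HTTP fetch yields nothing to scrape: the results only exist after the referenced script has run in a browser.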

I would prefer simple Python tools.

Recommended answer

I downloaded Selenium and ChromeDriver (https://code.google.com/p/selenium/wiki/ChromeDriver).

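from selenium import webdriver
from selenium.webdriver.common.by import By

# Start a real Chrome browser so the page's JavaScript actually runs.
driver = webdriver.Chrome()
try:
    driver.get('http://kozbeszerzes.ceu.hu/searchresults.xhtml?q=1998&page=0')

    # Each search hit is rendered as an element with class "result".
    for e in driver.find_elements(By.CLASS_NAME, 'result'):
        link = e.find_element(By.TAG_NAME, 'a')
        # Drop non-ASCII characters, as the original answer did, then print text and URL.
        text = link.text.encode('ascii', 'ignore').decode('ascii')
        href = link.get_attribute('href').encode('ascii', 'ignore').decode('ascii')
        print(text, href)
finally:
    # Close the browser even if scraping fails.
    driver.quit()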

If you're using Chrome, you can inspect the page's elements and attributes with the developer tools (F12), which is pretty useful.