How to scrape multiple pages from one site

Views: 54 | Published: 2021/7/16 21:44:44 | Tags: python, scrape


Question

I want to scrape multiple pages from one site. The URLs follow this pattern:

https://www.example.com/S1-3-1.html
https://www.example.com/S1-3-2.html
https://www.example.com/S1-3-3.html
https://www.example.com/S1-3-4.html
https://www.example.com/S1-3-5.html

I tried three methods to scrape all of these pages in one run, but every method only scrapes the first page. The code is shown below. If anyone can check it and tell me what the problem is, I would highly appreciate it.

    # =============== method 1 ===============
    import requests
    for i in range(5):      # Number of pages plus one
        url = "https://www.example.com/S1-3-{}.html".format(i)
        r = requests.get(url)
    from bs4 import BeautifulSoup
    soup = BeautifulSoup(r.text, 'html.parser')
    results = soup.find_all('div', attrs={'class':'product-item item-template-0 alternative'})

    # =============== method 2 ===============
    import urllib2,sys
    from bs4 import BeautifulSoup
    for numb in ('1', '5'):
        address = ('https://www.example.com/S1-3-' + numb + '.html')
    html = urllib2.urlopen(address).read()
    soup = BeautifulSoup(html,'html.parser')
    results = soup.find_all('div', attrs={'class':'product-item item-template-0 alternative'})

    # =============== method 3 ===============
    import requests
    from bs4 import BeautifulSoup
    url = 'https://www.example.com/S1-3-1.html'
    for round in range(5):
        res = requests.get(url)
        soup = BeautifulSoup(res.text,'html.parser')
        results = soup.find_all('div', attrs={'class':'product-item item-template-0 alternative'})
        paging = soup.select('div.paging a')
        next_url = 'https://www.example.com/'+paging[-1]['href'] # paging[-1]['href'] is next page button on the page
        url = next_url

I went through some existing answers and checked my code against them, but it does not seem to be a loop problem. Please see the screenshots ("only first page results" and "results picture 2"): the output contains only the first page's results. This has been frustrating me for several days.
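One thing that can be checked in the code as posted: in methods 1 and 2 the parsing lines sit outside the `for` loop, so only the last fetched page is ever parsed, and in method 3 `results` is overwritten on every round instead of being accumulated. Note also that `range(5)` yields 0 to 4 and `('1', '5')` contains only two items, so neither covers pages 1 to 5. The following is a minimal sketch, not the thread's accepted answer, that keeps the per-page work inside the loop and collects everything into one list. The helper names (`page_urls`, `fetch_html`, `extract_products`, `scrape_all`) and the 10-second timeout are assumptions made for illustration. It cannot rule out other causes, such as the site returning the same content for every URL or building the listing with JavaScript.

```python
from typing import Callable, Iterable, List

# URL pattern and CSS class string are taken from the question.
BASE_URL = "https://www.example.com/S1-3-{}.html"
PRODUCT_CLASS = "product-item item-template-0 alternative"


def page_urls(first: int = 1, last: int = 5) -> List[str]:
    """Build the page URLs; range() excludes its end, hence last + 1."""
    return [BASE_URL.format(n) for n in range(first, last + 1)]


def fetch_html(url: str) -> str:
    """Download one page (hypothetical helper; timeout value is an assumption)."""
    import requests  # imported lazily so the other helpers work without it

    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
    return response.text


def extract_products(html: str) -> list:
    """Return the product <div> elements found in one page."""
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    return soup.find_all("div", attrs={"class": PRODUCT_CLASS})


def scrape_all(
    urls: Iterable[str],
    fetch: Callable[[str], str] = fetch_html,
    extract: Callable[[str], list] = extract_products,
) -> list:
    """Fetch and parse every URL, accumulating results across pages."""
    results = []
    for url in urls:
        html = fetch(url)               # fetching happens inside the loop
        results.extend(extract(html))   # parsing too; extend() keeps earlier pages
    return results
```

Calling `scrape_all(page_urls())` would then request pages 1 through 5 in turn. Printing `len(extract_products(fetch_html(url)))` for each URL is a quick way to see whether the server really returns different content per page.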
