Following certain links with Beautiful Soup

Views: 182 Published: 2016/8/5 19:19:31 Tags: python-3.x web-scraping beautifulsoup web-crawler

Problem description

I've been having a lot of trouble with this problem. I think I understand how it should work, but my head now has a dent in it from banging it on the desk.

What I need to do is write a program that scrapes a webpage with Beautiful Soup, picks out a certain link (anywhere from the 3rd to the 20th link down the page), goes to that link, and then finds the link at the same position on the new page, over and over, for an unspecified number of times (I'm keeping it under 20 for explanation purposes). I need to find the last link reached after however many repetitions.

I've got my program, but I can't get past the 2nd iteration! A couple of hours in I did find a way that gave me my answer, but it was an infinite loop, and that's not going to help me learn.

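Let's say this is what I have to do:

Find the link at position 7 (the 7th link on the first page). Follow that link. Repeat this process 5 times. The answer is the last name that you retrieve.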

I've got a way to retrieve the name; I'm just having trouble figuring out the loop!

I also spent an hour, a little overzealously, trying to find another post about this. There are many similar ones, but none that I have found with this exact problem. Thanks for your time. Here is what I have so far.

from urllib.request import urlopen
from bs4 import BeautifulSoup

# First page URL (placeholder; urlopen needs the scheme, e.g. http://)
url = 'http://insertwebsitehere.com'
html = urlopen(url).read()
# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(html, 'html.parser')

# Retrieve all of the anchor tags
tags = soup('a')

taglist = []
count = 0

for tag in tags:
    name = tag.contents[0]
    newtag = tag.get('href', None)
    # print(newtag)
    # add count? count += 1, then do something when it reaches a certain count?
    # taglist.append(newtag)  # this method didn't really work

I am a new coder, so I'm trying to do this without advanced techniques, and I don't necessarily need the answer, just help.
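
One possible shape for the loop described above, as a minimal sketch and not a definitive solution: the outer loop runs once per repetition, and on each pass the page is fetched again, the link at the wanted position is picked by index, and `url` is updated before the next pass. The names `follow_links`, `fetch_html`, and the `fetch` parameter are hypothetical helpers introduced here for illustration; they are not part of Beautiful Soup. The sketch also assumes links may be relative, so it resolves them with `urljoin`.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

from bs4 import BeautifulSoup


def fetch_html(url):
    """Download a page and return its raw bytes (hypothetical helper)."""
    with urlopen(url) as response:
        return response.read()


def follow_links(start_url, position, repeats, fetch=fetch_html):
    """Follow the link at `position` (1-based) on each page, `repeats` times.

    Returns (url, name) for the last link followed. `fetch` is an
    assumption of this sketch: any callable mapping a URL to HTML.
    """
    url = start_url
    name = None
    for _ in range(repeats):
        # Re-fetch and re-parse on every pass, using the updated url
        soup = BeautifulSoup(fetch(url), 'html.parser')
        tags = soup('a')
        if len(tags) < position:
            raise ValueError(f'{url} has only {len(tags)} links')
        tag = tags[position - 1]  # list indexes are 0-based
        href = tag.get('href')
        if href is None:
            raise ValueError(f'link {position} on {url} has no href')
        name = tag.get_text(strip=True)
        # Resolve relative links against the page they came from
        url = urljoin(url, href)
    return url, name
```

With the numbers from the example task, the call would be `follow_links(url, 7, 5)`. Indexing the list of tags directly replaces the inner `for tag in tags` loop and the manual `count`, and because the number of passes is fixed by `range(repeats)`, the loop cannot run forever.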
