Pull links and their link text only from lines on a web page and insert into a dictionary using Python


Problem description

I am trying to pull only the links and their link text from a web page, line by line, and insert each text and link into a dictionary, without using Beautiful Soup or a regex.

I keep getting this error:

Traceback (most recent call last):
File "F:/Homework7-2.py", line 13, in <module>
link2 = link1.split("href=")[1]
IndexError: list index out of range










Code:

import urllib.request
url = "http://www.facebook.com" 
page = urllib.request.urlopen(url)
mylinks = {}
links = page.readline().decode('utf-8')


for items in links:
  links = page.readline().decode('utf-8')
  if "a href=" in links:
     links = page.readline().decode('utf-8')
     link1 = links.split(">")[0]
     link2 = link1.split("href=")[1]
     mylinks = link2
     print(mylinks)

Recommended answer

IndexError: list index out of range





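The message is telling you that the index you asked for does not exist: the list has fewer items than the index requires. Here, split("href=") returns a one-item list whenever the string being split does not contain "href=", so [1] fails. Do not make assumptions about the result of a call; check it first. If the returned list can have only one item, your code needs to handle that case.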
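
In the question's code, the line tested for "a href=" is replaced by a further readline() before the split, and split(">")[0] keeps only the text before the first ">", so the string that reaches split("href=") often contains no "href=" at all. One way to check before indexing is sketched below. It is a minimal sketch, not a full HTML parser: the helper names extract_links and collect_links are hypothetical, and it assumes each link's opening tag, text, and closing </a> sit on one line and that the href value is quoted. It uses str.find and str.partition, which report a missing match instead of raising.

```python
def extract_links(line):
    """Return {link text: URL} for each quoted <a ... href="..."> on one line.

    Hypothetical helper for illustration; assumes the whole link
    (opening tag, text, closing </a>) is on the same line.
    """
    found = {}
    pos = 0
    while True:
        start = line.find("<a ", pos)
        if start == -1:
            break  # no more anchor tags on this line
        tag_end = line.find(">", start)
        if tag_end == -1:
            break  # opening tag is not closed on this line
        tag = line[start:tag_end]
        # partition() never raises: sep is "" when "href=" is absent
        _, sep, rest = tag.partition("href=")
        close = line.find("</a>", tag_end)
        if sep and rest[:1] in ('"', "'") and close != -1:
            quote = rest[0]
            url = rest[1:].partition(quote)[0]
            text = line[tag_end + 1:close].strip()
            found[text] = url
        pos = tag_end + 1
    return found


def collect_links(lines):
    """Merge the links found on every line into one dictionary."""
    mylinks = {}
    for line in lines:
        mylinks.update(extract_links(line))
    return mylinks


# Sample lines stand in for a downloaded page
sample = [
    '<p>No links here</p>',
    '<li><a class="x" href="http://example.com/a">First</a></li>',
    "<a name='top'>Anchor without href</a> <a href='/b'>Second</a>",
]
print(collect_links(sample))
```

To run this against a real page, the lines could come from iterating over the response returned by urllib.request.urlopen(url) and decoding each one, for example with raw.decode("utf-8", errors="replace"). Links split across several lines are skipped by this sketch rather than raising an error.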

