List/Dictionary error in my code

Problem description

I have written a web crawler that returns currency exchange values as a nested list, and I am trying to write a function that searches this list for a given currency name and extracts the values associated with it.

from urllib.request import urlopen
def find_element(line, s_pattern, e_pattern, position=0):
    shift = len(s_pattern)
    start = line.find(s_pattern, position) + shift
    position = start
    end = line.find(e_pattern, position)
    return (line[start:end], position)

def fetch(url):
    html = urlopen(url)
    records = []
    i = 0
    for line in html.readlines():
        line = line.decode()
        if "<tr><td>" not in line:
             continue  # skip lines that do not contain table rows
        if "Currency" in line:
             continue  # skip header

        start = "<tr><td>"
        end = "</td>"
        element, start_pos = find_element(line, start, end)
        records.append([element])
        start = "<td>"
        values = []
        for x in range(2):
            element, start_pos = find_element(line, start, end, start_pos)
            values.append(element)
        records[i].append(values)
        i = i + 1
    return(records)
def findCurrencyValue(records, currency_name):
    l = [[(records)]]
    d = dict(l)
    d[currency_name]
    return(d)
def main():
    url = "https://www.cs.purdue.edu/homes/jind/exchangerate.html"
    records = fetch(url)
    findCurrencyValue(records, "Argentine Peso")
    print(findCurrencyValue)
    print("currency exchange information is\n", records)
main()  

But I get this error:

ValueError: dictionary update sequence element #0 has length 1; 2 is required

Recommended answer

HTML should never be parsed with manual string searches like that. Here is the same task done with requests + lxml, which takes fewer lines and is accurate:

import requests
from lxml import html

URL = "https://www.cs.purdue.edu/homes/jind/exchangerate.html"

# Download the page and parse it into an element tree
response = requests.get(URL)
tree = html.fromstring(response.content)

# Map each currency name to its two exchange rates
currency_dict = {}

# Skip the header row, then read the three cells of each table row
for row in tree.xpath('//table/tr')[1:]:
    currency, oneUSD, oneUnit = row.xpath('.//td/text()')
    currency_dict[currency] = dict(oneUSD=float(oneUSD), oneUnit=float(oneUnit))

search = "Argentine Peso"  # Change this value to the one you want to search for
oneUnit = currency_dict[search]['oneUnit']
oneUSD = currency_dict[search]['oneUSD']

# print() calls, since the question's code is Python 3 (urllib.request)
print("Currency Exchange Rate for: {}".format(search))
print("1 USD = * Unit :  {}".format(oneUSD))
print("1 Unit = * USD :  {}".format(oneUnit))

Output:

Currency Exchange Rate for: Argentine Peso
1 USD = * Unit :  9.44195
1 Unit = * USD :  0.10591
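
As for the ValueError in the question itself: dict() expects a sequence of 2-item pairs, but [[(records)]] wraps the whole list in two extra layers, so the first (and only) element has length 1. The records built by fetch() already have the shape [name, [value, value]], so they can likely be passed to dict() directly. Note also that print(findCurrencyValue) prints the function object rather than its result. The following is a minimal sketch of a corrected lookup, not the original author's code; the name find_currency_value and the sample records are assumptions for illustration, and the second row is made up.

```python
def find_currency_value(records, currency_name):
    """Return the values stored for currency_name, or None if it is absent."""
    # Each record is [name, [value_a, value_b]], i.e. already a 2-item pair,
    # so dict() can consume the nested list without any extra wrapping.
    lookup = dict(records)
    return lookup.get(currency_name)


# Hypothetical sample in the shape that fetch() builds in the question
records = [
    ["Argentine Peso", ["9.44195", "0.10591"]],
    ["Example Currency", ["2.0", "0.5"]],  # made-up row for illustration
]

print(find_currency_value(records, "Argentine Peso"))  # ['9.44195', '0.10591']
```

Returning the looked-up values (rather than the whole dictionary, as the question's version does) lets the caller print or convert them directly.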

An alternative to lxml (http://lxml.de/index.html#documentation) is BeautifulSoup (http://www.crummy.com/software/BeautifulSoup/bs4/doc/).
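
For comparison, here is a rough sketch of how the same table might be read with BeautifulSoup. It parses an inline sample instead of fetching the page, so SAMPLE_HTML, its header labels, and the helper name parse_rates are assumptions for illustration; the real page's markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical sample mirroring the table layout the answer assumes:
# one header row, then rows of currency name, 1 USD in units, 1 unit in USD.
SAMPLE_HTML = """
<table>
<tr><td>Currency</td><td>1 USD</td><td>1 Unit</td></tr>
<tr><td>Argentine Peso</td><td>9.44195</td><td>0.10591</td></tr>
</table>
"""


def parse_rates(markup):
    """Build {currency: {'oneUSD': float, 'oneUnit': float}} from table rows."""
    soup = BeautifulSoup(markup, "html.parser")  # built-in parser backend
    rates = {}
    for row in soup.find_all("tr")[1:]:  # skip the header row
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if len(cells) != 3:
            continue  # ignore rows that do not have exactly three cells
        currency, one_usd, one_unit = cells
        rates[currency] = {"oneUSD": float(one_usd), "oneUnit": float(one_unit)}
    return rates


rates = parse_rates(SAMPLE_HTML)
print(rates["Argentine Peso"]["oneUSD"])  # 9.44195
```

To use it against the live page, the text of a requests.get(URL) response could be passed to parse_rates in place of the sample.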
