'list' object has no attribute 'timeout' and only prints first item in the table

Problem description

I am trying to pull a table from a list of URLs. When I pass a single URL, the script prints only the first item in the table, and when I add more URLs to the list I get the error 'list' object has no attribute 'timeout'. What is the best way to get the remaining items and to handle more URLs? Below is the code I am running.

import time, random, csv, bs4, requests, io
import pandas as pd
timeDelay = random.randrange(5, 20)
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_urls = [
"https://www.lonza.com/products-services/bio-research/electrophoresis-of-nucleic-acids-and-proteins/nucleic-acid-electrophoresis/precast-gels-for-dna-and-rna-analysis/truband-gel-anchors.aspx",
"https://www.lonza.com/products-services/bio-research/transfection/nucleofector-kits-for-primary-cells/nucleofector-kits-for-primary-epithelial-cells/nucleofector-kits-for-human-mammary-epithelial-cells-hmec.aspx",
"https://www.lonza.com/products-services/bio-research/transfection/nucleofector-kits-for-primary-cells/nucleofector-kits-for-primary-neural-cells/nucleofector-kits-for-mammalian-glial-cells.aspx",
]
uClient = uReq(my_urls)
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html, "html.parser")

containers = page_soup.findAll('tbody')


product_name_list =[]
cat_no_list = []
size_list = []
price_list =[]

for container in containers:
    if (len(container) > 0):
    #try:
        title_container = container.findAll('td')
        Product_name = title_container[0].text.strip()
        product_name_list.append(Product_name)

        CatNo_container = container.findAll('td')
        CatNo = CatNo_container[1].text.strip()
        cat_no_list.append(CatNo)

        #Size_container = container.findAll('div',{'class':'col-xs-2 noPadding'})
        #Size = Size_container[0].text.strip()
        #size_list.append(Size)

        Price_container = container.findAll('td')
        Price = Price_container[4].text.strip()
        price_list.append(Price)

        print('Product_name: '+ Product_name)
        print('CatNo: ' + CatNo)
        print('Size: ' + 'N/A')
        print('Price: ' + Price)
        print(" ")
        time.sleep(timeDelay)

Recommended answer

You are passing the whole list to urlopen here: uClient = uReq(my_urls). A single URL string is required there, so you need to pass the individual elements of the list, i.e. the strings, one at a time.
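
As a minimal sketch of where the message comes from: in CPython, urlopen treats any argument that is not a string as a Request object and tries to set a timeout attribute on it, which a list does not allow. The helper name try_open and the example.com URLs below are placeholders, not part of the original code. The error is raised before any network request is made.

```python
from urllib.request import urlopen


def try_open(target):
    """Return the AttributeError message urlopen raises for target, or None."""
    try:
        urlopen(target)
    except AttributeError as exc:
        return str(exc)
    return None


# Placeholder URLs; passing the list itself reproduces the error.
urls = ["https://example.com/a", "https://example.com/b"]
message = try_open(urls)
print(message)
```

Looping over the list and calling urlopen on each string avoids the error, since each call then receives the type it expects.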

Here is the edited code, which works for multiple URLs.

UPDATED CODE (to get all items):

import time, random, csv, bs4, requests, io
import pandas as pd
timeDelay = random.randrange(5, 20)
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_urls = [
"https://www.lonza.com/products-services/bio-research/electrophoresis-of-nucleic-acids-and-proteins/nucleic-acid-electrophoresis/precast-gels-for-dna-and-rna-analysis/truband-gel-anchors.aspx",
"https://www.lonza.com/products-services/bio-research/transfection/nucleofector-kits-for-primary-cells/nucleofector-kits-for-primary-epithelial-cells/nucleofector-kits-for-human-mammary-epithelial-cells-hmec.aspx",
"https://www.lonza.com/products-services/bio-research/transfection/nucleofector-kits-for-primary-cells/nucleofector-kits-for-primary-neural-cells/nucleofector-kits-for-mammalian-glial-cells.aspx",
]

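# Create the result lists once so they accumulate items from every URL.
product_name_list = []
cat_no_list = []
size_list = []
price_list = []

for url in my_urls:
    print("URL using: ", url)
    # urlopen needs a single URL string, so open each page in turn.
    uClient = uReq(url)
    page_html = uClient.read()
    uClient.close()
    page_soup = soup(page_html, "html.parser")

    containers = page_soup.findAll('tbody')
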

    for container in containers:
        if (len(container) > 0):
        #try:
            items = container.findAll('tr')
            for item in items:
                item = item.text.split('\n')

                Product_name = item[1]
                product_name_list.append(Product_name)

                CatNo = item[2]
                cat_no_list.append(CatNo)

                #Size_container = container.findAll('div',{'class':'col-xs-2 noPadding'})
                #Size = Size_container[0].text.strip()
                #size_list.append(Size)

                Price = item[6]
                price_list.append(Price)

                print('Product_name: '+ Product_name)
                print('CatNo: ' + CatNo)
                print('Size: ' + 'N/A')
                print('Price: ' + Price)
                print(" ")
            time.sleep(timeDelay)
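
The original code printed only the first item because it took td cells number 0, 1 and 4 from the whole tbody, which are the cells of the first row only. The updated code fixes this by iterating over each tr. The sketch below illustrates the same row-by-row idea, but swaps BeautifulSoup for the standard library's html.parser so that it runs without extra packages. The RowCollector class and the sample markup are made up for illustration, and the real page layout may differ.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the current row, None outside <tr>
        self._cell = None  # text pieces of the current cell, None outside <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip rows without <td> cells, e.g. header rows
                self.rows.append(self._row)
            self._row = None
            self._cell = None


# Made-up sample markup standing in for a product table.
SAMPLE_HTML = """
<table><tbody>
  <tr><th>Product</th><th>Cat. No.</th><th>Price</th></tr>
  <tr><td>Sample kit A</td><td>A-001</td><td>10.00</td></tr>
  <tr><td>Sample kit B</td><td>B-002</td><td>20.00</td></tr>
</tbody></table>
"""

collector = RowCollector()
collector.feed(SAMPLE_HTML)
for name, cat_no, price in collector.rows:
    print(name, cat_no, price)
```

Reading cells per row, instead of splitting the row text on newlines, tends to be less sensitive to whitespace changes in the page. With BeautifulSoup the same approach would be to call findAll('td') on each tr.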
