Beautiful Soup: scraping a table

Views: 49 Published: 2021/4/15 19:02:57 Tags: python, beautifulsoup


Question

I have this small piece of code that scrapes table data from a website and then displays it in CSV format. The problem is that the for loop prints the records multiple times. I am not sure whether this is due to the tag. By the way, I am new to Python. Thanks for your help!

#import needed libraries
import urllib
from bs4 import BeautifulSoup
import requests
import pandas as pd
import csv
import sys
import re


# read the data from a URL
url = requests.get("https://www.top500.org/list/2018/06/")

# parse the response content using Beautiful Soup
soup = BeautifulSoup(url.content, 'html.parser')

newtxt= ""
for record in soup.find_all('tr'):
    tbltxt = ""
    for data in record.find_all('td'):
        tbltxt = tbltxt + "," + data.text
        newtxt= newtxt+ "\n" + tbltxt[1:]
        print(newtxt)

Recommended answer

from bs4 import BeautifulSoup
import requests

# Download the page and parse it
url = requests.get("https://www.top500.org/list/2018/06/")
soup = BeautifulSoup(url.content, 'html.parser')

# Select the table(s) whose class attribute matches this exact string
table = soup.find_all('table', attrs={'class':'table table-condensed table-striped'})
for i in table:
    tr = i.find_all('tr')
    # Print the text of each row exactly once
    for x in tr:
        print(x.text)
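
As for why the original loop repeats records: the lines that append to newtxt and print it sit inside the inner loop over td cells, so they run once per cell rather than once per row, and each print shows everything accumulated so far. A minimal sketch of a corrected version follows, assuming the goal is one CSV line per table row. The helper name table_rows_to_csv and the sample markup are hypothetical, and the sample stands in for the downloaded page so that the snippet runs without network access.

```python
import csv
import io

from bs4 import BeautifulSoup


def table_rows_to_csv(html):
    """Return the <td> cells of every <tr> in html as CSV text, one line per row."""
    soup = BeautifulSoup(html, "html.parser")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for record in soup.find_all("tr"):
        # Collect all cells of this row first ...
        cells = [data.get_text(strip=True) for data in record.find_all("td")]
        # ... then write the row once, outside the loop over cells.
        if cells:  # rows that contain only <th> cells yield no <td> cells
            writer.writerow(cells)
    return buffer.getvalue()


# Hypothetical sample markup standing in for the real page.
sample_html = """
<table>
  <tr><th>Rank</th><th>Site</th></tr>
  <tr><td>1</td><td>Site A, Building 1</td></tr>
  <tr><td>2</td><td>Site B</td></tr>
</table>
"""

csv_text = table_rows_to_csv(sample_html)
print(csv_text)
```

Using csv.writer instead of joining with "," by hand means a cell that itself contains a comma is quoted, so the columns stay aligned. For the real page, url.content from the code above could be passed in place of sample_html.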

Or, better, parse the table with pandas:

import pandas as pd

# read_html returns a list of DataFrames, one per matching <table>
table = pd.read_html('https://www.top500.org/list/2018/06/', attrs={
    'class': 'table table-condensed table-striped'}, header = 1)
print(table)
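
Since pd.read_html returns a list, the usual next step is to pick one DataFrame out of it and export that. The sketch below illustrates this on a hypothetical inline sample rather than the live page. The sample markup, its column names, and the header=0 setting are assumptions for this illustration only. It also assumes an HTML parser that pandas can use (such as lxml) is installed.

```python
import io

import pandas as pd

# Hypothetical sample markup; the class value mirrors the one used above.
sample_html = """
<table class="table table-condensed table-striped">
  <tr><th>Rank</th><th>Site</th></tr>
  <tr><td>1</td><td>Site A</td></tr>
  <tr><td>2</td><td>Site B</td></tr>
</table>
"""

# Wrap the literal HTML in StringIO so pandas treats it as a file-like object.
tables = pd.read_html(
    io.StringIO(sample_html),
    attrs={"class": "table table-condensed table-striped"},
    header=0,
)

df = tables[0]  # one matching table, so the list has a single entry
csv_text = df.to_csv(index=False)  # CSV text without the index column
print(csv_text)
```

Passing a file path to to_csv instead of leaving it empty would write the CSV to disk rather than return it as a string.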
