Beautiful Soup to CSV


Problem description

There are a few threads on getting Beautiful Soup data into CSV files, but I can't find one that makes sense with my code.

I am scraping the WSJ biggest gainers page. Slicing rows 3 to 103 gives me the top 100 stocks from the table for one day.

I am having a problem getting each value of a table row into a separate cell. There should be six cells of data per row; then it should move to the next line and give me the next six data points (the next stock).

Whenever I use the method below, it only outputs one row of the WSJ stocks instead of looping many times and moving to the next row each time. I'm not sure how to make it so that the first six td tags go in row 1, then the next six td tags in row 2, and so on.

I have tried making a list called cells by modifying symbol.text, with no luck.

It would be even easier to make the first row out of all the values in the first tr tag, since there are six of them, but they would each need to be in their own cell. I have tried looping over this too, with no luck.

I am new to Python, so the simplest code would be best.

import requests
from bs4 import BeautifulSoup
import csv

urlList = ['http://online.wsj.com/mdc/public/page/2_3021-gainnyse-gainer.html',
       'http://online.wsj.com/mdc/public/page/2_3021-gainnyse-gainer--20150806.html?mod=mdc_pastcalendar',
       'http://online.wsj.com/mdc/public/page/2_3021-gainnyse-gainer--20150805.html?mod=mdc_pastcalendar',
       'http://online.wsj.com/mdc/public/page/2_3021-gainnyse-gainer--20150804.html?mod=mdc_pastcalendar',
       'http://online.wsj.com/mdc/public/page/2_3021-gainnyse-gainer--20150803.html?mod=mdc_pastcalendar']



for i in range(len(urlList)):
    url = urlList[i]            
    r = requests.get(url)
    soup = BeautifulSoup(r.content)         
    scrapeData = soup.select('tr')[3:103]    
    for symbol in scrapeData: 
       print(symbol.text)  



outputFile = open('wsjExample.csv', 'w')          
outputWriter = csv.writer(outputFile)
outputWriter.writerow(['Number', 'Symbol', 'Price', 'Change', '% Change', 'Volume'])
for row in range(len(scrapeData)):
    outputWriter.writerow([symbol('td')[0].text, symbol('td')[1].text, symbol('td')[2].text, symbol('td')[3].text, symbol('td')[4].text, symbol('td')[5].text])
outputFile.close()

Thanks

Solution

This is how to write to a CSV file:

import csv

with open('names.csv', 'w') as csvfile:
    fieldnames = ['first_name', 'last_name']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)

    writer.writeheader()
    writer.writerow({'first_name': 'Baked', 'last_name': 'Beans'})
    writer.writerow({'first_name': 'Lovely', 'last_name': 'Spam'})
    writer.writerow({'first_name': 'Wonderful', 'last_name': 'Spam'})
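Applied to the question's scraper, the key change is to write one CSV row per table row, inside the loop, rather than after the loop has finished. Below is a minimal sketch of that writing pattern; the `rows` list is a hypothetical stand-in for the cell texts pulled from each tr (in the real script they would come from something like `[td.text for td in tr('td')]`), since the live WSJ pages aren't fetched here:

```python
import csv

# Hypothetical stand-in for the six td texts scraped from each <tr>.
rows = [
    ['1', 'ABC', '10.50', '+2.10', '25.0%', '1,200,000'],
    ['2', 'XYZ', '8.75', '+1.50', '20.7%', '900,000'],
]

# newline='' avoids blank lines between rows on Windows.
with open('wsjExample.csv', 'w', newline='') as outputFile:
    outputWriter = csv.writer(outputFile)
    outputWriter.writerow(['Number', 'Symbol', 'Price', 'Change', '% Change', 'Volume'])
    for cells in rows:
        # One writerow call per table row, inside the loop.
        outputWriter.writerow(cells)
```

The original code's problem was that its writing loop ran after the scraping loop ended, so `symbol` only ever held the last row; moving the `writerow` call inside the per-row loop writes every stock.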
