How do I write all of these rows into a CSV file for a given range?


Problem description

The purpose of the code below is to scrape the Oxford English Dictionary for words that were "invented" in each year within a range of years. The scraping itself works as intended.

import csv
import os
import re
import requests
import urllib2

year_start= 1550
year_end = 1552
subject_search = ['Law']

for year in range(year_start, year_end +1):
    path = '/Applications/Python 3.5/Economic'
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
    urllib2.install_opener(opener)

    user_agent = 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'
    header = {'User-Agent':user_agent}

    resultPath = os.path.join(path, 'OED_table.csv')
    htmlPath = os.path.join(path, 'OED.html')
    request = urllib2.Request('http://www.oed.com/search?browseType=sortAlpha&case-insensitive=true&dateFilter='+ str(year)+ '&nearDistance=1&ordered=false&page=1&pageSize=100&scope=ENTRY&sort=entry&subjectClass='+ str(subject_search)+ '&type=dictionarysearch', None, header)
    page = opener.open(request)

    with open(resultPath, 'wb') as outputw, open(htmlPath, 'w') as outputh:
        urlpage = page.read()
        outputh.write(urlpage)

        new_words = re.findall(r'<span class=\"hwSect\"><span class=\"hw\">(.*?)</span>', urlpage)
        print new_words
        csv_writer = csv.writer(outputw)
        if csv_writer.writerow([year] + new_words):
            csv_writer.writerow([year, word])

However, when I actually run the code, only the very last year in the range gets written to the CSV file. The file ends up as a single row like this:

1552, word1, word2, word3, etc....

I basically want a separate row for each year in the range of years. How do I go about this?

Recommended answer

You reopen the files in 'w' mode on every pass through the loop, and 'w' truncates the file each time it is opened, so each year overwrites the previous one and only the last year survives. Open the files once, outside the loop. If you also want each run of the script to add to the existing data instead of replacing it, open them with 'a' (append) instead of 'w':

with open("/Applications/Python 3.5/Economic/OED_table.csv", 'a') as outputw, open("/Applications/Python 3.5/Economic/OED.html", 'a') as outputh:     
    for year in range(year_start, year_end +1):
       .....................
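
To make the pattern concrete without hitting the network, here is a minimal sketch of the same idea. It is written for Python 3, whereas the question's code is Python 2, so the CSV file is opened in text mode with newline='' rather than 'wb'. The names fetch_words and write_rows are hypothetical: fetch_words is only a stand-in for the download and regex step in the question, and the temporary output directory is an assumption for the demo.

```python
import csv
import os
import tempfile


def fetch_words(year):
    # Hypothetical stand-in for the urllib2 download + re.findall step
    # in the question; it just returns placeholder words for the year.
    return ['word%d_a' % year, 'word%d_b' % year]


def write_rows(result_path, year_start, year_end, mode='w'):
    # Open the file ONCE, before the loop, so every year's row lands in
    # the same open file instead of truncating it on each iteration.
    # Pass mode='a' to keep rows written by earlier runs of the script.
    with open(result_path, mode, newline='') as output:
        csv_writer = csv.writer(output)
        for year in range(year_start, year_end + 1):
            # One row per year: the year followed by its words.
            csv_writer.writerow([year] + fetch_words(year))


# Demo: write to a temporary directory (an assumption for this sketch,
# not the path used in the question).
result_path = os.path.join(tempfile.mkdtemp(), 'OED_table.csv')
write_rows(result_path, 1550, 1552)

# Read the file back to check that there is one row per year.
with open(result_path, newline='') as f:
    rows = list(csv.reader(f))
print(rows)
```

With 'w' the file starts empty on each run and then collects one row per year; switching the mode argument to 'a' makes repeated runs accumulate rows, which is the behavior the answer describes.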
