Writing a single CSV header with pandas

Question

I'm parsing data into lists and using pandas to frame it and write it to a CSV file. First, my data goes into a dict in which inv, name, and date are all lists with numerous entries. Then I use concat to join each iteration over the datasets I parse and append the result to a CSV file, like so:

counter = True  # set once, outside the loop over the parsed data

# everything below runs on each pass through the loop
data = {'Invention': inv, 'Inventor': name, 'Date': date}

if counter is True:
    df = pd.DataFrame(data)
    df = df[['Invention', 'Inventor', 'Date']]
else:
    df = pd.concat([df, pd.DataFrame(data)])
    df = df[['Invention', 'Inventor', 'Date']]

with open('./new.csv', 'a', encoding='utf-8') as f:
    if counter is True:
        df.to_csv(f, index=False, header=True)
    else:
        df.to_csv(f, index=False, header=False)

counter = False

The counter = True statement sits outside my iteration loop over all the data I'm parsing, so it is not reset on every pass.

So it runs through my data only once to build the first df, and concatenates onto it thereafter. The problem is that, although counter is True only on the first pass and my first if-statement for df behaves accordingly, the same check does not work when I write to the file.

What happens is that the header is written over and over again, regardless of the fact that counter is True only once. When I swap things around so that header = False is used when counter is True, the header is never written.

I think this is because the concatenated df is somehow holding onto the header, but other than that I cannot figure out the logic error.

Is there perhaps another way to write the header once and only once to the same CSV file?

Recommended answer

It's hard to tell what might be going wrong without seeing the rest of the code. I've put together some test data and logic that works; you can adapt it to fit your needs.

Try the following:

import pandas as pd

early_inventions = ['wheel', 'fire', 'bronze']
later_inventions = ['automobile', 'computer', 'rocket']

early_names = ['a', 'b', 'c']
later_names = ['z', 'y', 'x']

early_dates = ['2000-01-01', '2001-10-01', '2002-03-10']
later_dates = ['2010-01-28', '2011-10-10', '2012-12-31']

early_data = {'Invention': early_inventions,
    'Inventor': early_names,
    'Date': early_dates}

later_data = {'Invention': later_inventions,
    'Inventor': later_names,
    'Date': later_dates}

datasets = [early_data, later_data]

columns = ['Invention', 'Inventor', 'Date']
header = True
for dataset in datasets:
    df = pd.DataFrame(dataset)
    df = df[columns]
    mode = 'w' if header else 'a'
    df.to_csv('./new.csv', encoding='utf-8', mode=mode, header=header, index=False)
    header = False
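
To confirm that the header really ends up in the file exactly once, the same loop can be wrapped in a small function and the result read back. This is only a sketch: write_batches is a hypothetical helper name, the two one-row batches are made-up test data, and the temporary directory stands in for wherever new.csv actually lives.

```python
import os
import tempfile

import pandas as pd


def write_batches(batches, path, columns):
    """Write each batch to `path`, emitting the header only with the first one."""
    header = True
    for batch in batches:
        df = pd.DataFrame(batch)[columns]
        # The first batch truncates the file and writes the header;
        # every later batch appends rows without a header.
        df.to_csv(path, encoding='utf-8', mode='w' if header else 'a',
                  header=header, index=False)
        header = False


columns = ['Invention', 'Inventor', 'Date']
batches = [
    {'Invention': ['wheel'], 'Inventor': ['a'], 'Date': ['2000-01-01']},
    {'Invention': ['rocket'], 'Inventor': ['x'], 'Date': ['2012-12-31']},
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'new.csv')
    write_batches(batches, path, columns)
    with open(path, encoding='utf-8') as f:
        lines = f.read().splitlines()

# Count how many lines of the file are the header line.
header_count = lines.count(','.join(columns))
print(header_count, len(lines))
```

Note that each batch is written on its own rather than being concatenated onto the previous ones first; appending an ever-growing frame would repeat the earlier rows in the file.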

Alternatively, you can concatenate all of the data in the loop and write out the dataframe at the end:

df = pd.DataFrame(columns=columns)
for dataset in datasets:
    df = pd.concat([df, pd.DataFrame(dataset)])
    df = df[columns]
df.to_csv('./new.csv', encoding='utf-8', index=False)
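
A variation on this approach, offered as a sketch rather than as part of the original answer, is to collect the per-dataset frames in a list and call pd.concat once. That avoids starting from an empty frame and re-copying the growing frame on every pass. The two small datasets and the ignore_index=True argument are assumptions made for the example.

```python
import pandas as pd

columns = ['Invention', 'Inventor', 'Date']
datasets = [
    {'Invention': ['wheel', 'fire'], 'Inventor': ['a', 'b'],
     'Date': ['2000-01-01', '2001-10-01']},
    {'Invention': ['rocket'], 'Inventor': ['x'], 'Date': ['2012-12-31']},
]

# Build one frame per dataset, then concatenate them in a single call.
frames = [pd.DataFrame(dataset)[columns] for dataset in datasets]
df = pd.concat(frames, ignore_index=True)

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Passing a path such as './new.csv' to to_csv writes the same content to disk; because the file is written once, the header appears once.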

If your code cannot be made to conform to this API, you can forgo writing the header in to_csv altogether. Instead, check whether the output file exists and, if it does not, write the header to it first:

import os

fn = './new.csv'
if not os.path.exists(fn):
    with open(fn, mode='w', encoding='utf-8') as f:
        f.write(','.join(columns) + '\n')
# now append the dataframe without a header
df.to_csv(fn, encoding='utf-8', mode='a', header=False, index=False)
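
The same existence check can also be fed straight into the header argument of to_csv, so that pandas writes the header itself. The following is a minimal sketch under that assumption; append_to_csv is a hypothetical helper name, and the extra check for an empty file is an addition that the answer above does not make.

```python
import os
import tempfile

import pandas as pd


def append_to_csv(df, path):
    """Append `df` to `path`, writing the header only if the file is new or empty."""
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    df.to_csv(path, encoding='utf-8', mode='a', header=needs_header, index=False)


first = pd.DataFrame({'Invention': ['wheel'], 'Inventor': ['a'],
                      'Date': ['2000-01-01']})
second = pd.DataFrame({'Invention': ['rocket'], 'Inventor': ['x'],
                       'Date': ['2012-12-31']})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'new.csv')
    append_to_csv(first, path)   # file does not exist yet: header is written
    append_to_csv(second, path)  # file exists and is non-empty: rows only
    result = pd.read_csv(path)

print(result)
```

One caveat with any existence-based check: a file left over from an earlier run counts as existing, so new rows are appended to the old contents unless the file is removed first.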
