Adding the header to a csv file


Problem description

I have a CSV file with dimensions 100*512 that I want to process further in Spark. The problem is that the file has no header, i.e. no column names. I need these column names for further ETL in machine learning. The column names are stored in another file (a text file), and I have to add them as the header of the CSV file mentioned above. For example:

CSV file:

ab 1 23 sf 23 hjh
hs 6 89 iu 98 adf
gh 7 78 pi 54 ngj
jh 5 22 kj 78 jdk

Column header file:

one,two,three,four,five, six

I want output like this:

one two three four five six
ab 1 23 sf 23 hjh
hs 6 89 iu 98 adf
gh 7 78 pi 54 ngj
jh 5 22 kj 78 jdk

Please suggest a method for adding the column headers to the CSV file without replacing any of its existing rows. I tried converting it to a pandas DataFrame, but could not get the expected output.

Recommended answer

First, read your CSV file:

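from pandas import read_csv
# header=None tells pandas the file has no header row,
# so the first data row is kept as data instead of becoming the column names
df = read_csv('test.csv', header=None)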

If your dataset has two columns (column a and column b), use:

df.columns = ['a', 'b']
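With 512 columns, typing the list by hand is impractical, and the question already has the names in a separate text file. The following is a minimal sketch of loading them from that file, assuming it holds a single comma-separated line as in the question's example. The function name `read_with_header` and its parameters are hypothetical, chosen for illustration only.

```python
import pandas as pd


def read_with_header(data_path, header_path, sep=","):
    """Load a headerless CSV and label its columns from a separate names file."""
    # Assumption: the names file is one comma-separated line, e.g. "one,two, three".
    with open(header_path, encoding="utf-8") as f:
        names = [name.strip() for name in f.read().split(",") if name.strip()]
    # header=None keeps the first data row as data; names= supplies the labels.
    return pd.read_csv(data_path, header=None, names=names, sep=sep)
```

If the data file is whitespace-separated, as the sample rows in the question appear to be, passing `sep=r"\s+"` should let pandas split on runs of whitespace instead of commas.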

Write this new DataFrame to a CSV file:

# index=False keeps the DataFrame's row index out of the output file
df.to_csv('test_2.csv', index=False)
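If the only goal is to put a header line on top of the file, pandas is not strictly required. The sketch below uses only the Python standard library and copies the data rows through unchanged; it assumes the same single-line, comma-separated names file as in the question, and `prepend_header` with its parameters is a hypothetical helper rather than part of the original answer.

```python
import shutil


def prepend_header(data_path, header_path, out_path, sep=","):
    """Write a header line built from header_path, followed by the rows of data_path."""
    # Assumption: the names file is one comma-separated line of column names.
    with open(header_path, encoding="utf-8") as f:
        names = [name.strip() for name in f.read().split(",") if name.strip()]
    with open(data_path, encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        # The header is joined with the same separator the data rows use.
        dst.write(sep.join(names) + "\n")
        # The data rows are streamed across without being parsed or modified.
        shutil.copyfileobj(src, dst)
```

For the space-separated sample in the question, calling it with `sep=" "` would give a header line in the same style as the expected output.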
