Reading multiple CSV files and joining them horizontally
Question
I have a number of CSV files named like 100-age.csv, 100-rel.csv, 100-gender.csv, 101-age.csv, ..., 101-gender.csv, ..., 482-rel.csv, 482-gender.csv, and so on. For every index I need to create a new file, e.g. 100-combo.csv, that joins 100-age.csv, 100-rel.csv and 100-gender.csv horizontally. I can do this for a single index using pandas:
import pandas as pd
age = pd.read_csv('100-age.csv', header=None)
gender = pd.read_csv('100-gender.csv', header=None)
rel = pd.read_csv('100-rel.csv', header=None)
combined = pd.concat([age, gender, rel], axis=1)
combined.to_csv('100-combo.csv', header=None, index=None)
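As a small illustration of what pd.concat(..., axis=1) does in the snippet above, the sketch below uses in-memory stand-ins for the three files. The file contents are invented for illustration only. Rows are aligned on the default integer index, so a shorter file is padded with NaN instead of raising an error.

```python
import io

import pandas as pd

# Hypothetical contents standing in for 100-age.csv, 100-gender.csv, 100-rel.csv
age_csv = io.StringIO("23\n35\n41\n")
gender_csv = io.StringIO("F\nM\n")  # deliberately one row shorter
rel_csv = io.StringIO("single\nmarried\nsingle\n")

age = pd.read_csv(age_csv, header=None)
gender = pd.read_csv(gender_csv, header=None)
rel = pd.read_csv(rel_csv, header=None)

# axis=1 places the frames side by side, aligning rows on the index
combined = pd.concat([age, gender, rel], axis=1)

print(combined.shape)                # (3, 3)
print(pd.isna(combined.iloc[2, 1]))  # the missing gender cell is NaN
```

Because header=None gives every frame a column labelled 0, the combined frame has repeated column labels. That is harmless here, since the result is written out without a header.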
On Linux, cat only concatenates vertically (stacking the files on top of each other), and the paste command disturbs the formatting of these files.
I tried wrapping this in a function and looping over every index:
def merged_data(i):
    age = pd.read_csv(path+str(i)+'.pdf-age.csv', header=None, error_bad_lines=False)
    gender = pd.read_csv(path+str(i)+'.pdf-gender.csv', header=None, error_bad_lines=False)
    rel = pd.read_csv(path+str(i)+'.pdf-rel.csv', header=None, error_bad_lines=False)
    combined = pd.concat([age, gender, rel], axis=1)
    combined['block'] = str(i)
    combined.to_csv(path+str(i)+'-combo.csv', header=None, index=None)

for num in range(1,483):
    merged_data(num)
I get this error:
EmptyDataError: No columns to parse from file
However, I know that every one of my data files contains some values.
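pandas raises EmptyDataError when a file gives it nothing to parse, for example when the file is zero bytes long. One way to test the assumption that every file has data is to list the files that are empty. The sketch below is only an illustration: find_empty_files is a hypothetical helper, the demo file names are made up, and it only detects files that are empty or contain nothing but whitespace.

```python
import os
import tempfile


def find_empty_files(paths):
    """Return the paths whose content is empty or whitespace only."""
    empty = []
    for p in paths:
        with open(p, "r") as fh:
            if not fh.read().strip():
                empty.append(p)
    return empty


# Demo with two temporary files: one with data, one with none
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "1.pdf-age.csv")
    bad = os.path.join(tmp, "2.pdf-age.csv")
    with open(good, "w") as fh:
        fh.write("23\n35\n")
    open(bad, "w").close()  # create a zero-byte file

    empty_files = find_empty_files([good, bad])
    print([os.path.basename(p) for p in empty_files])
```

Running a check like this over all 482 indices would show which inputs trigger the error.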
Recommended answer
I did this and got what I wanted. I caught EmptyDataError for each file and fell back to an empty DataFrame:
import pandas as pd
from pandas.errors import EmptyDataError  # public location of the exception in current pandas

# `path` is the directory prefix of the input files and must be defined beforehand.
# on_bad_lines='skip' (pandas 1.3+) replaces the older error_bad_lines=False;
# sep=r'\s+' replaces the deprecated delim_whitespace=True.
def merged_data(i):
    try:
        age = pd.read_csv(path+str(i)+'.pdf-age.csv', header=None, on_bad_lines='skip', sep=r'\s+')
    except EmptyDataError:
        # no data in this file: use an empty frame so concat still works
        age = pd.DataFrame()
    try:
        gender = pd.read_csv(path+str(i)+'.pdf-gender.csv', header=None, on_bad_lines='skip', sep=r'\s+')
    except EmptyDataError:
        gender = pd.DataFrame()
    try:
        rel = pd.read_csv(path+str(i)+'.pdf-rel.csv', header=None, on_bad_lines='skip', sep=r'\s+')
    except EmptyDataError:
        rel = pd.DataFrame()
    # join the three frames side by side and tag each row with its block number
    combined = pd.concat([age, gender, rel], axis=1)
    combined['block'] = str(i)
    combined.to_csv(path+str(i)+'-combo.csv', header=False, index=False)

for num in range(1,483):
    merged_data(num)
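The three try/except blocks repeat the same pattern, so they could be factored into a helper. The following is a sketch, not the code from the answer: read_or_empty and merge_block are hypothetical names, the demo files and their contents are invented, and it assumes pandas 1.3 or later (for on_bad_lines) with EmptyDataError imported from pandas.errors.

```python
import os
import tempfile

import pandas as pd
from pandas.errors import EmptyDataError


def read_or_empty(filename):
    """Read a whitespace-delimited file without a header; return an empty frame if it has no data."""
    try:
        return pd.read_csv(filename, header=None, sep=r"\s+", on_bad_lines="skip")
    except EmptyDataError:
        return pd.DataFrame()


def merge_block(prefix, i):
    """Join the age, gender and rel files for block i side by side and save the result."""
    parts = [read_or_empty(f"{prefix}{i}.pdf-{kind}.csv") for kind in ("age", "gender", "rel")]
    combined = pd.concat(parts, axis=1)
    combined["block"] = str(i)
    combined.to_csv(f"{prefix}{i}-combo.csv", header=False, index=False)
    return combined


# Demo on invented data: the gender file is left empty on purpose
with tempfile.TemporaryDirectory() as tmp:
    prefix = os.path.join(tmp, "")  # directory path with a trailing separator
    with open(prefix + "7.pdf-age.csv", "w") as fh:
        fh.write("23\n35\n")
    open(prefix + "7.pdf-gender.csv", "w").close()
    with open(prefix + "7.pdf-rel.csv", "w") as fh:
        fh.write("single\nmarried\n")

    combined = merge_block(prefix, 7)
    with open(prefix + "7-combo.csv") as fh:
        out_text = fh.read()

print(combined.shape)
print(out_text)
```

With this shape, adding a fourth file type only means extending the tuple of kinds. The full run would then be a loop such as `for num in range(1, 483): merge_block(path, num)`, with path defined as in the original code.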