将CSV加载到Pandas MultiIndex DataFrame [英] Load CSV to Pandas MultiIndex DataFrame
本文介绍了将CSV加载到Pandas MultiIndex DataFrame的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
我有一个719mb的CSV档案,如下所示:
从,到,dep,freq,arr,code,mode (标题行)
RGBOXFD,RGBPADTON,127,0,27,99999,2
RGBOXFD,RGBPADTON,127,0,33,99999,2
RGBOXFD,RGBRDLEY,127,0, 1425,99999,2
RGBOXFD,RGBCHOLSEY,127,0,52,99999,2
RGBOXFD,RGBMDNHEAD,127,0,91,99999,2
RGBDIDCOTP,RGBPADTON,127,0 ,46,99999,2
RGBDIDCOTP,RGBPADTON,127,0,3,99999,2
RGBDIDCOTP,RGBCHOLSEY,127,0,61,99999,2
RGBDIDCOTP,RGBRDLEY,127, 0,1430,99999,2
RGBDIDCOTP,RGBPADTON,127,0,115,99999,2
等等...
我想加载到一个pandas DataFrame。现在我知道从csv方法有一个负载:
r = pd.DataFrame.from_csv('test_data2.csv')
但我特别想把它作为一个'MultiIndex'DataFrame加载,其中from和to是索引: / p>
所以结尾是:
dep,freq,arr ,模式
RGBOXFD RGBPADTON 127 0 27 99999 2
RGBRDLEY 127 0 33 99999 2
RGBCHOLSEY 127 0 1425 99999 2
RGBMDNHEAD 127 0 1525 99999 2
等。我不知道该怎么做?
解决方案
您可以使用 pd.read_csv
:
>>> df = pd.read_csv(test_data2.csv,index_col = [0,1],skipinitialspace = True)
>>> df
dep freq arr代码模式
从到
RGBOXFD RGBPADTON 127 0 27 99999 2
RGBPADTON 127 0 33 99999 2
RGBRDLEY 127 0 1425 99999 2
RGBCHOLSEY 127 0 52 99999 2
RGBMDNHEAD 127 0 91 99999 2
RGBDIDCOTP RGBPADTON 127 0 46 99999 2
RGBPADTON 127 0 3 99999 2
RGBCHOLSEY 127 0 61 99999 2
RGBRDLEY 127 0 1430 99999 2
RGBPADTON 127 0 115 99999 2
ve使用 skipinitialspace = True
来摆脱标题行中的那些恼人的空格。
I have a 719mb CSV file that looks like:
from, to, dep, freq, arr, code, mode (header row)
RGBOXFD,RGBPADTON,127,0,27,99999,2
RGBOXFD,RGBPADTON,127,0,33,99999,2
RGBOXFD,RGBRDLEY,127,0,1425,99999,2
RGBOXFD,RGBCHOLSEY,127,0,52,99999,2
RGBOXFD,RGBMDNHEAD,127,0,91,99999,2
RGBDIDCOTP,RGBPADTON,127,0,46,99999,2
RGBDIDCOTP,RGBPADTON,127,0,3,99999,2
RGBDIDCOTP,RGBCHOLSEY,127,0,61,99999,2
RGBDIDCOTP,RGBRDLEY,127,0,1430,99999,2
RGBDIDCOTP,RGBPADTON,127,0,115,99999,2
and so on...
I want to load in to a pandas DataFrame. Now I know there is a load from csv method:
r = pd.DataFrame.from_csv('test_data2.csv')
But I specifically want to load it as a 'MultiIndex' DataFrame where from and to are the indexes:
So ending up with:
dep, freq, arr, code, mode
RGBOXFD RGBPADTON 127 0 27 99999 2
RGBRDLEY 127 0 33 99999 2
RGBCHOLSEY 127 0 1425 99999 2
RGBMDNHEAD 127 0 1525 99999 2
etc. I'm not sure how to do that?
解决方案
You could use pd.read_csv
:
>>> df = pd.read_csv("test_data2.csv", index_col=[0,1], skipinitialspace=True)
>>> df
dep freq arr code mode
from to
RGBOXFD RGBPADTON 127 0 27 99999 2
RGBPADTON 127 0 33 99999 2
RGBRDLEY 127 0 1425 99999 2
RGBCHOLSEY 127 0 52 99999 2
RGBMDNHEAD 127 0 91 99999 2
RGBDIDCOTP RGBPADTON 127 0 46 99999 2
RGBPADTON 127 0 3 99999 2
RGBCHOLSEY 127 0 61 99999 2
RGBRDLEY 127 0 1430 99999 2
RGBPADTON 127 0 115 99999 2
where I've used skipinitialspace=True
to get rid of those annoying spaces in the header row.
这篇关于将CSV加载到Pandas MultiIndex DataFrame的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!
查看全文